The article also highlights the models' enhanced capabilities in coding and mathematics, long-context understanding, and safety and responsibility. It notes that the model licenses have changed: all models except Qwen2-72B now adopt Apache 2.0. The article concludes by stating that larger Qwen2 models are in training and that the Qwen2 language models are being extended to understand both vision and audio inputs.
Key takeaways:
- The Qwen2 series includes pretrained and instruction-tuned models in five sizes (Qwen2-0.5B, Qwen2-1.5B, Qwen2-7B, Qwen2-57B-A14B, and Qwen2-72B), all trained on data in 27 additional languages beyond English and Chinese.
- Qwen2 models show significantly improved performance in coding and mathematics, and Qwen2-7B-Instruct and Qwen2-72B-Instruct support context lengths of up to 128K tokens.
- The Qwen2-72B-Instruct model performs comparably to GPT-4 in terms of safety and significantly outperforms Mixtral-8x22B in handling multilingual prompts.
- All models have been released on Hugging Face and ModelScope, and the licenses of all models except Qwen2-72B have been changed to Apache 2.0 to enhance openness to the community.
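Since the models are published on Hugging Face, a minimal sketch of loading one with the `transformers` library might look like the following. The repo id `Qwen/Qwen2-7B-Instruct` and the loading options are assumptions to verify on the Hub, not details taken from the article.

```python
MODEL_ID = "Qwen/Qwen2-7B-Instruct"  # assumed Hub repo id; check huggingface.co before use


def load_qwen2(model_id: str = MODEL_ID):
    """Load tokenizer and model weights from the Hugging Face Hub.

    The import is kept inside the function so the sketch stays lightweight
    until a download is actually requested.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",  # pick fp16/bf16 based on the checkpoint
        device_map="auto",   # spread layers across available devices
    )
    return tokenizer, model


def chat_prompt(tokenizer, user_message: str) -> str:
    """Format a single-turn chat using the model's built-in chat template."""
    messages = [{"role": "user", "content": user_message}]
    return tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
```

After `load_qwen2()`, the string returned by `chat_prompt` can be tokenized and passed to `model.generate`. Note that reaching the 128K-token context mentioned above may require extra long-context configuration that this sketch does not cover.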