
Hello Qwen2

Jun 06, 2024 - qwenlm.github.io
The article announces the evolution from Qwen1.5 to Qwen2, a series of pretrained and instruction-tuned models in five sizes, trained on data in 27 additional languages besides English and Chinese. The models — Qwen2-0.5B, Qwen2-1.5B, Qwen2-7B, Qwen2-57B-A14B, and Qwen2-72B — have been open-sourced on Hugging Face and ModelScope. They deliver state-of-the-art results on numerous benchmark evaluations, improved performance in coding and mathematics, and extended context length support of up to 128K tokens.

The article also highlights the models' enhanced capabilities in coding and mathematics, long-context understanding, and safety and responsibility. It notes that the licenses have been changed: all models except Qwen2-72B now adopt Apache 2.0. The article concludes by stating that larger Qwen2 models are in training, and that the Qwen2 language models are being extended to understand both vision and audio information.

Key takeaways:

  • The Qwen2 series includes pretrained and instruction-tuned models in five sizes — Qwen2-0.5B, Qwen2-1.5B, Qwen2-7B, Qwen2-57B-A14B, and Qwen2-72B — trained on data in 27 additional languages besides English and Chinese.
  • Qwen2 models have significantly improved performance in coding and mathematics, and Qwen2-7B-Instruct and Qwen2-72B-Instruct support context lengths of up to 128K tokens.
  • The Qwen2-72B-Instruct model performs comparably to GPT-4 in terms of safety, and significantly outperforms the Mixtral-8x22B model in handling multilingual prompts.
  • All models have been released on Hugging Face and ModelScope, and all except Qwen2-72B now adopt the Apache 2.0 license to enhance their openness to the community.
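Since the models are available on Hugging Face, the instruct variants can be prompted with the ChatML-style chat format that Qwen models use. Below is a minimal sketch of that formatting, assuming the standard `<|im_start|>` / `<|im_end|>` template; in practice the Hugging Face tokenizer's `apply_chat_template` method produces this prompt for you, and the `to_chatml` helper here is purely illustrative.

```python
# Hand-rolled sketch of the ChatML-style prompt format used by Qwen
# instruct models. For real use, prefer the tokenizer's built-in
# apply_chat_template; this illustrates what that template produces.

def to_chatml(messages):
    """Render a list of {role, content} dicts as a ChatML prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # A trailing assistant header cues the model to generate its reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give me a short introduction to LLMs."},
]
print(to_chatml(messages))
```

The rendered string would then be tokenized and passed to the model's `generate` method; generation stops when the model emits the `<|im_end|>` token.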
