In benchmark evaluations, the Hunyuan-Large pre-trained model outperformed both dense and MoE-based competitors in overall performance. It showed particular strength in commonsense understanding and reasoning, as well as classical NLP tasks such as question answering and reading comprehension, and it excelled in mathematics, outperforming all baselines on the math datasets. The Hunyuan-Large-Instruct model likewise demonstrated significant improvements on most task types compared to LLMs with a similar number of activated parameters, indicating the effectiveness of post-training.
Key takeaways:
- The Hunyuan-Large (Hunyuan-MoE-A52B) model is the largest open-source Transformer-based Mixture of Experts (MoE) model in the industry, with 389 billion total parameters and 52 billion activated parameters (a sparse-activation sketch follows this list).
- The model has been optimized to handle long-context inputs, reduce memory usage and computational overhead, and ensure that each expert sub-model learns effectively from the data and contributes to overall performance (a KV-cache sketch also follows the list).
- Hunyuan-Large outperforms all baselines on the math datasets GSM8K and MATH, and also achieves the best results on the Chinese math dataset CMATH. It likewise attains the best overall performance across all Chinese tasks.
- The Hunyuan-Large-Instruct model demonstrates superior understanding and reasoning capabilities across a wide array of language tasks and achieves the best performance on the MMLU and MATH datasets.
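
To make the total-versus-activated parameter distinction concrete, here is a minimal PyTorch sketch of a sparse MoE feed-forward layer with one shared expert and top-1 routing. It is an illustration under assumed names and sizes (`SparseMoELayer`, 16 experts, toy dimensions), not Hunyuan-Large's actual implementation or configuration; it only shows why a routed MoE runs a small fraction of its total parameters for each token.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseMoELayer(nn.Module):
    """Illustrative shared-expert + top-1 routed MoE feed-forward layer.

    Only the shared expert and the single routed expert chosen per token
    run in the forward pass, which is why a sparse MoE model's activated
    parameter count is far below its total parameter count.
    (Hypothetical sketch; not the Hunyuan-Large implementation.)
    """

    def __init__(self, d_model: int, d_ff: int, num_experts: int):
        super().__init__()
        self.shared_expert = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model)
        )
        self.experts = nn.ModuleList(
            [
                nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
                for _ in range(num_experts)
            ]
        )
        self.router = nn.Linear(d_model, num_experts)  # per-expert routing scores

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        scores = F.softmax(self.router(x), dim=-1)   # routing probabilities
        top_score, top_idx = scores.max(dim=-1)      # top-1 expert per token
        routed = torch.zeros_like(x)
        for expert_id, expert in enumerate(self.experts):
            mask = top_idx == expert_id              # tokens sent to this expert
            if mask.any():
                routed[mask] = top_score[mask].unsqueeze(-1) * expert(x[mask])
        # every token always passes through the shared expert
        return self.shared_expert(x) + routed


if __name__ == "__main__":
    layer = SparseMoELayer(d_model=64, d_ff=256, num_experts=16)
    tokens = torch.randn(8, 64)
    print(layer(tokens).shape)  # torch.Size([8, 64])
```

Because each token touches only the shared expert plus one routed expert, per-token compute scales with the activated parameters rather than the full parameter count.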
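
One common way to cut attention memory for long-context inputs is to share key/value heads across groups of query heads (grouped-query attention), which shrinks the KV cache that must stay resident during generation. The sketch below illustrates that general technique; the function name, head counts, and dimensions are assumptions for the example and are not claimed to be the specific optimization used in Hunyuan-Large.

```python
import torch
import torch.nn.functional as F


def grouped_query_attention(q, k, v, num_kv_heads):
    """Illustrative grouped-query attention (hypothetical sketch).

    q: (batch, num_q_heads, seq, head_dim)
    k, v: (batch, num_kv_heads, seq, head_dim)
    Each group of query heads shares one key/value head, so the KV cache
    kept in memory for long contexts shrinks by num_q_heads / num_kv_heads.
    """
    num_q_heads = q.shape[1]
    group_size = num_q_heads // num_kv_heads
    # expand each shared KV head across its group of query heads
    k = k.repeat_interleave(group_size, dim=1)
    v = v.repeat_interleave(group_size, dim=1)
    return F.scaled_dot_product_attention(q, k, v, is_causal=True)


if __name__ == "__main__":
    batch, q_heads, kv_heads, seq, dim = 1, 8, 2, 16, 32
    q = torch.randn(batch, q_heads, seq, dim)
    k = torch.randn(batch, kv_heads, seq, dim)
    v = torch.randn(batch, kv_heads, seq, dim)
    print(grouped_query_attention(q, k, v, num_kv_heads=kv_heads).shape)
    # torch.Size([1, 8, 16, 32]), with a KV cache 4x smaller than full multi-head attention
```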