Nvidia attributes much of its success to software improvements, logging a 27 percent improvement over its June 2023 MLPerf results for GPT-3 training. The company also used a scheme called flash attention, an algorithm that speeds up transformer networks by minimizing writes to memory, which shaved as much as 10 percent from training times. Looking ahead, training rounds in 2025 may see head-to-head contests comparing new accelerators from AMD, Intel, and Nvidia.
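To make the memory-saving idea concrete, here is a minimal NumPy sketch of the tiling-with-online-softmax trick behind flash attention. The function name, shapes, and block size are illustrative assumptions; the real technique runs as a fused GPU kernel, not Python loops.

```python
import numpy as np

def flash_attention(Q, K, V, block_size=64):
    """Tiled scaled-dot-product attention with an online softmax.

    Processes K/V in blocks so the full N x N score matrix is never
    materialized -- the core memory-traffic saving of flash attention.
    (Illustrative sketch; not Nvidia's implementation.)
    """
    n, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    out = np.zeros_like(Q)
    row_max = np.full(n, -np.inf)   # running max of each query's logits
    row_sum = np.zeros(n)           # running softmax normalizer

    for start in range(0, K.shape[0], block_size):
        Kb = K[start:start + block_size]   # current block of keys
        Vb = V[start:start + block_size]   # matching block of values
        S = (Q @ Kb.T) * scale             # logits for this block only

        new_max = np.maximum(row_max, S.max(axis=1))
        # Rescale the already-accumulated output and normalizer so every
        # exponential is taken relative to the updated running max.
        correction = np.exp(row_max - new_max)
        P = np.exp(S - new_max[:, None])   # block-local softmax numerators
        row_sum = row_sum * correction + P.sum(axis=1)
        out = out * correction[:, None] + P @ Vb
        row_max = new_max

    return out / row_sum[:, None]

# Sanity check against a naive attention that builds the full score matrix.
def naive_attention(Q, K, V):
    S = (Q @ K.T) / np.sqrt(Q.shape[1])
    P = np.exp(S - S.max(axis=1, keepdims=True))
    return (P / P.sum(axis=1, keepdims=True)) @ V

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((256, 32)) for _ in range(3))
assert np.allclose(flash_attention(Q, K, V), naive_attention(Q, K, V))
```

Because only a running max and normalizer are kept per query row, each block of scores can be discarded right after use, which is what cuts the writes to memory.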
Key takeaways:
- Nvidia continues to dominate machine learning benchmarks, including the new tests released by MLPerf, which focus on fine-tuning of large language models and graph neural networks.
- Despite using the same Hopper architecture as last year, Nvidia managed to cut training times through software improvements, a 27 percent gain over the June 2023 MLPerf benchmarks.
- MLPerf added new benchmarks this year, covering fine-tuning of large language models and graph neural networks, to keep pace with developments in the AI industry.
- Future training rounds in 2025 may pit new accelerators from AMD, Intel, and Nvidia against one another; Nvidia plans to unveil a new architecture, Blackwell, later this year.