The article also presents a separate leaderboard for code generation, led by Claude 3 Opus, Claude 3 Haiku, and Claude 3 Sonnet. A comparison of the proprietary models' cost and context window is provided as well, with Claude 3 Haiku having the lowest cost per 1M tokens. The data for these comparisons is sourced from the models' technical reports and public product/pricing pages.
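To make the per-1M-token pricing concrete, here is a minimal sketch of the underlying arithmetic. The function name and the $0.25 figure are illustrative placeholders, not prices quoted from the leaderboard.

```python
def request_cost(input_tokens: int, price_per_million: float) -> float:
    """Return the input cost in USD for a prompt of `input_tokens` tokens,
    given a price expressed per 1M input tokens."""
    return input_tokens / 1_000_000 * price_per_million

# Example: a 10,000-token prompt at a hypothetical $0.25 per 1M input tokens
print(f"${request_cost(10_000, 0.25):.4f}")  # -> $0.0025
```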
Key takeaways:
- The leaderboard compares various commercial and open-source large language models (LLMs) based on their capabilities, price, and context window.
- The models with the largest context windows are Claude 2.1, GPT-4 Turbo, and Gemini Pro 1.5, while Gemini Pro, Mistral Tiny, and GPT-3.5 Turbo have the lowest input cost per 1M tokens (see the sketch after this list).
- The comparison covers benchmarks in math, science, reasoning, and coding; Claude 3 Opus, Gemini 1.5 Pro, and GPT-4 are among the top performers on these benchmarks.
- The data used for the leaderboard is sourced from the technical reports of the models and public product/pricing pages.
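For readers who want to reproduce this kind of ranking themselves, the sketch below sorts a small model table by input cost and by context window. The `Model` dataclass and every figure in it are hypothetical placeholders, not the leaderboard's actual data.

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    context_window: int        # tokens
    input_cost_per_1m: float   # USD per 1M input tokens

# Placeholder figures for illustration only
models = [
    Model("Claude 3 Haiku", 200_000, 0.25),
    Model("GPT-4 Turbo", 128_000, 10.00),
    Model("Gemini Pro 1.5", 1_000_000, 7.00),
]

# Rank by cheapest input cost, then by largest context window
by_cost = sorted(models, key=lambda m: m.input_cost_per_1m)
by_context = sorted(models, key=lambda m: m.context_window, reverse=True)

print("Cheapest input cost:", [m.name for m in by_cost])
print("Largest context window:", [m.name for m in by_context])
```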