Ai2 says its new AI model beats one of DeepSeek's best

Ai2, a nonprofit AI research institute in Seattle, has released a new AI model called Tulu3-405B, which Ai2 says outperforms DeepSeek V3, a leading system from Chinese AI company DeepSeek, as well as OpenAI's GPT-4o on certain benchmarks. Tulu3-405B is open source, so it can be replicated from scratch using freely available components. Ai2 says the release highlights the U.S.'s potential to lead the global development of competitive, open-source AI models. The model contains 405 billion parameters and was trained on 256 GPUs running in parallel. To achieve its performance, Ai2 used a technique called reinforcement learning with verifiable rewards (RLVR).

Tulu3-405B excelled on benchmarks such as PopQA, where it outperformed DeepSeek V3, GPT-4o, and Meta’s Llama 3.1 405B model, and it achieved the highest performance in its class on GSM8K, a test of grade school-level math word problems. The model can be tested via Ai2’s chatbot web app, and the training code is available on GitHub and Hugging Face. An Ai2 spokesperson suggests that the model marks a pivotal moment in AI development, showcasing the U.S.'s ability to lead with competitive, open-source AI independent of major tech companies.

Key takeaways:

Ai2 released Tulu3-405B, an open-source AI model that outperforms DeepSeek V3 and GPT-4o on certain benchmarks.
Tulu3-405B contains 405 billion parameters and was trained on 256 GPUs running in parallel.
The model uses reinforcement learning with verifiable rewards (RLVR) to achieve competitive performance on tasks with verifiable outcomes.
Tulu3-405B is available for testing via Ai2’s chatbot web app, and its code is accessible on GitHub and Hugging Face.
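
To make the idea of a "verifiable reward" concrete, here is a minimal sketch in Python of how a reward could be computed for a GSM8K-style math word problem. It is an illustration only, not Ai2's training code: the function names, the rule of taking the last number in the output as the final answer, and the binary 1.0/0.0 reward are all assumptions made for this example.

```python
import re
from typing import Optional


def extract_final_number(text: str) -> Optional[str]:
    """Return the last number found in the text (hypothetical helper).

    Assumption: the model states its final answer as the last number
    in its output. Thousands separators such as "1,200" are stripped.
    """
    matches = re.findall(r"-?\d[\d,]*(?:\.\d+)?", text)
    if not matches:
        return None
    return matches[-1].replace(",", "")


def verifiable_reward(model_output: str, reference_answer: str) -> float:
    """Return 1.0 if the extracted answer matches the reference, else 0.0.

    The reward is "verifiable" because it comes from a programmatic check
    against a known correct answer rather than from a learned reward model.
    """
    predicted = extract_final_number(model_output)
    if predicted is None:
        return 0.0
    try:
        return 1.0 if float(predicted) == float(reference_answer) else 0.0
    except ValueError:
        # The reference or prediction could not be parsed as a number.
        return 0.0


if __name__ == "__main__":
    output = "She starts with 3 apples and buys 4 more, so 3 + 4 = 7. The answer is 7."
    print(verifiable_reward(output, "7"))
```

In a reinforcement learning loop, a score like this would serve as the reward signal for each sampled response. A real setup would likely need more careful answer parsing than this sketch provides.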

Ai2 says its new AI model beats one of DeepSeek's best | TechCrunch
