Researchers created an open rival to OpenAI's o1 'reasoning' model for under $50

AI researchers at Stanford and the University of Washington have trained a reasoning model called s1 for under $50 in cloud compute credits. The model, which performs comparably to advanced reasoning models such as OpenAI’s o1 and DeepSeek’s R1, was created by fine-tuning an off-the-shelf base model through distillation: it was trained on the answers of a stronger model, in this case Google’s Gemini 2.0 Flash Thinking Experimental. The s1 model, along with its training data and code, is available on GitHub. The researchers used a dataset of 1,000 curated questions and answers, and training took less than 30 minutes on 16 Nvidia H100 GPUs. The result raises questions about the commoditization of AI models, because it suggests that capabilities that were expensive to develop can be closely replicated with minimal resources.
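To put the compute figures in perspective, here is a rough back-of-the-envelope calculation in Python. It uses only the upper bounds reported above (16 GPUs, 30 minutes, $50); the function names are illustrative, and the resulting hourly rate is a derived ceiling, not a price quoted by the researchers or any cloud provider.

```python
def gpu_hours(num_gpus: int, wall_clock_hours: float) -> float:
    """GPU-hours used by a run that occupies every GPU for its whole duration."""
    return num_gpus * wall_clock_hours


def max_hourly_rate(budget_usd: float, total_gpu_hours: float) -> float:
    """Highest per-GPU hourly price at which the run still fits the budget."""
    return budget_usd / total_gpu_hours


# Upper bounds from the article: 16 H100 GPUs, 30 minutes, $50.
hours = gpu_hours(num_gpus=16, wall_clock_hours=0.5)
rate = max_hourly_rate(budget_usd=50.0, total_gpu_hours=hours)
print(f"{hours:.1f} GPU-hours; fits in $50 at up to ${rate:.2f} per GPU-hour")
```

In other words, even a run that used the full 30 minutes on all 16 GPUs would consume only 8 GPU-hours, and it would stay within $50 at any rental price up to $6.25 per GPU-hour.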

The s1 model highlights the potential of distillation and supervised fine-tuning (SFT) as cost-effective ways to develop AI models, in contrast to the more expensive reinforcement learning techniques used by others such as DeepSeek. That same affordability is a problem for large AI labs, which are investing heavily in AI infrastructure: if a small team can closely replicate a costly model, the competitive advantage of building it shrinks. The researchers also improved s1’s accuracy by instructing it to "wait" during reasoning, which gives the model more time to think before it answers. While distillation offers a way to replicate existing models affordably, it does not necessarily lead to groundbreaking new ones.
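For readers who want a concrete picture of distillation followed by SFT, the sketch below shows one plausible way to turn a teacher model's outputs into fine-tuning examples. It is not the s1 team's code: `ask_teacher`, `fake_teacher`, the `<think>` delimiters, and the example question are all hypothetical stand-ins, and a real pipeline would call an actual teacher model and feed the result to a training library.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TeacherOutput:
    reasoning: str  # the teacher's step-by-step thinking
    answer: str     # the teacher's final answer


@dataclass
class SftExample:
    prompt: str  # what the student model sees
    target: str  # what the student model is trained to produce


def build_sft_dataset(
    questions: List[str],
    ask_teacher: Callable[[str], TeacherOutput],
) -> List[SftExample]:
    """Pair each curated question with the teacher's reasoning and answer."""
    examples = []
    for question in questions:
        out = ask_teacher(question)
        # Assumed layout: reasoning wrapped in <think> tags, then the answer.
        target = f"<think>\n{out.reasoning}\n</think>\n{out.answer}"
        examples.append(SftExample(prompt=question, target=target))
    return examples


def fake_teacher(question: str) -> TeacherOutput:
    """Hypothetical stand-in for a call to a real teacher model."""
    return TeacherOutput(reasoning="2 plus 3 makes 5.", answer="5")


dataset = build_sft_dataset(["What is 2 + 3?"], fake_teacher)
print(dataset[0].target)
```

The student is then fine-tuned to reproduce each `target` given its `prompt`, which is why a small but carefully curated set of questions can go a long way.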

Key takeaways:

AI researchers at Stanford and the University of Washington trained an AI reasoning model, s1, for under $50 in cloud compute credits, achieving performance similar to leading models.
The s1 model was fine-tuned through distillation from Google's Gemini 2.0 Flash Thinking Experimental, using a small dataset of 1,000 curated questions and answers.
Distillation allows expensive AI models to be replicated at a fraction of the cost, raising concerns about the commoditization of AI models and the erosion of competitive advantages.
The researchers improved s1's accuracy by instructing it to "wait" during reasoning, a simple yet effective way to enhance performance.
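
The "wait" instruction can be illustrated with a toy loop. The article says only that the model was told to "wait" during reasoning; how the word is injected here (appended whenever the model stops, then generation resumes) is an assumption, and `continue_reasoning` and `toy_model` are hypothetical placeholders for a real model call.

```python
from typing import Callable


def reason_with_wait(
    continue_reasoning: Callable[[str], str],
    prompt: str,
    extra_rounds: int = 1,
    nudge: str = "Wait",
) -> str:
    """Let the model reason, then nudge it to keep going `extra_rounds` times."""
    reasoning = continue_reasoning(prompt)
    for _ in range(extra_rounds):
        # Instead of accepting the first stopping point, append the nudge
        # and ask the model to continue from there.
        reasoning += f"\n{nudge}"
        reasoning += continue_reasoning(prompt + reasoning)
    return reasoning


def toy_model(text: str) -> str:
    """Scripted stand-in for a model: it rechecks its work after a nudge."""
    if text.endswith("Wait"):
        return ", let me recheck: 17 + 25 = 42."
    return "17 + 25 = 41."


result = reason_with_wait(toy_model, "What is 17 + 25?\n")
print(result)
```

The scripted model is there only to make the control flow visible: the first pass stops on a wrong answer, and the appended "Wait" triggers a second pass that corrects it.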

Researchers created an open rival to OpenAI's o1 'reasoning' model for under $50 | TechCrunch
