OpenAI launches o3-mini, its latest 'reasoning' model | TechCrunch

Jan 31, 2025 - techcrunch.com
OpenAI has launched o3-mini, a new AI reasoning model in its o family, with a focus on making advanced AI more accessible. The model is designed for STEM problems, particularly programming, math, and science, and is positioned as both powerful and affordable. It is reportedly faster and cheaper than its predecessors, with external testers preferring its answers over those from o1-mini. O3-mini is available to all ChatGPT users, with premium users receiving higher query limits, and will soon be accessible via OpenAI's API for select developers. The model lets users adjust the level of reasoning effort, trading response speed against accuracy.
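
For API users, the adjustable reasoning effort is exposed as a request parameter. Below is a minimal sketch of what such a call might look like with the OpenAI Python SDK; the reasoning_effort parameter name and its accepted values are assumptions inferred from the article's description rather than details confirmed by it.

    # Minimal sketch: calling o3-mini with an explicit reasoning-effort setting
    # through the OpenAI Python SDK. The reasoning_effort parameter and its
    # values ("low", "medium", "high") are assumptions based on the article's
    # description of adjustable reasoning effort.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    response = client.chat.completions.create(
        model="o3-mini",
        reasoning_effort="high",  # higher effort: more thorough reasoning, more latency
        messages=[
            {"role": "user", "content": "How many prime numbers are there below 100?"}
        ],
    )

    print(response.choices[0].message.content)

In this sketch, raising the effort level would trade additional latency for more thorough reasoning, mirroring the speed/accuracy balance described above.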

Despite not being the most powerful model, o3-mini shows competitive performance against rivals like DeepSeek's R1, especially with high reasoning effort. It excels in certain benchmarks, such as AIME 2024 and SWE-bench Verified, but lags behind in others like GPQA Diamond. OpenAI emphasizes the model's cost-effectiveness and safety, claiming it surpasses previous models in safety evaluations. The release of o3-mini is part of OpenAI's broader mission to advance cost-effective intelligence while addressing challenges in the AI landscape.

Key takeaways:

  • OpenAI launched o3-mini, a new AI reasoning model, which is positioned as both powerful and affordable, and is aimed at broadening accessibility to advanced AI.
  • O3-mini is fine-tuned for STEM problems and is claimed to be more reliable than previous models, with external testers preferring its answers over those from o1-mini more than half the time.
  • The model is available via ChatGPT and OpenAI's API, with pricing set at $1.10 per million cached input tokens and $4.40 per million output tokens, making it 63% cheaper than o1-mini.
  • O3-mini is not OpenAI's most powerful model and does not surpass DeepSeek's R1 on every benchmark, but it offers competitive performance with lower cost and latency, especially at high reasoning effort.