DeepSeek's distilled new R1 AI model can run on a single GPU

May 29, 2025 - techcrunch.com
DeepSeek has released a smaller, distilled version of its updated R1 reasoning AI model, named DeepSeek-R1-0528-Qwen3-8B. Built on Alibaba's Qwen3-8B, the model outperforms similarly sized models such as Google's Gemini 2.5 Flash on the AIME 2025 math benchmark and nearly matches Microsoft's Phi 4 reasoning plus model on the HMMT math skills test. While distilled models are generally less capable than their full-sized counterparts, they require far less computational power: DeepSeek-R1-0528-Qwen3-8B can run on a single GPU with 40GB-80GB of RAM, whereas the full-sized R1 needs around a dozen 80GB GPUs.

DeepSeek trained this model by fine-tuning Qwen3-8B with text generated by the updated R1. The model is available on the AI dev platform Hugging Face and is intended for both academic research and industrial development of small-scale models. It is released under a permissive MIT license, allowing unrestricted commercial use. Several platforms, including LM Studio, offer the model through an API.
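For readers who want to try the checkpoint locally, below is a minimal sketch of loading it with Hugging Face's transformers library. The repo id is assumed from the model name given in the article, and the dtype, device placement, and generation settings are illustrative choices rather than DeepSeek's documented configuration.

```python
# Minimal sketch: load DeepSeek-R1-0528-Qwen3-8B from Hugging Face and run one prompt.
# Assumes `transformers`, `torch`, and `accelerate` are installed and a GPU in the
# 40GB-80GB range is available, per the article's stated requirements.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B"  # repo id assumed from the model name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # place weights on the available GPU
)

# Reasoning models typically emit a chain of thought before the final answer,
# so leave generous room in max_new_tokens.
messages = [{"role": "user", "content": "How many primes are there below 100?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=1024)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Hosted options such as LM Studio expose the same model through an API, which sidesteps the local GPU requirement entirely.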

Key takeaways:

  • DeepSeek released a smaller, distilled version of its R1 model, called DeepSeek-R1-0528-Qwen3-8B, which outperforms comparably sized models on certain benchmarks.
  • The distilled model performs better than Google's Gemini 2.5 Flash on the AIME 2025 math test and nearly matches Microsoft's Phi 4 reasoning plus model on the HMMT test.
  • DeepSeek-R1-0528-Qwen3-8B is less computationally demanding than the full-sized R1 model, requiring a single GPU with 40GB-80GB of RAM.
  • The model is available under a permissive MIT license and can be accessed through various hosts, including LM Studio, via an API.