DeepSeek may have used Google's Gemini to train its latest model

Last week, Chinese lab DeepSeek released an updated version of its R1 reasoning AI model, R1-0528, which performs well on math and coding benchmarks. However, there is speculation that DeepSeek may have trained this model on outputs from Google’s Gemini AI, based on similarities in language and expressions noted by developers like Sam Paech and the creator of SpeechMap. DeepSeek has faced similar accusations before: earlier evidence suggested it may have used distillation, training its own model on ChatGPT outputs, which OpenAI’s terms of service prohibit.

In response to concerns about distillation, AI companies like OpenAI and Google are tightening security. OpenAI now requires ID verification for access to its advanced models, a process that excludes countries like China, while Google and Anthropic are summarizing the traces their models generate to make it harder for competitors to train on them. Despite these measures, experts like Nathan Lambert believe that DeepSeek might still use synthetic data from top API models like Gemini, since it is short on GPUs but well funded.

Key takeaways:

DeepSeek released an updated version of its R1 reasoning AI model, which some speculate may have been trained on outputs from Google's Gemini AI.
There are accusations that DeepSeek has previously used data from rival AI models, including OpenAI's ChatGPT, through a technique called distillation.
AI experts suggest that DeepSeek might be using synthetic data from top API models because it is short on GPUs but well funded.
AI companies like OpenAI and Google are implementing security measures to prevent distillation and protect their competitive advantages.

DeepSeek may have used Google's Gemini to train its latest model | TechCrunch
