
Exclusive: Google Gemini is using Claude to improve its AI

Dec 24, 2024 - techcrunch.com
Contractors working on Google's Gemini AI are evaluating its performance by comparing its responses to those of Anthropic's Claude model. The process involves scoring each model's output against criteria such as truthfulness and verbosity, with contractors given up to 30 minutes per prompt. Internal communications reveal that Claude's responses often emphasize safety more than Gemini's, with Claude sometimes refusing to answer prompts it deems unsafe. Concerns have also been raised that Gemini generates inaccurate information on sensitive topics, because contractors are asked to rate responses on subjects outside their areas of expertise.

Google, a major investor in Anthropic, has not confirmed whether it obtained permission to use Claude for these evaluations. While Google DeepMind spokesperson Shira McNamara stated that comparing model outputs is standard practice, she denied that Gemini is trained on Anthropic models. Anthropic's terms of service prohibit using Claude to build or train competing AI models without approval. TechCrunch's report highlights the competitive nature of AI development and the ethical considerations involved in evaluating and improving AI models.

Key takeaways:

  • Contractors working on Google's Gemini AI are comparing its outputs against Anthropic's Claude without confirmed permission from Anthropic.
  • Contractors evaluate AI responses based on criteria like truthfulness and verbosity, with up to 30 minutes per prompt.
  • Claude's responses are noted for emphasizing safety, sometimes refusing to answer prompts deemed unsafe.
  • Google claims it compares model outputs for evaluation but denies using Anthropic models to train Gemini.
