Smarter than GPT-4: Claude 3 AI catches researchers testing it

Mar 06, 2024 - newatlas.com
Anthropic, a company founded by former OpenAI team members, has announced its latest AI model, Claude 3, which it claims has surpassed GPT-4 and Google's Gemini 1.0 model on a range of multimodal tests. The Claude 3 models are designed with a 200,000-token context window and can generate near-instant responses to inputs exceeding a million tokens. The AI is also less likely to refuse questions that come close to its safety and decency guardrails. It is designed for business users, excelling at following complex instructions and adhering to brand voice and response guidelines.

Claude 3 has set new industry records in benchmark tests, particularly in zero-shot math abilities and the HumanEval coding test. However, there's currently no equivalent benchmark data on Google's Gemini 1.5 and OpenAI's GPT-4 Turbo models, so these models may still have an advantage in real-world applications. The AI's capabilities have raised questions about self-awareness in AI, as Claude 3 has shown signs of recognizing when it's being tested.

Key takeaways:

  • Anthropic, a company founded by former OpenAI team members, has announced Claude 3, an AI model that it claims has surpassed GPT-4 and Google's Gemini 1.0 model on a range of multimodal tests.
  • The Claude 3 models can generate near-instant responses given inputs exceeding a million tokens, and are less likely to refuse to answer questions deemed close to the guardrails of safety and decency.
  • The AI is designed with a heavy slant toward business users, with strong visual capabilities and adeptness at following complex instructions and adhering to brand voice and response guidelines.
  • Despite Claude 3's impressive performance on benchmark tests, Google's Gemini 1.5 and OpenAI's GPT-4 Turbo models aren't represented in the data, so they may still hold an advantage in real-world applications.
