We tested Anthropic's new chatbot -- and came away a bit disappointed

Anthropic, an AI startup backed by Google and Amazon, has released a family of models, Claude 3, which it claims outperforms OpenAI’s GPT-4 on various benchmarks. TechCrunch tested the most capable of these models, Claude 3 Opus, on a range of questions, from politics to healthcare, to assess its performance. Opus is a multimodal model trained on public and proprietary text and image data dated before August 2023. It is available on the web in a chatbot interface with a subscription to Anthropic’s Claude Pro plan, through Anthropic’s API, and through Amazon’s Bedrock and Google’s Vertex AI dev platforms.

The testing found that while Opus is among the more helpful chatbots, providing succinct and actionable answers, it struggles with questions about specific events that occurred within the last year, even though its training data reportedly extends to August 2023. Opus also lacks third-party app and service integrations, which limits its capabilities. Given the $20-per-month cost of Anthropic’s Claude Pro plan, the same price as OpenAI’s and Google’s premium chatbot plans, TechCrunch found its performance a bit underwhelming.

Key takeaways:

Anthropic claims that its Claude 3 Opus model outperforms OpenAI’s GPT-4 on various benchmarks, but the model struggles with current and recent historical events because it lacks internet access.
Claude 3 Opus is a multimodal model trained on public and proprietary text and image data dated before August 2023, and it supports a 200,000-token context window, equivalent to about 150,000 words.
Despite its limitations, Opus is considered one of the more helpful chatbots, providing succinct, jargon-free and actionable answers.
Opus lacks third-party app and service integrations, which limits its capabilities compared to other chatbots like Gemini Ultra and ChatGPT.

