VERSES® Genius™ Outperforms DeepSeek R1 Model in Code-Breaking "Mastermind" Challenge

VERSES AI Inc. announced that its flagship product, Genius, outperformed China's DeepSeek R1 model in a code-breaking "Mastermind" challenge. Genius solved the code 100% of the time, compared to DeepSeek's 45% success rate. Genius was also significantly faster and more cost-efficient, completing 100 games in just over 5 minutes at a cost of $0.05, whereas DeepSeek took 26 hours and cost $38.94. According to VERSES, the challenge highlighted Genius's multi-step reasoning capabilities, which the company considers crucial for dynamic real-world applications.

VERSES positions Genius as a domain-specific model that complements large language models (LLMs) by making AI agents more accurate and reliable. The company believes that Genius's ability to perform multi-step reasoning and adapt dynamically to feedback is essential for AI adoption across various industries. VERSES aims to address the "last mile" challenge of AI accuracy, which it sees as key to unlocking broader market adoption.

Key takeaways

VERSES' Genius AI significantly outperformed China's DeepSeek R1 model in the Mastermind code-breaking challenge, achieving a 100% success rate compared to R1's 45%.
Genius demonstrated superior speed and cost-efficiency, solving games 245 times faster and 779 times cheaper than DeepSeek R1.
The challenge highlighted Genius's advanced multi-step reasoning capabilities, leveraging a Bayesian approach and Active Inference for dynamic adaptation.
VERSES positions Genius as a complementary tool to large language models, enhancing AI agents' accuracy and reliability for real-world applications.
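
To make the idea of multi-step reasoning from feedback concrete, here is a minimal sketch of a belief-updating Mastermind solver. It is not VERSES' implementation and does not use Active Inference; it only illustrates the simplest Bayesian reading of the game, where a uniform prior over all codes is narrowed after each guess to the codes consistent with the feedback. The board size (4 positions, 6 colours) and the names `feedback` and `solve` are assumptions made for illustration, since the announcement does not give these details.

```python
from collections import Counter
from itertools import product


def feedback(secret, guess):
    """Return (exact, partial) for a guess.

    exact   = pegs with the right colour in the right position
    partial = pegs with the right colour in the wrong position
    """
    exact = sum(s == g for s, g in zip(secret, guess))
    # Multiset intersection counts colour matches regardless of position.
    overlap = sum((Counter(secret) & Counter(guess)).values())
    return exact, overlap - exact


def solve(secret, colours=6, length=4):
    """Guess until the secret is found; return the list of guesses made.

    Assumed setup (hypothetical, not from the announcement):
    `colours` distinct colours and `length` positions.
    """
    # Uniform prior: every possible code starts out equally plausible.
    hypotheses = list(product(range(colours), repeat=length))
    guesses = []
    while True:
        # Any code that is still consistent has non-zero posterior mass.
        guess = hypotheses[0]
        guesses.append(guess)
        result = feedback(secret, guess)
        if result[0] == length:
            return guesses
        # The feedback is deterministic, so the posterior update reduces to
        # keeping only codes that would have produced the same feedback.
        hypotheses = [h for h in hypotheses if feedback(h, guess) == result]


if __name__ == "__main__":
    history = solve((3, 1, 4, 1))
    print(f"solved in {len(history)} guesses; final guess {history[-1]}")
```

Each wrong guess removes at least that guess from the candidate set while the true code always remains, so the loop is guaranteed to terminate. A more capable agent would also choose guesses that are expected to shrink the candidate set fastest, rather than taking the first consistent code.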
