VERSES Genius™ Outperforms OpenAI Model in Code-Breaking Challenge, "Mastermind"

VERSES AI Inc. announced that its cognitive computing product, Genius, outperformed OpenAI's o1-preview model in a code-breaking challenge based on the game Mastermind. Over 100 test runs, Genius cracked the code 100% of the time, compared with a 71% success rate for o1-preview. Genius was also 140 times faster and 5,260 times cheaper: it completed the 100 games in just over 5 minutes at an estimated cost of $0.05 USD, while OpenAI's model took 12.5 hours and cost $263 USD. According to VERSES, the test highlights Genius's efficiency and accuracy on logical reasoning tasks and positions it as a promising tool for applications that require causal reasoning and accuracy.
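The speed and cost multiples quoted above can be cross-checked with a few lines of arithmetic. The Python sketch below is only a consistency check on the reported figures, not part of the VERSES benchmark. The constant and function names are illustrative assumptions, and the Genius runtime is inferred from the 140x claim because the announcement gives it only as "just over 5 minutes".

```python
# Consistency check on the figures reported in the announcement.
# All names here are illustrative; only the numeric inputs come from the article.

OPENAI_HOURS = 12.5        # reported total compute time for o1-preview, 100 games
REPORTED_SPEEDUP = 140     # reported "times faster" figure for Genius
OPENAI_COST_USD = 263.0    # reported cost for o1-preview, 100 games
GENIUS_COST_USD = 0.05     # reported estimated cost for Genius, 100 games
GAMES = 100


def implied_genius_minutes(openai_hours: float, speedup: float) -> float:
    """Genius runtime in minutes implied by the reported speedup."""
    return openai_hours * 60 / speedup


def cost_ratio(openai_cost: float, genius_cost: float) -> float:
    """How many times cheaper Genius was, from the two reported totals."""
    return openai_cost / genius_cost


def cost_per_game(total_cost: float, games: int) -> float:
    """Average cost of a single game in USD."""
    return total_cost / games


genius_minutes = implied_genius_minutes(OPENAI_HOURS, REPORTED_SPEEDUP)
ratio = cost_ratio(OPENAI_COST_USD, GENIUS_COST_USD)

print(f"Implied Genius runtime: {genius_minutes:.2f} minutes")
print(f"Cost ratio: {ratio:,.0f}x")
print(f"o1-preview cost per game: ${cost_per_game(OPENAI_COST_USD, GAMES):.2f}")
print(f"Genius cost per game: ${cost_per_game(GENIUS_COST_USD, GAMES):.4f}")
```

The implied runtime of about 5.36 minutes matches "just over 5 minutes", and $263 divided by $0.05 gives the 5,260x cost figure, so the reported numbers are internally consistent.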

VERSES said the result exposes the limitations of language-based models such as OpenAI's on logical reasoning tasks, and that it plans to demonstrate Genius's capabilities further in upcoming benchmarks. VERSES CEO Gabriel René noted that Genius's ability to handle complex reasoning tasks efficiently makes it well suited to real-world applications such as cybersecurity and financial forecasting. The company is focused on developing intelligent software systems inspired by natural principles to enhance human potential through technology.

Key takeaways:

VERSES Genius™ outperformed OpenAI's o1-preview model in a code-breaking challenge, demonstrating superior accuracy, speed, and cost efficiency.
Genius achieved a 100% success rate in solving the Mastermind game, while OpenAI's model had a 71% success rate.
The total compute time for Genius was just over 5 minutes for 100 games, compared to 12.5 hours for OpenAI's model, making it 140 times faster.
Genius was significantly more cost-effective, with an estimated compute cost of $0.05 USD for 100 games, compared to $263 USD for OpenAI's model.
