Feature Story
A new, challenging AGI test stumps most AI models | TechCrunch
Mar 25, 2025 · techcrunch.com
The introduction of ARC-AGI-2 comes amid calls for new benchmarks to measure AI progress, particularly in traits like creativity. The Arc Prize Foundation has also launched a contest challenging developers to reach 85% accuracy on ARC-AGI-2 while spending no more than $0.42 per task. The initiative highlights the ongoing need for measures of AI capability that go beyond raw problem-solving to account for the efficiency and cost of acquiring new skills.
Key takeaways
- The Arc Prize Foundation has introduced ARC-AGI-2, a new test of AI models' general intelligence that has proven challenging for most models.
- ARC-AGI-2 focuses on efficiency and the ability to interpret patterns on the fly, addressing flaws in the previous ARC-AGI-1 test.
- Human participants averaged 60% accuracy on ARC-AGI-2, significantly outperforming AI models, which scored between 1% and 4%.
- The Arc Prize Foundation announced a new contest challenging developers to reach 85% accuracy on ARC-AGI-2 while spending no more than $0.42 per task.