Several large US AI providers have already used AILuminate to test their models, and MLCommons has also run it against some open-source ones. The benchmark is not designed to measure a model's potential to become deceptive or difficult to control. MLCommons, which counts around 125 member organizations including OpenAI, Google, and Meta, believes its approach is more expansive and better able to keep pace with the latest developments in AI than slower-moving government bodies.
Key takeaways:
- Nonprofit MLCommons has launched a new benchmark, AILuminate, to assess potentially harmful responses from large language models in areas such as hate speech, promotion of self-harm, and intellectual property infringement.
- The benchmark rates models on a scale from "poor" to "excellent". The prompts used for testing are kept secret to prevent them from ending up in training data.
- Several large US AI providers have already used AILuminate to test their models, with models from Google, Microsoft, and Anthropic scoring "very good", while OpenAI’s GPT-4o and Meta’s Llama model both scored "good".
- MLCommons aims to provide a broader perspective on AI safety, comparing practices in the US, China, and elsewhere. It has partnered with AI Verify, a Singapore-based AI safety organization, to develop standards with input from scientists, researchers, and companies in Asia.