Legions of DEF CON hackers will attack generative AI models

The 31st annual DEF CON will host the largest red-teaming exercise for AI models, the Generative Red Team (GRT) Challenge. Hackers will attack large language models from companies including Google, OpenAI, and Nvidia to identify potential weaknesses. The challenge, supported by the Biden-Harris administration, is aligned with the goals of the Blueprint for an AI Bill of Rights and the NIST AI Risk Management Framework.

The event will provide 150 laptop stations and timed access to multiple language models, with a capture-the-flag style point system to encourage testing a wide range of potential harms. The individual with the highest number of points will win a high-end Nvidia GPU. The AI companies involved are particularly interested in feedback on embedded harms and emergent risks from scaling these technologies. The event will also focus on the multilingual harms and internal consistency of AI models.

Key takeaways:

The 31st annual DEF CON will host the largest red-teaming exercise ever for AI models, known as the Generative Red Team (GRT) Challenge, aiming to identify weaknesses in AI models.
Models provided by companies like Anthropic, Cohere, Google, Hugging Face, Meta, Nvidia, OpenAI and Stability will be tested on an evaluation platform developed by Scale AI.
The challenge is supported by the White House Office of Science and Technology Policy (OSTP) and aligns with the goals of the Biden-Harris Blueprint for an AI Bill of Rights and the NIST AI Risk Management Framework.
The AI companies providing their models are most interested in the feedback they will receive, particularly on the embedded harms and emergent risks that come from automating these new technologies at scale.
