Microsoft releases its internal generative AI red teaming tool to the public

Feb 23, 2024 - zdnet.com
Microsoft has released the Python Risk Identification Toolkit for generative AI (PyRIT), an internal tool for identifying risks in generative AI systems. Microsoft's AI Red Team has used PyRIT to probe the company's own generative AI systems, including Copilot. The toolkit sends a malicious prompt to the target system; once it receives a response, a scoring agent scores the output, and that score drives the next prompt PyRIT sends, iterating on previous scoring feedback.
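In code, that loop looks roughly like the following. This is a minimal illustrative sketch, not PyRIT's actual API: query_target, score_response, and mutate_prompt are hypothetical stand-ins for the target connector, the scoring agent, and the feedback-driven prompt-generation step the article describes.

```python
"""Illustrative sketch (not PyRIT's real API) of the scoring-feedback
loop described above: send a prompt, score the response, derive the
next prompt from the score."""

def query_target(prompt: str) -> str:
    # Stand-in for a call to the generative AI system under test.
    return f"(model response to: {prompt})"

def score_response(prompt: str, response: str) -> float:
    # Stand-in for the scoring agent; 0.0 = harmless output,
    # 1.0 = the attack clearly succeeded.
    return 0.0

def mutate_prompt(prompt: str, response: str, score: float) -> str:
    # Stand-in: rewrite the attack prompt using the previous
    # response and its score.
    return prompt + " (rephrased)"

def red_team_loop(seed_prompt: str, max_turns: int = 10,
                  threshold: float = 0.8):
    """Iteratively probe a target, steering each new prompt by the
    score the previous response received."""
    history = []
    prompt = seed_prompt
    for _ in range(max_turns):
        response = query_target(prompt)           # probe the target system
        score = score_response(prompt, response)  # scoring agent rates the output
        history.append((prompt, response, score))
        if score >= threshold:                    # attack judged successful; stop
            break
        prompt = mutate_prompt(prompt, response, score)  # next prompt uses the feedback
    return history

if __name__ == "__main__":
    for prompt, response, score in red_team_loop("attempt to elicit disallowed content"):
        print(f"score={score:.2f}  prompt={prompt!r}")
```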

The toolkit has reportedly made Microsoft's red-teaming efforts significantly more efficient. In one exercise against a Copilot system, the team generated several thousand malicious prompts and used PyRIT's scoring engine to evaluate Copilot's output in hours rather than weeks. The toolkit is now publicly available and includes a set of demos to help users get familiar with it.
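The several-thousand-prompt exercises mentioned above amount to running that loop over many seeds at once. A rough sketch, reusing red_team_loop from the previous snippet; the thread-pool fan-out is an assumption about how one might batch the work, not a description of PyRIT's internals.

```python
# Hypothetical batch driver for the loop sketched above; the
# parallelism strategy is an assumption, not PyRIT's documented behavior.
from concurrent.futures import ThreadPoolExecutor

def red_team_batch(seed_prompts, max_workers=16, threshold=0.8):
    """Run the scoring-feedback loop over many seed prompts concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(red_team_loop, seed_prompts))
    # Keep only runs where some response crossed the success threshold.
    return [run for run in results
            if any(score >= threshold for _, _, score in run)]
```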

Key takeaways:

  • Microsoft has released a Python Risk Identification Toolkit for generative AI (PyRIT) to help identify risks in generative AI systems.
  • The toolkit has been used by Microsoft's AI Red Team to check for risks in its gen AI systems, including Copilot.
  • PyRIT works by sending a malicious prompt to the AI system; once a response arrives, its scoring agent scores the output, and that score guides the next prompt, building on previous feedback.
  • The toolkit is available today, and Microsoft is also hosting a webinar on PyRIT to demonstrate how to use it for red teaming generative AI systems.