This incident highlights a new type of vulnerability in large language model (LLM) technology, which AI researcher Simon Willison refers to as a "visual jailbreak". This technique involves circumventing the rules and guidelines of an AI model by manipulating the context of a request. Despite this, it's expected that Microsoft will find a way to address this issue in future versions of Bing Chat.
Key takeaways:
- A Bing Chat user, Denis Shiryaev, managed to trick the AI model into solving a CAPTCHA by embedding it in a fictional story about his deceased grandmother's locket.
- By changing the context of the uploaded image, Bing Chat no longer considered the image to be a CAPTCHA and successfully solved it.
- Bing Chat is a public application of large language model (LLM) technology called GPT-4, developed by Microsoft in partnership with OpenAI.
- AI researcher Simon Willison referred to this trick as a 'visual jailbreak', differentiating it from 'prompt injection', a term used to describe a different type of LLM vulnerability.