The models were tested using books under copyright protection. GPT-4 completed book texts 60% of the time and generated the first passage 26% of the time. When prompted, Mixtral and Llama 2 generated the first passage of books 38% and 10% of the time, respectively. Patronus AI stressed the importance of catching these mistakes to avoid legal action and risks to a company’s reputation.
Key takeaways:
- Research by Patronus AI found that some of the top AI models generate copyrighted content at an alarmingly high rate.
- The AI models evaluated were OpenAI’s GPT-4, Anthropic’s Claude 2.1, Mistral’s Mixtral, and Meta’s Llama 2. GPT-4 generated copyrighted content most often, on 44% of prompts.
- Patronus AI tested the models using books under copyright protection and noted that some generations may be covered by fair use under U.S. law.
- Patronus AI emphasized the importance of catching these copyright violations to avoid legal action and risks to a company’s reputation.