In response to these lawsuits, Meta has admitted to using portions of the Books3 dataset to train its Llama AI model, but denies allegations of copyright infringement. The tech giant argues that consent or compensation is not necessarily required for the use of copyrighted works to train AI, and that any unauthorized copying of copyrighted works qualifies as fair use. The fair use defense is expected to be a key part of these and other AI lawsuits, which are still in their early stages and could eventually reach the Supreme Court.
Key takeaways:
- Several rightsholders, including record labels, authors, and The New York Times, have filed lawsuits against companies that develop AI models, alleging that their work was used without proper compensation.
- The lawsuits often involve the Books3 dataset, created by AI researcher Shawn Presser, which was scraped from the library of the 'pirate' site Bibliotik and which tech companies like Meta and OpenAI used to train AI models.
- Meta has admitted to using portions of the Books3 dataset to train its Llama AI model, but denies allegations of copyright infringement, suggesting that its use of copyrighted works did not require consent, credit, or compensation.
- Meta plans to rely on a fair use defense, arguing that any unauthorized copying of copyrighted works qualifies as fair use under U.S. law. This defense is expected to be a key part of this and other AI lawsuits.