The lawsuit, filed by author Richard Kadrey, comedian Sarah Silverman, and others, accuses Meta of violating intellectual property laws by using illegally obtained content to train its AI models. Meta has argued that using copyrighted material as training data should be considered fair use. The case highlights the broader issue of data scarcity in AI development, with companies like Meta and OpenAI exploring unconventional methods to acquire unique data. Although the lawsuit was partially dismissed, the evidence from Meta's internal communications could bolster the plaintiffs' case as it progresses in court.
Key takeaways:
- Meta is facing a major copyright lawsuit for allegedly using pirated data to train its Llama AI models and attempting to conceal that use.
- Internal communications suggest Meta considered using data from the book piracy site Library Genesis (LibGen) to achieve state-of-the-art performance in its AI models.
- Meta's internal documents reveal efforts to obscure copyright information in training data to avoid legal complications.
- The lawsuit evidence could strengthen the case against Meta as it progresses in court, despite a partial dismissal last year.