OpenAI blamed NYT for tech problem erasing evidence of copyright abuse

OpenAI, an AI company, has been accused by The New York Times (NYT) of unintentionally deleting data that could serve as evidence of copyright abuse. The newspaper alleges that OpenAI trained ChatGPT on authors' works in violation of copyright law. The NYT spent over 150 hours extracting training data, but some of it was erased due to what OpenAI called a "glitch". OpenAI denied deleting any evidence and blamed the NYT for the technical problem that led to the data deletion.

This is not the first time OpenAI has been accused of deleting data in a copyright case. In May, authors including Sarah Silverman and Paul Tremblay told a US district court that OpenAI admitted to deleting controversial AI training data sets. OpenAI's defense hinges on the argument that copying authors' works to train AI is a transformative fair use that benefits the public. However, the judge in the NYT case rejected a key part of that fair use defense last week.

Key takeaways:

The New York Times (NYT) has accused OpenAI of unintentionally erasing data that could be used as evidence of copyright abuse, a claim that OpenAI denies, blaming the NYT for the technical problem that triggered the data deletion.
Book authors, including Sarah Silverman and Paul Tremblay, have previously accused OpenAI of deleting data in a copyright case, alleging that OpenAI admitted to deleting controversial AI training data sets.
OpenAI's defense in the NYT case primarily hinges on courts agreeing that copying authors' works to train AI is a transformative fair use that benefits the public, but Judge Ona Wang rejected a key part of that fair use defense.
If OpenAI loses the case and the use of copied content is not seen as transformative, the outcome could have implications for the book authors' suit and other litigation, which could drag into 2026.
