The lawsuit, which began in December last year, is still in the discovery phase. OpenAI had to provide its training data to the NYT but hasn't publicly revealed the exact information used to build its AI models. The NYT's legal team had spent over 150 hours researching the data on one of the virtual machines provided by OpenAI when the data was deleted. OpenAI acknowledged the deletion and attributed it to a "glitch". The data was later restored, but without the NYT's work, forcing the newspaper to start its research from scratch. The NYT's lawyers stated they don't believe the deletion was intentional.
Key takeaways:
- OpenAI's engineers accidentally erased evidence related to the AI's training data in the New York Times' copyright lawsuit, according to a court declaration filed by the newspaper.
- Although some of the data was recovered, the original file names and folder structure, which would show when the newspaper's articles were copied into OpenAI's models, are still missing.
- OpenAI spokesperson Jason Deutrom disagreed with the New York Times' claims and stated that the company will file a response soon.
- The New York Times' legal team had to recreate its work from scratch after more than 150 hours of research was deleted due to a "glitch" acknowledged by OpenAI.