Although OpenAI managed to recover most of the data, the folder structure and file names were irretrievably lost, making it difficult for the attorneys to determine where the copied articles were used to build OpenAI’s models. The attorneys are now asking the judge to make OpenAI shoulder the extra work created by the error, arguing that OpenAI is best positioned to search its own datasets. The attorneys say they do not believe the erasure was intentional. OpenAI, for its part, disagrees with the characterizations made and plans to file a response soon.
Key takeaways:
- OpenAI engineers accidentally deleted a large amount of evidence being used in a copyright lawsuit brought by the New York Times and the New York Daily News.
- The company was able to recover most of the data, but the folder structure and file names were lost, making it difficult for the newspapers' attorneys to determine where their articles were used in OpenAI's models.
- The attorneys are asking the judge to make OpenAI do the work to search its own datasets, as they believe the company is best positioned to do so.
- OpenAI disputes this account: a spokesperson said the company disagrees with the characterizations made and will file a response soon.