Internal ByteDance documents reveal that the company has relied on the OpenAI API for nearly every phase of Project Seed's development, including training and evaluating the model. Employees involved are aware of the implications and have discussed ways to "whitewash" the evidence. The misuse is so widespread that Project Seed employees regularly exceed their maximum allowance for API access.
Key takeaways:
- ByteDance, the parent company of TikTok, has been secretly using OpenAI’s technology to develop its own large language model, codenamed Project Seed, in violation of OpenAI’s terms of service.
- ByteDance is buying its OpenAI access through Microsoft, which also has a policy against using output from its models to develop competing AI models.
- Internal ByteDance documents confirm that the OpenAI API has been relied on during nearly every phase of Project Seed's development, including for training and evaluating the model.
- ByteDance employees have discussed how to "whitewash" the evidence of this misuse through "data desensitization." The misuse is so rampant that Project Seed employees regularly hit their maximum allowance for API access.