Meanwhile, Meta, owner of Facebook and Instagram, considered purchasing the publishing house Simon & Schuster to obtain long works and discussed gathering copyrighted data from across the internet, even if it risked lawsuits. The company's managers, lawyers, and engineers argued that negotiating licenses with publishers, artists, musicians, and the news industry would be too time-consuming. The actions of these tech companies highlight the desperate race for digital data to advance AI technology.
Key takeaways:
- OpenAI, Google, and Meta have reportedly ignored corporate policies and discussed skirting copyright law in their quest for online information to train their artificial intelligence systems.
- OpenAI developed a speech recognition tool called Whisper to transcribe the audio of YouTube videos, yielding conversational text to train its AI, potentially in violation of YouTube's rules.
- OpenAI's team transcribed over a million hours of YouTube videos, and the resulting text was used to train GPT-4, one of the world's most powerful AI models.
- At Meta, managers, lawyers, and engineers discussed buying Simon & Schuster to procure long works and considered gathering copyrighted data from across the internet, even at the risk of facing lawsuits.