New Data Shows AI Companies Love 'Premium Publisher' Content

Large language models (LLMs) like ChatGPT, Google Gemini, and Meta AI rely heavily on content from premium publishers for training, even though the companies behind them downplay their use of such copyrighted content, according to research from Ziff Davis. The research suggests that AI companies intentionally filter out low-quality content in favor of high-quality, human-made content to train their models, using a website's domain authority or its ranking in Google search to make that distinction. This practice has led to disputes with publishers, who argue that AI companies are pirating their copyrighted work without permission or compensation.

The issue of transparency also arises, as companies behind popular AI chatbots have been secretive about their information sources, leading to concerns about reliability and potential bias. Some media companies have sued AI developers for copyright infringement. Despite this, tech giants like Google and Meta have seen tremendous valuations amid the AI revolution, while news publishers struggle in a highly competitive online media environment. Some AI companies have signed licensing deals with publishers to use their content, but many have not disclosed the training data used for their most recent models.

Key takeaways

Large language models (LLMs) like ChatGPT, Google Gemini, and Meta AI rely heavily on content from premium publishers for training, despite downplaying their use of such copyrighted content, according to research from Ziff Davis.
AI companies have been accused of pirating copyrighted work without permission or compensation, leading to lawsuits from media companies for copyright infringement.
Big Tech companies have seen tremendous valuations amid the AI revolution, with Google valued at about $2.2 trillion, and Meta at about $1.5 trillion, largely due to their work with generative AI.
Some AI companies have signed licensing deals with publishers to feed their LLMs with up-to-date news articles, including OpenAI's deals with the Financial Times, Dotdash Meredith, Vox, and others.
