News outlets are accusing Perplexity of plagiarism and unethical web scraping

Perplexity AI, a startup that uses AI to generate detailed responses to queries, has been accused of unethical practices, including plagiarism and illicit web scraping. The company, which uses open or commercially available AI models to turn information from the internet into answers, was called out by Forbes for allegedly plagiarizing one of its articles, and by Wired for allegedly scraping its website. Perplexity, backed by Nvidia and Jeff Bezos, maintains that it has done nothing wrong and that it is operating within the bounds of fair use under copyright law. The situation highlights the thin line between fair use and plagiarism, and between routine web scraping and unethical summarization, in the age of generative AI.

The accusations against Perplexity involve two key concepts: the Robots Exclusion Protocol, which websites use to tell web crawlers which content they do not want accessed, and fair use in copyright law, which allows the use of copyrighted material without permission in certain circumstances. Perplexity argues that summarizing a URL isn't the same as crawling, and that it is just responding to a user's request to go to that URL. Critics counter that this is a distinction without a difference: visiting a URL and pulling information from it to summarize the text looks like scraping if it is done thousands of times a day. The startup is also accused of plagiarizing articles, but it argues that providing a summary of an article is within the bounds of fair use.
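To make the Robots Exclusion Protocol concrete, here is a minimal Python sketch of how a crawler that chooses to honor the protocol might check a site's rules before fetching a page. It uses the standard library's `urllib.robotparser`. The robots.txt content, the `ExampleBot` and `OtherBot` user-agent names, the `example.com` URLs, and the `is_allowed` helper are all hypothetical and are not taken from Perplexity, Forbes, or Wired.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
# It blocks one named crawler from the whole site and
# blocks every other crawler from /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleBot", "OtherBot"):
        for url in ("https://example.com/story", "https://example.com/private/x"):
            print(agent, url, is_allowed(agent, url))
```

The protocol is advisory: nothing in it technically stops a client from fetching a disallowed page, so the check only matters if the crawler runs it and respects the result. That is why the dispute turns on whether a fetch made on behalf of a user counts as crawling at all.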

Key takeaways:

Perplexity AI, a startup that combines a search engine with a large language model, has been accused of unethical practices including plagiarism and illicit web scraping.
Forbes and Wired have accused Perplexity of plagiarizing their articles, while Wired also accused the company of ignoring the Robots Exclusion Protocol to scrape website content.
Perplexity maintains that it is operating within the bounds of fair use under copyright law and has not done anything wrong. The company is also working on advertising revenue-sharing deals with publishers.
The situation highlights the complexities and nuances of fair use and the Robots Exclusion Protocol in the age of AI, with potential implications for the future of content creation and monetization.

News outlets are accusing Perplexity of plagiarism and unethical web scraping | TechCrunch
