26% of Top 100 Websites Have Now Blocked GPTBot

OpenAI's web crawler, GPTBot, is being blocked by a growing number of top websites, limiting the data it can scrape. Specifically, 26 of the top 100 websites and 242 of the top 1,000 websites have barred GPTBot, a sharp rise from a month earlier, when only 69 of the top 1,000 had done so. That is an increase of roughly 250%, and the list includes major brands such as Pinterest, The Guardian, USA Today, the Washington Post, CBS News, WebMD, and Dictionary.com.
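The 250% figure can be checked from the two counts quoted above. The following is a minimal sketch; the function and variable names are illustrative and not taken from the underlying study.

```python
def percent_increase(before: int, after: int) -> float:
    """Relative growth from `before` to `after`, expressed as a percentage."""
    return (after - before) / before * 100


# Counts quoted in the article for the top 1,000 websites.
blocked_last_month = 69
blocked_now = 242

increase = percent_increase(blocked_last_month, blocked_now)
share_of_top_1000 = blocked_now / 1000 * 100

print(f"Increase in blocking sites: {increase:.1f}%")
print(f"Share of top 1,000 now blocking GPTBot: {share_of_top_1000:.1f}%")
```

The result lands slightly above 250%, which is why the headline number is best read as a rounded figure.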

The main reason given for the blocking is that OpenAI's ChatGPT does not provide references or sources for its answers, which can deny these websites the attribution they expect for their original content. As a result, GPTBot is blocked at a higher rate than other web scrapers such as CCBot and Anthropic's crawler.
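The article does not say how these sites block the crawler. In practice, the usual mechanism is a robots.txt rule addressed to the GPTBot user agent. The sketch below assumes that mechanism and uses Python's standard robots.txt parser to show how such a rule affects one crawler but not others. The robots.txt content, the second bot name, and the URL are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that bars only GPTBot from the whole site.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/articles/some-page"  # placeholder URL

# GPTBot matches the rule above; a crawler with no matching rule is allowed.
gptbot_allowed = parser.can_fetch("GPTBot", url)
other_allowed = parser.can_fetch("SomeOtherBot", url)

print("GPTBot allowed:", gptbot_allowed)
print("SomeOtherBot allowed:", other_allowed)
```

A rule like this only works for crawlers that choose to honor robots.txt, so it is a request rather than an enforced block.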

Key takeaways

Over a quarter of the top 100 websites have blocked OpenAI's web crawler, GPTBot, from scraping their data.
26 of the top 100 websites and 242 of the top 1,000 websites have barred GPTBot entirely.
The number of websites blocking GPTBot has risen by roughly 250% in just a month, and the bot is blocked at a higher rate than other scrapers such as CCBot and Anthropic's crawler.
Major brands such as Pinterest, The Guardian, USA Today, the Washington Post, CBS News, WebMD, and Dictionary.com are among those that have blocked GPTBot, largely because ChatGPT does not provide references or sources for the information it returns.
