The main reason for the blocking is that OpenAI's ChatGPT does not cite references or sources for the information it provides, potentially denying these websites the attribution they require for their original content. As a result, GPTBot is being blocked at a higher rate than other web scrapers such as CCBot and Anthropic AI.
Key takeaways:
- Over a quarter of the top 100 websites have blocked OpenAI's web crawler, GPTBot, from scraping their data.
- 26 of the top 100 websites and 242 of the top 1,000 websites have barred GPTBot entirely.
- The number of websites blocking GPTBot has increased by 250% in just a month, and the bot is blocked at a higher rate than other scrapers such as CCBot and Anthropic AI.
- Major brands like Pinterest, The Guardian, USA Today, the Washington Post, CBS News, WebMD and Dictionary.com are among those that have blocked GPTBot, largely because ChatGPT does not cite references or sources for the information it provides.