Gruber is further concerned about the "opt-out" model of AI training, in which the decision to disallow AI training web crawlers lies with someone other than the content owner. He highlights the problem of people posting content they do not own on platforms they also do not own, such as social media, where they may lack the knowledge or the power to disallow AI training. He concludes by expressing frustration at having to constantly block these bots on the servers he controls.
Key takeaways:
- Using public data for AI training should not be opt-out, as that infringes on content ownership and copyright.
- Just because content is published on the web, it doesn't mean it's free to use for AI training.
- People often post content they don't own, and those posters shouldn't be the ones who decide whether that content can be used for AI training.
- Gruber struggles with the lack of control over whether his content, once reposted by others, is used for AI training, especially on platforms he doesn't own, such as social media.