Feature Story
Large AI Dataset Has Over 1,000 Child Abuse Images, Researchers Find
Dec 20, 2023 · bloomberg.com
The report warns that the presence of such material in the dataset could lead AI products built on this data, including image generation tools like Stable Diffusion, to create new and potentially realistic child abuse content.
Key takeaways
- The public dataset LAION-5B, used to build AI image generators, contains at least 1,008 instances of child sexual abuse material, according to a report from the Stanford Internet Observatory.
- LAION-5B includes more than 5 billion images and related captions from the internet.
- The dataset may also contain thousands of additional pieces of suspected child sexual abuse material.
- The inclusion of such material in the dataset could enable AI products, like image generation tools, to create new and realistic child abuse content.