A Stanford Internet Observatory (SIO) investigation identified hundreds of known images of child sexual abuse material (CSAM) in an open dataset used to train popular AI text-to-image generation models, such as Stable Diffusion.
A previous SIO report, produced with the nonprofit online child safety group Thorn, found that rapid advances in generative machine learning make it possible to create realistic imagery that facilitates child sexual exploitation using open-source AI image generation models. Our new investigation reveals that these models are trained directly on CSAM present in LAION-5B, a public dataset of billions of images. The dataset included known CSAM scraped from a wide array of sources, including mainstream social media websites and popular adult video sites.
Removal of the identified source material is in progress: researchers reported the image URLs to the National Center for Missing and Exploited Children (NCMEC) in the U.S. and the Canadian Centre for Child Protection (C3P). The study was conducted primarily with hashing tools such as PhotoDNA, which match a perceptual fingerprint of an image against databases maintained by nonprofits that receive and process reports of online child sexual exploitation and abuse. Researchers did not view the abuse content; matches were reported to NCMEC and confirmed by C3P where possible.
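PhotoDNA itself is proprietary, but the underlying idea of matching an image's fingerprint against a database of known hashes can be sketched. The toy "average hash" below is an illustration only, not PhotoDNA's actual algorithm; the pixel grids, hash size, and distance threshold are all assumptions made for the example.

```python
# Toy perceptual-hash matching (NOT PhotoDNA's algorithm): each pixel of a
# fixed-size grayscale thumbnail becomes 1 if it is brighter than the image's
# mean brightness, 0 otherwise. Near-duplicate images (re-encodes, resizes)
# tend to produce hashes that differ in only a few bits.

def average_hash(pixels):
    """pixels: flat list of grayscale values (0-255) from a fixed-size thumbnail."""
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming_distance(h1, h2):
    """Number of bits in which two hashes differ."""
    return bin(h1 ^ h2).count("1")

def matches_blocklist(h, blocklist, threshold=4):
    """threshold is an arbitrary choice for this sketch; real systems tune it."""
    return any(hamming_distance(h, known) <= threshold for known in blocklist)

# A slightly re-encoded copy of an image still matches the original's hash.
original = average_hash([10, 200, 30, 220, 15, 210, 25, 205, 12])
near_copy = average_hash([12, 198, 32, 221, 14, 212, 24, 203, 11])
```

The key property is that matching is done hash-to-hash, which is why researchers could identify known material without ever viewing it.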
There are methods to minimize the presence of CSAM in datasets used to train AI models, but cleaning open datasets, or halting their distribution, is difficult when no central authority hosts the actual data. The report outlines safety recommendations for collecting datasets, training models, and hosting models trained on scraped datasets. Images collected in future datasets should be checked against known lists of CSAM using detection tools such as Microsoft's PhotoDNA, or through partnerships with child safety organizations such as NCMEC and C3P.
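The recommended pre-filtering step can be sketched as a simple pipeline stage: drop (and report) any entry whose fingerprint appears on a vetted hash list before the image ever enters the dataset. Everything below is a hypothetical placeholder; a real deployment would use an industry tool such as PhotoDNA, with hash lists supplied by organizations like NCMEC or C3P, rather than the raw SHA-256 used here for illustration.

```python
# Sketch of screening a scraped dataset against a vetted hash list before
# training. SHA-256 stands in for a real perceptual-hash service, and the
# blocklist contents are made up for the example.
import hashlib

def fingerprint(image_bytes: bytes) -> str:
    # Placeholder: a real pipeline would compute a perceptual hash so that
    # re-encoded or resized copies still match.
    return hashlib.sha256(image_bytes).hexdigest()

def filter_dataset(entries, known_hashes):
    """entries: iterable of (url, image_bytes) pairs.
    Yields only entries that do not match the blocklist; flagged URLs
    would be reported through the appropriate channel, not merely dropped."""
    for url, image_bytes in entries:
        if fingerprint(image_bytes) in known_hashes:
            continue  # flag and report, do not include in the dataset
        yield url, image_bytes

# Usage with made-up data:
blocklist = {fingerprint(b"known-bad-example")}
scraped = [
    ("https://example.com/a", b"benign-image-bytes"),
    ("https://example.com/b", b"known-bad-example"),
]
clean = list(filter_dataset(scraped, blocklist))
```

Screening at collection time is cheaper than retroactive cleanup precisely because of the distribution problem the report describes: once a dataset of URLs is mirrored, no single party can purge it.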