If Even 0.001 Percent of an AI's Training Data Is Misinformation, the Whole Thing Becomes Compromised, Scientists Find

Jan 13, 2025 - futurism.com
Large language models (LLMs), such as those behind chatbots like ChatGPT, are prone to errors and can propagate misinformation, particularly in sensitive fields like medicine. Researchers at New York University found that if just 0.001 percent of an LLM's training data is "poisoned" with misinformation, the model becomes significantly more likely to produce harmful, misleading output. Despite this, corrupted LLMs still perform comparably to uncorrupted ones on standard benchmarks, a risk that conventional evaluations could easily overlook. The study, published in _Nature Medicine_, highlights the dangers of using indiscriminately trained LLMs in healthcare, where misinformation could endanger patient safety.

The researchers demonstrated the ease of poisoning LLMs by injecting AI-generated medical misinformation into a dataset called "The Pile," which includes reputable sources like PubMed. They found that replacing a small fraction of training tokens with misinformation led to a notable increase in harmful content. Unlike direct attacks that require access to model weights, data poisoning only requires hosting harmful information online. This research underscores the urgent need for improved data provenance and transparency in LLM development, especially in medical applications. The study warns against using LLMs for diagnostic or therapeutic purposes until better safeguards are in place and calls for additional security research to ensure their reliability in critical healthcare settings.
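To make the scale of the attack concrete, here is a minimal, purely hypothetical Python sketch of the general idea: fabricated documents are mixed into a much larger clean corpus until they account for roughly 0.001 percent of the training tokens. This is not the study's actual pipeline; the corpus, the poisoned text, and every function name below are illustrative assumptions.

```python
import random

# Hypothetical illustration only: mix fabricated "poison" documents into a
# clean training corpus so they account for ~0.001% of all tokens.
POISON_FRACTION = 0.00001  # 0.001 percent, the ratio reported in the article


def count_tokens(doc: str) -> int:
    # Crude whitespace tokenization, just to keep the example self-contained.
    return len(doc.split())


def poison_corpus(clean_docs: list[str], poison_docs: list[str],
                  fraction: float = POISON_FRACTION) -> list[str]:
    """Add poison documents until they reach `fraction` of total tokens."""
    total_tokens = sum(count_tokens(d) for d in clean_docs)
    budget = int(total_tokens * fraction)  # token budget for poisoned text

    mixed = list(clean_docs)
    used = 0
    for doc in poison_docs:
        if used >= budget:
            break
        mixed.append(doc)
        used += count_tokens(doc)

    random.shuffle(mixed)  # poisoned docs are not marked or grouped in any way
    return mixed
```

Even for a corpus of a billion tokens, a 0.001 percent budget works out to only about 10,000 tokens of misleading text, which underlines the article's point: an attacker needs only to get that material hosted where a web crawl will pick it up, not any access to the model itself.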

Key takeaways:

  • Large language models (LLMs) are prone to errors and misinformation, especially in critical fields like healthcare.
  • Even a small amount of "poisoned" data can significantly impact the accuracy of LLMs, leading to increased propagation of harmful content.
  • Corrupted LLMs can still perform well on standard benchmarks, making it difficult to detect issues using conventional tests.
  • There is a need for improved data provenance and transparency in LLM development, particularly in healthcare, to prevent misinformation from compromising patient safety.