Large language models validate misinformation, research finds

Researchers at the University of Waterloo have found that large language models, such as GPT-3, often repeat misinformation, including conspiracy theories and harmful stereotypes. The study tested GPT-3's understanding of six categories: facts, conspiracies, controversies, misconceptions, stereotypes, and fiction. The researchers found that GPT-3 frequently made mistakes, contradicted itself, and repeated harmful misinformation. The study also revealed that even slight changes in the wording of a statement could significantly alter GPT-3's response.

The study's findings raise concerns about the trustworthiness of large language models, as they are continuously learning and may be learning misinformation. The researchers warn that these models' inability to distinguish truth from fiction could pose a significant challenge to trust in these systems. The study, titled "Reliability Check: An Analysis of GPT-3’s Response to Sensitive Topics and Prompt Wording," was published in the Proceedings of the 3rd Workshop on Trustworthy Natural Language Processing.

Key takeaways:

Researchers at the University of Waterloo found that large language models like GPT-3 often repeat conspiracy theories, harmful stereotypes, and other forms of misinformation.
The study revealed that GPT-3 frequently made mistakes, contradicted itself, and agreed with incorrect statements between 4.8% and 26% of the time, depending on the statement category.
Even slight changes in wording could significantly alter the model's response, making it unpredictable and potentially dangerous as these models become more ubiquitous.
The inability of large language models to distinguish truth from fiction raises serious questions about trust in these systems, according to the researchers.

Large language models validate misinformation, research finds | Waterloo News

Key takeaways:

Comments (0)

Newsletter