The researchers' work is significant given the increasing reliance on LLMs for various tasks, from college essays to job applications. Their method, which works across popular models and a broad range of subjects, could help flag when LLMs are likely to provide false information. The researchers also found that most of the "alternative facts" provided by LLMs are a product of confabulation.
Key takeaways:
- Large Language Models (LLMs) often give false answers with confidence for various reasons: they may have been trained on misinformation, be unable to extrapolate from the facts they do have, or be incentivized to provide a falsehood.
- Researchers from the University of Oxford have found a way to determine when LLMs appear to be confabulating, or making things up, a habit that is common across all popular models and a broad range of subjects.
- LLMs are not trained for accuracy but to produce human-sounding phrasing based on the massive quantities of text they are trained on. If the training examples are few or inconsistent, LLMs synthesize a plausible-sounding but likely incorrect answer.
- The researchers focus on semantic entropy, which examines all the statistically likely answers the LLM considers and determines how many of them are semantically equivalent. If a large number share the same meaning, the LLM is probably just uncertain about phrasing but has the right answer. If not, it is presumably prone to confabulation (see the sketch after this list).
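To make the idea concrete, here is a minimal sketch, not the researchers' implementation, of how such a score could be computed: sample several answers to the same prompt, group the ones that mean the same thing, and take the entropy of the resulting distribution over meanings. The `are_equivalent` judge and the toy string-matching stand-in are illustrative assumptions; in practice, equivalence would be judged by a more capable model.

```python
import math

def semantic_entropy(answers, are_equivalent):
    """Estimate semantic entropy from sampled LLM answers.

    answers: list of answer strings sampled from the model.
    are_equivalent: function(a, b) -> bool judging whether two
        answers mean the same thing (e.g. via an entailment model).
    """
    # Cluster answers into groups that express the same meaning.
    clusters = []  # each cluster is a list of answers with one meaning
    for ans in answers:
        for cluster in clusters:
            if are_equivalent(ans, cluster[0]):
                cluster.append(ans)
                break
        else:
            clusters.append([ans])

    # Probability of each meaning = fraction of samples in its cluster.
    n = len(answers)
    probs = [len(c) / n for c in clusters]

    # Shannon entropy over meanings: low when most samples agree
    # (the model is confident about the answer), high when meanings
    # are scattered (a sign the model may be confabulating).
    return -sum(p * math.log(p) for p in probs)


# Toy usage: exact (case-insensitive) string match stands in for
# semantic equivalence, purely for illustration.
if __name__ == "__main__":
    samples = ["Paris", "Paris", "paris", "Lyon", "Paris"]
    same = lambda a, b: a.lower() == b.lower()
    print(f"semantic entropy: {semantic_entropy(samples, same):.3f}")
```

A low score here means the sampled answers collapse into one or two meanings, suggesting the model knows the answer even if it phrases it differently each time; a high score means the meanings themselves disagree, which is the pattern the researchers associate with confabulation.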