The distinction is significant because it shapes how we understand and address the inaccuracies these models produce. If inaccuracies are seen as hallucinations, the implication is that the AI is trying and failing to convey truthful information. The scholars argue, however, that AI models have no beliefs, intentions, or understanding: their inaccuracies are not the result of misperception or hallucination, but arise because the models are designed to produce text that looks and sounds right, with no intrinsic mechanism for ensuring factual accuracy.
Key takeaways:
- Large language models (LLMs) like OpenAI’s ChatGPT, despite their impressive capabilities, are known for generating persistent inaccuracies, often referred to as “AI hallucinations.” Scholars argue that these inaccuracies are better understood as “bullshit.”
- The term “AI hallucination” is misleading because it implies that the AI has a perspective or an intent to perceive and convey truth, which it does not. The output of LLMs fits the definition of bullshit (text produced with indifference to its truth) better than the concept of hallucination, since these models generate text based on patterns in their training data without any intrinsic concern for accuracy.
- Calling AI inaccuracies “hallucinations” can fuel overblown hype about these systems’ abilities and suggests remedies for the inaccuracy problem that are unlikely to work. It can also misdirect AI alignment efforts among specialists.
- OpenAI has stated that improving ChatGPT’s factual accuracy is a key goal and reports progress in this area, with GPT-4 being 40% more likely to produce factual content than GPT-3.5.