During the brief window before it was shut down in November 2022, I played around with the Galactica large language model (LLM) demo from Meta. I noticed a disclaimer on the prompt page: “Language Models are prone to hallucinate text.” This was the first time I had encountered the notion of ML hallucination. Of course, machines don’t hallucinate. This is just a metaphor. Yet the idea struck me as interesting in a Philip K. Dick kind of way. I saved an image of the disclaimer for posterity and put the idea aside.
Fast forward to December 2022. While mucking around with ChatGPT, it occurred to me that questioning an LLM about hallucinations would be interesting. What follows is a transcript of a discussion between ChatGPT and me about ML hallucination. My prompts are edited only for capitalization. The ChatGPT responses, which appear in block quotes, are entirely unedited. The answers lean towards repetitiveness, and they heavily parrot the questions being asked in a way that reminds me of a stereotypical psychiatrist. Nonetheless, the core of the answers is sensible.
What does it mean when large language models hallucinate?
When a large language model "hallucinates," it means that it has generated text that does not reflect reality, or that does not make sense given the context in which it was generated. This can happen for a variety of reasons, but one common cause is that the model has been trained on a large corpus of text that contains a lot of random or meaningless information. This can cause the model to generate text that is difficult to understand or that seems to come out of nowhere. In general, it is important to carefully evaluate the output of any language model to ensure that it is accurate and meaningful.
Why call it a hallucination?
The term "hallucination" is often used to describe the experience of seeing, hearing, or otherwise perceiving something that is not actually there. In the context of a large language model, the term "hallucination" is used to describe the model's generation of text that does not reflect reality or that does not make sense given the context in which it was generated. This use of the term is somewhat analogous to the way that hallucinations are often described in the context of mental health, where they can be seen as a distortion or misinterpretation of sensory information. However, it is important to note that large language models are not capable of experiencing hallucinations in the same way that humans do, and the use of the term in this context is purely metaphorical.
Hallucination seems like a misleading term for what is better described as an error or garbage.
You are correct that the term "hallucination" can be misleading when used in the context of large language models, as it suggests that the model is capable of experiencing something that is not real, which is not the case. In fact, a more accurate way to describe the phenomenon of a large language model generating text that does not make sense or does not reflect reality would be to call it an error or mistake, as you suggested. This is because the model is not actually hallucinating, but rather it is simply generating incorrect or nonsensical output due to the limitations of its training data or the algorithms it uses to generate text. It is important to use precise and accurate language when discussing the capabilities and limitations of large language models, in order to avoid misunderstandings or confusion.
Machine learning uses a lot of metaphors and analogies to human cognitive function. Why is this? This approach masks what the algorithms are really doing and can be dangerous in some cases.
Machine learning algorithms are often described using metaphors and analogies to human cognitive function because these can be useful for helping people understand how they work. For example, describing a machine learning algorithm as a "brain" or a "neural network" can help people understand that the algorithm is capable of learning from data and making predictions or decisions based on that learning. However, it is important to recognize that these metaphors and analogies are just that - they are not literally true, and they can be misleading if taken too literally.
Using metaphors and analogies to describe machine learning algorithms can be dangerous in some cases because they can mask the complexity and limitations of these algorithms. For example, describing a machine learning algorithm as a "brain" can make it seem like the algorithm has the same capabilities as a human brain, when in reality it is much more limited. This can lead to unrealistic expectations or misunderstandings about what the algorithm can and cannot do.
It is important to use metaphors and analogies to describe machine learning algorithms carefully, and to make it clear that they are just ways of explaining complex concepts in simpler terms. It is also important to always evaluate the output of these algorithms carefully, to ensure that they are producing accurate and meaningful results.
Hallucination in a large language model is analogous to human mental impairment. What "fixes" the impairment?
It is not entirely accurate to say that the phenomenon of a large language model "hallucinating" is analogous to human mental impairment, as this suggests that the model is capable of experiencing something that is not real in the same way that a human would. As I mentioned earlier, the use of the term "hallucination" in the context of a large language model is purely metaphorical, and does not imply that the model is capable of experiencing hallucinations in the same way that humans do.
To address the issue of a large language model generating incorrect or nonsensical output, there are several potential solutions. One approach is to improve the training data for the model, by providing it with a larger and more diverse corpus of text that is more representative of the real world. This can help the model learn to generate more accurate and meaningful text. Another approach is to improve the algorithms that the model uses to generate text, by making them more efficient or better at handling complex or unusual input. Finally, it may be necessary to evaluate the output of the model more carefully, to ensure that it is accurate and meaningful, and to take steps to correct any errors or mistakes that are identified.