Max Huo

Why does ChatGPT Lie to me? (Hallucinations and Emergent Behavior)

With the recent emergence of AI-based natural language generation, programs such as ChatGPT have gained popularity. However, while using these programs, people have begun to discover that ChatGPT and similar systems tend to provide factually inaccurate information. From misquoting people to contradicting themselves, this tendency to generate false statements can spread misinformation. The behavior is called “hallucinating,” and to understand why it happens, we have to look at what ChatGPT was originally built for and how it was trained.


What is ChatGPT?


ChatGPT is a Large Language Model (LLM) that can be used to generate text. In other words, it takes in text and attempts to predict what should come next, based on the patterns it has seen in its training data.
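
To see what “predicting the next piece of text” means in practice, here is a minimal sketch, not OpenAI’s actual code, that runs the small, publicly available GPT-2 model through the Hugging Face transformers library (assuming that library is installed and the model weights can be downloaded); the prompt is purely illustrative:

```python
# A minimal illustration of next-word prediction, using the publicly
# available GPT-2 model via the Hugging Face `transformers` library.
# This is not how ChatGPT is accessed; it only demonstrates the idea
# that the model continues whatever text it is given.

from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "The weather today is"
result = generator(prompt, max_new_tokens=15, num_return_sequences=1)

# The model appends the words it considers most likely to follow the
# prompt, based purely on patterns in its training text.
print(result[0]["generated_text"])
```

Whatever you type, the model will continue it; nothing in this process checks whether the continuation is true.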


The earliest model created by the company behind ChatGPT, OpenAI, was GPT-1, released in June 2018. Its training data was only about 4.5GB of text, specifically around 7,000 unpublished books of various genres. It was elementary: its main demonstrated strength was classifying the relationship between pairs of sentences into a handful of categories, such as whether one sentence entails or contradicts the other.


Later on, OpenAI developed a second iteration of the GPT series, GPT-2, with a limited version released in February 2019 and the full model released in November of the same year. It was a significant step up in the amount of data fed into the software, with a total training data size of 40GB. This time, the text came from roughly 45 million web pages that had been linked on the popular social media site Reddit (this kind of large-scale scraping of Reddit content for AI training was also one of the reasons Reddit later decided to charge for API access, which in turn forced many third-party apps to shut down). GPT-2 was more advanced, able to summarize paragraphs of text, answer simple questions, and even attempt short translations with some success. Overall, it was a massive step up from GPT-1.


Less than a year later, in May 2020, OpenAI released the next version of GPT, GPT-3. With a more than tenfold increase in training data, GPT-3 was trained on roughly 570GB of filtered web text along with additional sources, including the entirety of English Wikipedia. The underlying architecture wasn’t changed much compared to GPT-1 and GPT-2; the model was mainly scaled up, to 175 billion parameters, so it could absorb far more training data. It was significantly more capable than GPT-2, able to write poetry, articles, resumes, and other long passages with a fluency that often approached ordinary human writing.


The versions of ChatGPT most commonly used today are based on GPT-3.5 or GPT-4. OpenAI hasn’t disclosed the size of either model’s training data. However, judging from the performance jump between GPT-3 and GPT-3.5, we can assume the data was expanded to cover a significantly wider array of material. These models can talk about almost anything and do almost anything that OpenAI’s usage policies allow; outside of particular jailbreaking techniques, they are very hard to force into generating content they treat as disallowed. This brings us to our problem: with such strict limits on what it can generate, how does ChatGPT produce factually incorrect information without realizing it?


Hallucinations


In a paper published by the Association for Computational Linguistics (Ramakrishna et al., 2023), a group of researchers asked GPT-3.5, GPT-4, and other large language models a series of questions that referenced non-existent people and places. In a significant portion of cases, the models did not respond that they failed to recognize the person or place in the question; instead, they gave a grammatically correct but utterly made-up answer.


This happens because ChatGPT, at its core, is text-prediction software. Yes, it has features that incorporate factual information into its responses. Still, when it encounters something it has no underlying information about, it simply predicts what a response should look like, based on patterns observed in its training data. The same applies when ChatGPT generates non-existent quotes: rather than drawing on established data, whether because it doesn’t have access to it or because it can’t use it, the model generates a quote that sounds like something the person might have said and presents it as a factual source.
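
As a toy illustration of why pure pattern-matching produces confident nonsense, here is a deliberately tiny next-word predictor, nothing like GPT’s real architecture, just bigram counts over three made-up sentences. Asked about a place that never appears in its “training data,” it still produces a fluent-looking continuation, because nothing in the procedure can say “I don’t know”:

```python
# A deliberately tiny next-word predictor built from bigram counts.
# It is nowhere near a real GPT model, but it shows the core failure:
# a pattern-matcher keeps producing fluent text even about things it
# has never seen, because it has no notion of "I don't know".

import random
from collections import Counter, defaultdict

# The only "training data" this toy model will ever see.
corpus = (
    "the mayor of springfield opened the new library . "
    "the mayor of springfield praised the new bridge . "
    "the president of the council opened the new bridge ."
).split()

# Count which word tends to follow which.
next_word = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_word[prev][nxt] += 1

def continue_text(prompt, length=8):
    """Extend the prompt by repeatedly picking a likely next word."""
    words = prompt.split()
    for _ in range(length):
        options = next_word.get(words[-1])
        if not options:
            # The last word was never seen in training; fall back to a
            # common pattern instead of admitting ignorance.
            options = next_word["the"]
        choice = random.choices(list(options), weights=list(options.values()))[0]
        words.append(choice)
    return " ".join(words)

# "atlantis" never appears in the corpus, yet the model happily
# continues the sentence with grammar-shaped, ungrounded text.
print(continue_text("the mayor of atlantis"))
```

Real LLMs are vastly more sophisticated, but the underlying objective of predicting the next token is the same, which is why a confident-sounding continuation is always available even when the facts are not.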


Another potential cause is the data it was trained on. As previously stated, ChatGPT is built on a vast amount of data, and although we do not know the true size of its training set, we know it draws on billions, perhaps trillions, of words from countless sources. What could easily happen is that ChatGPT pulls in information it merely thinks is relevant to the conversation and adds it to the answer. This results in responses that seem irrelevant to the original question.


Solutions


One solution OpenAI could implement is a real-time fact-checking system, which could theoretically reduce the amount of factually incorrect information ChatGPT generates. Such a system could be trained in parallel with GPT and would flag or highlight the parts of a response where GPT has likely made a factual error. However, the intrinsic problem remains: if GPT doesn’t realize it is making things up, why would a checker built on the same technology?
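
To make the idea concrete, here is a rough, hypothetical sketch of what such a verification layer might look like. Every helper in it (split_into_claims, retrieve_sources, supports) is a placeholder for a component OpenAI would actually have to build; none of them are real APIs:

```python
# Hypothetical sketch of a post-generation fact-check layer.
# `split_into_claims`, `retrieve_sources`, and `supports` are placeholders
# for components that would have to be built; they are not real APIs.

def split_into_claims(answer: str) -> list[str]:
    """Placeholder: break a generated answer into individual factual claims."""
    return [s.strip() for s in answer.split(".") if s.strip()]

def retrieve_sources(claim: str) -> list[str]:
    """Placeholder: look the claim up in a trusted corpus or search index."""
    return []

def supports(sources: list[str], claim: str) -> bool:
    """Placeholder: decide whether any retrieved source backs the claim."""
    return bool(sources)

def flag_unsupported(answer: str) -> list[str]:
    """Return the claims in an answer that no source could confirm."""
    return [
        claim
        for claim in split_into_claims(answer)
        if not supports(retrieve_sources(claim), claim)
    ]

# Usage: anything returned here would be highlighted for the user.
print(flag_unsupported("GPT-2 was released in 2019. Atlantis has a mayor."))
```

The hard part, of course, is the supports step: deciding whether a source really backs a claim is itself the kind of judgment a language model struggles with, which is exactly the circularity noted above.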


A far simpler solution is for users to fact-check the information these large language models provide. ChatGPT and its counterparts can easily be used to spread misinformation, and one of the easiest ways to combat this is for people to consult other sources and verify the claims. Misinformation might be unavoidable, but we can make sure we don’t fall for it when we see it.



Reference List


Ramakrishna, A., Gupta, R., Lehmann, J. and Ziyadi, M. (2023). INVITE: A Testbed of Automatically Generated Invalid Questions to Evaluate Large Language Models for Hallucinations. [online] Findings of the Association for Computational Linguistics: EMNLP 2023. doi: https://doi.org/10.18653/v1/2023.findings-emnlp.360.


Ye, H., Liu, T., Zhang, A., Hua, W. and Jia, W. (2023). Cognitive Mirage: A Review of Hallucinations in Large Language Models. [online] arXiv.org. Available at: https://arxiv.org/abs/2309.06794 [Accessed 27 Feb. 2024].


