Hallucination Is Inevitable: An Innate Limitation of Large Language Models (arxiv.org)

This is pretty obvious, no? LLMs are basically a lossy compression of their dataset. Being lossy, they will necessarily have error (hallucinations). Furthermore, the underlying data is a human approximation of truth, so it will have error as well. Shall we publish a paper on the linked list?
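
To spell out the "lossy implies error" step with a toy sketch (a fixed-size table standing in for the model's parameters; the numbers and names are made up purely for illustration): facts that don't fit get silently clobbered, so recall necessarily has errors.

    # Toy illustration only: a fixed-capacity "memory" standing in for a
    # model's parameters. It cannot hold every fact, so some recalls must fail.
    facts = {f"question-{i}": f"answer-{i}" for i in range(10_000)}

    CAPACITY = 1_000          # far fewer slots than facts -> storage is lossy
    table = [None] * CAPACITY

    def store(key, value):
        # Colliding facts overwrite each other: information is discarded.
        table[hash(key) % CAPACITY] = (key, value)

    def recall(key):
        slot = table[hash(key) % CAPACITY]
        return slot[1] if slot and slot[0] == key else None

    for k, v in facts.items():
        store(k, v)

    lost = sum(recall(k) != v for k, v in facts.items())
    print(f"{lost} of {len(facts)} facts cannot be recalled correctly")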
Hmmm... couldn't you argue it the other way as well? I mean, a hallucination isn't just an incomplete version of a fact that was in the input set -- it's a claim that wasn't in the input set at all.
That's not what we'd expect from a lossy compressor: we'd expect incomplete or missing answers, but not claims that were never in the input set.
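
A toy contrast to make the distinction concrete (a two-sentence corpus and a bigram sampler -- obviously nothing like how real LLMs are built): a lossy lookup can only return a stored answer or nothing, but even this tiny generative model will emit a fluent sentence that appears nowhere in its training data.

    # Toy contrast (illustrative only): a generative model can emit sequences
    # that never occurred in its training data; a lossy store cannot.
    import random
    from collections import defaultdict

    corpus = [
        "paris is the capital of france",
        "berlin is the capital of germany",
    ]

    # "Train" a bigram model: for each word, remember the words that followed it.
    nxt = defaultdict(list)
    for sentence in corpus:
        words = sentence.split()
        for a, b in zip(words, words[1:]):
            nxt[a].append(b)

    def generate(start, max_len=8):
        out = [start]
        while len(out) < max_len and nxt.get(out[-1]):
            out.append(random.choice(nxt[out[-1]]))
        return " ".join(out)

    samples = {generate("paris") for _ in range(50)}
    print(samples - set(corpus))  # typically {"paris is the capital of germany"}:
                                  # a fluent "claim" that is not in the input set

In a compressor, loss shows up as gaps; in a sampler, it shows up as plausible fabrication. That's the difference the paper is pointing at.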