How Much Do LLMs Hallucinate in Document Q&A Scenarios? A 172-Billion-Token Study Across Temperatures, Context Lengths, and Hardware Platforms [TLDR: 25%]

RandAlThor@lemmy.ca · 6 hours ago

Telorand@reddthat.com · 3 hours ago

One in 100. However, that is simply a measure of probability, so do not expect it to hold exactly for every 100 prompts.

For example, if you rolled a 100-sided die 100 times, it’s possible to get a one every time. In practice, it would likely be a mix. You might have a session where you get no wrong answers and times when you get several.
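To put rough numbers on the die analogy, here is a minimal sketch. It assumes each prompt is an independent trial with a fixed 1-in-100 chance of a wrong answer, which is a simplification that real model behavior may not follow; the function name and defaults are made up for illustration.

```python
from math import comb


def prob_wrong_answers(k: int, n: int = 100, p: float = 0.01) -> float:
    """Binomial probability of exactly k wrong answers in n prompts.

    Assumption: prompts are independent and each has the same error rate p.
    """
    return comb(n, k) * p**k * (1 - p) ** (n - k)


none_wrong = prob_wrong_answers(0)           # a clean session of 100 prompts
exactly_one = prob_wrong_answers(1)          # the "expected" single error
two_or_more = 1 - none_wrong - exactly_one   # several errors in one session
all_wrong = prob_wrong_answers(100)          # rolling a one every single time

print(f"no wrong answers:      {none_wrong:.3f}")
print(f"exactly one wrong:     {exactly_one:.3f}")
print(f"two or more wrong:     {two_or_more:.3f}")
print(f"all 100 wrong:         {all_wrong:.1e}")
```

Under those assumptions, a session of 100 prompts has roughly a 37% chance of no wrong answers, a 37% chance of exactly one, and a 26% chance of two or more. Getting a wrong answer every time is possible but astronomically unlikely.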

The problem is that ignorant people trust these models implicitly because they sound convincing and authoritative, and many people are not equipped to vet the information being generated (also notice I didn’t say “retrieved”).