
Limitations of LLMs

What AI Cannot Do Well — Reasoning, Counting, and Grounding

Large language models, or LLMs, can produce fluent text, summarize information, and answer many questions quickly. Their output often sounds confident and intelligent because they are trained on huge amounts of human language. However, sounding correct is not the same as being correct. Understanding their limits is important for using AI safely in school, work, and daily life.

An LLM predicts likely next words based on patterns in data rather than a true understanding of the world. Because of this, it can invent facts, fail at careful multi-step reasoning, and reflect errors or bias from its training data. It also lacks direct access to real-time information unless connected to reliable tools or databases. Human checking, domain knowledge, and clear evaluation are still necessary whenever accuracy matters.
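
To make this concrete, here is a minimal sketch of next-token prediction using a toy bigram model. Real LLMs use neural networks over subword tokens, not raw word counts, and the corpus and helper name below are purely illustrative assumptions; the point is only that the model reports what is frequent in its data, not what is verified.

```python
from collections import Counter, defaultdict

# Toy training corpus: the model only sees word-pair frequencies, never facts.
corpus = (
    "the capital of france is paris . "
    "the capital of france is paris . "
    "the capital of france is lyon ."
).split()

# Count how often each word follows each preceding word (a bigram model).
follow_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follow_counts[prev][nxt] += 1

def next_token_distribution(prev):
    """Estimate P(next token | previous token) from raw counts."""
    counts = follow_counts[prev]
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()}

# The model reports what is frequent, not what is verified:
print(next_token_distribution("is"))
# {'paris': 0.666..., 'lyon': 0.333...} -- probability, not truth
```

Because "lyon" appears once in the toy corpus, the model assigns it real probability even though the claim is false, which is exactly how fluent but wrong output arises.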

Key Facts

  • An LLM estimates P(next token | previous tokens), so it predicts text patterns rather than proving truth.
  • Fluent output does not guarantee factual accuracy because language probability is not the same as verification.
  • Errors can compound in long reasoning chains because each new step depends on earlier generated text (see the compounding sketch after this list).
  • Training data bias can appear in outputs when patterns in the data are incomplete, unfair, or misleading.
  • Without external tools, an LLM has limited grounding in current events, measurements, and real world state.
  • Reliable use often follows: AI draft + human review + source checking = safer final result.
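
The compounding effect can be estimated with a back-of-the-envelope calculation. The independence assumption and the 95% per-step accuracy are illustrative assumptions, not measured properties of any particular model.

```python
# If each reasoning step is independently correct with probability p,
# an n-step chain is correct end-to-end with probability p ** n.
p = 0.95  # assumed per-step accuracy; illustrative, not a measured figure
for n in (1, 5, 10, 20):
    print(f"{n:2d} steps: {p ** n:.0%} chance the whole chain is right")
# 1 step: 95%, 5 steps: 77%, 10 steps: 60%, 20 steps: 36%
```

Under these assumptions, a chain that is 95% reliable per step is right end-to-end only about 60% of the time after ten steps.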

Vocabulary

Hallucination
A hallucination is an AI-generated statement that sounds plausible but is false, unsupported, or invented.
Bias
Bias is a systematic tendency in data or output that can unfairly favor some ideas, groups, or conclusions.
Grounding
Grounding means connecting an AI response to trusted external facts, data, tools, or real world observations.
Context window
A context window is the amount of text an LLM can consider at one time when generating a response; a truncation sketch follows this vocabulary list.
Verification
Verification is the process of checking whether a claim is correct by using evidence, logic, or reliable sources.
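
As a rough illustration of the context-window idea, the sketch below truncates a conversation to a fixed token budget. The window size, the naive whitespace tokenizer, and the helper name `fit_to_window` are all simplifying assumptions; real models use subword tokenizers and much larger windows.

```python
MAX_TOKENS = 8  # assumed tiny window for illustration; real windows hold thousands of tokens

def fit_to_window(tokens, max_tokens=MAX_TOKENS):
    """Keep only the most recent tokens; anything earlier is invisible to the model."""
    return tokens[-max_tokens:]

history = "alice said the meeting moved to friday then bob asked when".split()
print(fit_to_window(history))
# ['meeting', 'moved', 'to', 'friday', 'then', 'bob', 'asked', 'when']
# The earliest words ('alice said the') fell outside the window and are lost.
```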

Common Mistakes to Avoid

  • Assuming confident wording means the answer is correct, which is wrong because LLMs can present false information in a very polished style.
  • Using AI output without checking sources, which is wrong because unsupported claims, fake citations, or outdated facts may be included; a minimal checking pattern is sketched after this list.
  • Expecting perfect logical consistency across a long conversation, which is wrong because the model can lose track of details or contradict earlier statements.
  • Treating the model like it truly understands meaning and intent, which is wrong because it mainly predicts patterns in language rather than reasoning like a human expert.
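
The "AI draft + human review + source checking" habit from the Key Facts can be expressed as a simple control flow. In this sketch, `ask_model` and `find_supporting_source` are hypothetical stubs, not real APIs; a real integration would replace them with an actual LLM call and an actual search of trusted references.

```python
def ask_model(prompt):
    # Stub: a real integration would call an LLM API here.
    return "The Eiffel Tower was completed in 1889."

def find_supporting_source(claim):
    # Stub: a real integration would search trusted references here.
    known = {
        "The Eiffel Tower was completed in 1889.":
            "https://en.wikipedia.org/wiki/Eiffel_Tower",
    }
    return known.get(claim)

draft = ask_model("When was the Eiffel Tower completed?")
source = find_supporting_source(draft)
if source is None:
    print("UNVERIFIED - route to human review:", draft)
else:
    print(f"Supported ({source}):", draft)
```

The design point is that an unverified draft is never treated as a final answer; it is routed to a human instead.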

Practice Questions

  1. A student checks 20 AI-generated factual claims and finds that 5 are wrong. What percent of the claims are incorrect?
  2. An AI system answers 50 questions. It gets 38 correct, 7 partially correct, and 5 wrong. If only fully correct answers count, what is the accuracy percentage?
  3. Explain why an LLM may produce a believable but false answer even when it has seen many examples during training.
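
After attempting questions 1 and 2, the arithmetic can be checked with a few lines of Python (question 3 is conceptual and has no numeric answer):

```python
# Question 1: fraction of checked claims that were wrong.
wrong, total = 5, 20
print(f"Q1: {wrong / total:.0%} incorrect")            # 25%

# Question 2: only fully correct answers count toward accuracy.
fully_correct, answered = 38, 50
print(f"Q2: {fully_correct / answered:.0%} accuracy")  # 76%
```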