Lesson 1 · Foundations

What is an LLM?

A large language model (LLM) is a program trained on huge amounts of text to do one deceptively simple thing: predict the next word (more precisely, the next token). Chatbots like ChatGPT are built on LLMs. They are powerful, but their knowledge is frozen at the moment training ended: on their own, they can't see anything newer.

It's autocomplete, scaled up enormously

You already know a tiny language model: the autocomplete on your phone. Type "see you" and it suggests "later." An LLM is that same idea trained on a very large slice of the internet, books, and code — so instead of finishing a text message, it can finish an essay, write code, or answer a question. Under the hood it is always doing the same move: given the words so far, what word most likely comes next?
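To make the autocomplete idea concrete, here is a minimal sketch of a word-level predictor that counts which word follows which. The training text, `train_bigrams`, and `suggest_next` are all made up for illustration. A real LLM uses a neural network that looks at the whole text so far, not a count table keyed on the previous word, but the question it answers has the same shape.

```python
from collections import Counter, defaultdict

# Tiny, made-up training text (an assumption for illustration only).
CORPUS = "see you later . see you soon . see you later . talk to you later ."


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def suggest_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]


model = train_bigrams(CORPUS)
print(suggest_next(model, "you"))  # "later" follows "you" 3 times, "soon" once
```

Note that `suggest_next(model, "banana")` returns `None`: this toy admits when it has never seen a word. An LLM has no such built-in "I don't know" and will still produce a continuation.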

The developer's version

Think of an LLM as a function: text in, text out. You don't call methods on it; you describe what you want in plain language (the "prompt") and it returns a best-guess continuation. There is no database query, no lookup — the answer is generated one token at a time from patterns it learned during training.
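The "text in, text out" shape can be sketched as a single function that appends one token at a time until it decides to stop. Everything here is hypothetical: `complete` is not a real library call, and the hand-written `NEXT_TOKEN` table merely stands in for the patterns a real model learns during training.

```python
# Hypothetical stand-in for learned patterns: last token -> likely next token.
NEXT_TOKEN = {
    "the": "capital",
    "capital": "of",
    "of": "France",
    "France": "is",
    "is": "Paris",
    "Paris": ".",
}


def complete(prompt: str, max_tokens: int = 10) -> str:
    """Text in, text out: extend the prompt one token at a time."""
    tokens = prompt.split()
    for _ in range(max_tokens):
        # Pick the next token from the fixed patterns; nothing is looked up elsewhere.
        nxt = NEXT_TOKEN.get(tokens[-1]) if tokens else None
        if nxt is None:
            break  # no pattern for this token, so the toy stops generating
        tokens.append(nxt)
        if nxt == ".":
            break  # treat the period as an end-of-text marker
    return " ".join(tokens)


print(complete("the capital of France"))  # the capital of France is Paris .
```

The table never changes after it is written, which mirrors the frozen snapshot: if the facts change, the function keeps returning the old continuation until someone rebuilds the table.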

Three things to remember

  • An LLM predicts the next token (a token is roughly a word or word-piece), over and over, to build a response.
  • Its "knowledge" is baked in during training — like a snapshot taken on a certain date. It does not automatically know today's news or your private documents.
  • It is confident by design. It will produce a fluent answer even when it is wrong — which is exactly the problem the next lesson is about.

Why this matters for RAG

RAG (retrieval-augmented generation) exists to fix the last two points: knowledge that is frozen and blind to your private documents, and confident guessing. To understand RAG, you first need to feel why an LLM alone is not enough. That's next.

An LLM completes text by repeatedly predicting the most likely next token.
Next: why AI hallucinates →