Lesson 5 · Retrieval

Semantic search: finding by meaning

Semantic search finds text by meaning instead of exact keywords. You embed your documents once and store the vectors; then you embed the question and grab the stored vectors closest to it. That's why a search for "how do I get my money back" can find a page titled "Refund policy" — no shared words, but very close in meaning.

Keyword search vs. meaning search

Old-style search matches words: type "refund" and it finds pages containing "refund." Phrase it differently, say "get my money back," and a strict keyword match finds nothing. Semantic search matches meaning: it turns your query into a vector and finds the stored vectors nearest to it, so "get my money back" and "refund policy" land together even though they share no words.
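
To see the keyword miss concretely, here is a minimal sketch. The helper `shared_words` is hypothetical and splits on whitespace only; a real search engine tokenizes more carefully, but the point is the same.

```python
def shared_words(query: str, title: str) -> set:
    """Return the lowercase words that appear in both strings."""
    return set(query.lower().split()) & set(title.lower().split())


query = "how do I get my money back"

# No word in common, so a strict keyword match has nothing to go on.
print(shared_words(query, "Refund policy"))            # set()

# The match only works when the user happens to type the page's own word.
print(shared_words("refund status", "Refund policy"))  # {'refund'}
```

The empty set is exactly the gap that comparing vectors, rather than words, is meant to close.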

How semantic search works

  1. Ahead of time (indexing): split your documents into chunks, embed each chunk into a vector, and store them.
  2. At question time: embed the user's query into a vector.
  3. Compare the query vector to the stored vectors, typically by cosine similarity or a distance measure, and return the closest few. Those are your best matches.
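
The three steps above can be sketched in a few lines. This is a toy under stated assumptions: the 3-number vectors in `INDEX` and `query_vec` are invented by hand for illustration, where a real system would get them from an embedding model, and cosine similarity is one common choice of "closeness," not the only one.

```python
import math


def cosine(a, b):
    """Cosine similarity: near 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Step 1 (indexing): each chunk is stored next to its vector.
# These vectors are made up; an embedding model would produce the real ones.
INDEX = {
    "Refund policy": [0.9, 0.1, 0.0],
    "Shipping times": [0.1, 0.9, 0.1],
    "Account security": [0.0, 0.2, 0.9],
}


def search(query_vec, index, k=2):
    """Step 3: score every stored vector against the query, keep the top k."""
    scored = [(cosine(query_vec, vec), chunk) for chunk, vec in index.items()]
    scored.sort(reverse=True)
    return [chunk for _, chunk in scored[:k]]


# Step 2: a made-up vector standing in for "how do I get my money back".
query_vec = [0.8, 0.2, 0.1]

print(search(query_vec, INDEX))  # ['Refund policy', 'Shipping times']
```

Note that `search` compares the query against every stored vector. That is fine for three chunks and becomes the bottleneck as the collection grows.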

Where a vector database comes in

If you have thousands or millions of chunks, comparing the query against every single one is slow. A vector database is a specialized store built to find the nearest vectors fast, even among millions, usually with an approximate nearest-neighbor index that skips most of the comparisons. Think of it as an index, but one organized by meaning instead of by alphabetical keyword.
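
One way to get a feel for how such an index avoids checking everything is random-hyperplane hashing, sketched below. This is a stand-in chosen for brevity, not how any particular vector database works (production systems often use other structures such as HNSW or IVF). All function names, the random vectors, and the parameters are assumptions made for the example.

```python
import random


def make_planes(dim, n_planes, seed=0):
    """Random hyperplanes through the origin, one per signature bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]


def signature(vec, planes):
    """One bit per plane: which side of the plane the vector falls on."""
    return tuple(
        sum(v * p for v, p in zip(vec, plane)) >= 0 for plane in planes
    )


def build_buckets(vectors, planes):
    """Group chunk ids by signature; nearby vectors tend to share a bucket."""
    buckets = {}
    for chunk_id, vec in vectors.items():
        buckets.setdefault(signature(vec, planes), []).append(chunk_id)
    return buckets


def candidates(query_vec, buckets, planes):
    """Only the query's own bucket is searched, not the whole collection."""
    return buckets.get(signature(query_vec, planes), [])


# 1,000 random 8-dimensional vectors standing in for embedded chunks.
rng = random.Random(1)
vectors = {i: [rng.gauss(0, 1) for _ in range(8)] for i in range(1000)}

planes = make_planes(dim=8, n_planes=6)
buckets = build_buckets(vectors, planes)

query_vec = vectors[42]  # reuse a stored vector so one match is guaranteed
shortlist = candidates(query_vec, buckets, planes)
print(len(shortlist), "candidates to compare instead of", len(vectors))
```

The trade-off is that the shortlist is approximate: a true neighbor that falls just across a hyperplane lands in a different bucket and is missed. Real indexes use extra tricks to keep that miss rate low.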

We've built every piece of RAG

Retrieve-by-meaning is the "Retrieval" in Retrieval-Augmented Generation. You now have all the parts: an LLM that generates, and semantic search that retrieves. The next lesson snaps them together.

The query becomes a vector; the nearest stored chunks are the matches.