Lesson 4 · Diffusion

How diffusion models work

Diffusion models learn by destroying, then rebuilding. In training, they add noise to a real image bit by bit until it's pure static — then learn to reverse each step. To make a new image, they start from pure noise and undo it, guided toward what you asked for.


Learn to reverse the mess

Diffusion has a clever training trick. Take a real photo and add a little random static, then a little more, then more — after enough steps it's pure noise, like an untuned TV. At each step the model is shown "here's the noisier version; what noise did I just add?" By learning to predict the noise, it learns how to remove it — how to walk the process backwards.
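
As a rough sketch of that training question, the snippet below builds one "noisier version" and scores a guess at the noise. The lesson gives no formulas, so everything here is an assumption: the square-root blend is the one commonly used in DDPM-style models, the 8×8 random array stands in for a real photo, and the names `make_training_example`, `noise_loss`, `remove_noise`, and `keep` are made up for illustration.

```python
import numpy as np

def make_training_example(image, keep, rng):
    """Return (noisy_image, noise).

    `keep` is a hypothetical knob: 1.0 leaves the image untouched,
    values near 0.0 leave almost nothing but static.
    """
    noise = rng.standard_normal(image.shape)
    noisy = np.sqrt(keep) * image + np.sqrt(1.0 - keep) * noise
    return noisy, noise

def noise_loss(predicted_noise, true_noise):
    """Mean squared error between the guess and the noise actually added."""
    return float(np.mean((predicted_noise - true_noise) ** 2))

def remove_noise(noisy, predicted_noise, keep):
    """Undo the blend, given a guess at the noise."""
    return (noisy - np.sqrt(1.0 - keep) * predicted_noise) / np.sqrt(keep)

rng = np.random.default_rng(0)
image = rng.uniform(-1.0, 1.0, size=(8, 8))  # stand-in for a real photo
noisy, noise = make_training_example(image, keep=0.5, rng=rng)

perfect_loss = noise_loss(noise, noise)              # a perfect guess
lazy_loss = noise_loss(np.zeros_like(noise), noise)  # guessing "no noise at all"
recovered = remove_noise(noisy, noise, keep=0.5)     # a perfect guess undoes the blend
```

A real model would replace the perfect guess with a neural network's output, and training would nudge the network to lower `noise_loss` over many photos and noise levels.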

The two directions

  1. Forward (training only): start from a real image, add noise step by step until it's pure static.
  2. Reverse (generation): start from pure static, remove a little noise each step, and a new image appears.
  3. The aim is not to memorise photos but to learn the general skill of turning noise into something realistic.
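
The forward direction in step 1 can be sketched as a simple loop. This is a toy under stated assumptions, not the lesson's own code: the per-step noise amount `beta`, the 200 steps, the gradient "photo", and the helper names `forward_chain` and `similarity` are all chosen here for illustration.

```python
import numpy as np

def forward_chain(image, num_steps, beta, rng):
    """Add a little Gaussian noise at every step; return every intermediate image."""
    frames = [image]
    x = image
    for _ in range(num_steps):
        noise = rng.standard_normal(x.shape)
        # Shrink the current image slightly and mix in fresh static.
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * noise
        frames.append(x)
    return frames

def similarity(a, b):
    """Correlation between two images: near 1 means alike, near 0 means unrelated."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])

rng = np.random.default_rng(0)
image = np.linspace(-1.0, 1.0, 64 * 64).reshape(64, 64)  # smooth gradient as a toy "photo"
frames = forward_chain(image, num_steps=200, beta=0.05, rng=rng)

early = similarity(frames[1], image)   # after one step: still clearly the photo
late = similarity(frames[-1], image)   # after 200 steps: essentially unrelated static
```

Comparing `early` with `late` shows the photo fading into static. The reverse direction needs a trained model to guess the noise, which is why it cannot be written as a loop this simple.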

A sculptor and a block of marble

A sculptor starts with a rough block and chips away, a little at a time, until a statue emerges. Diffusion starts with a block of noise and "chips away" the randomness step by step until an image emerges. Start from different noise and you get a different picture, so it rarely carves the same statue twice.

One key detail

The model doesn't predict the finished image in one shot. It predicts the small bit of noise to remove right now, then repeats. Breaking the job into many small, easier corrections is a large part of why diffusion images look so good. Next: watch that denoising happen, step by step.
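
The "predict a little noise, remove it, repeat" loop might look roughly like the sketch below. It leans on several assumptions the lesson does not state. The "real images" are a toy: one fixed `PATTERN` plus small random variation (`SPREAD`). The function `predict_noise` is a hypothetical stand-in for the trained network, using a closed-form answer that only works for this toy data. The update rule is a deterministic one in the style of DDIM samplers, and the names `generate` and `keep` are invented here.

```python
import numpy as np

# Toy "dataset": every real image is PATTERN plus small per-pixel variation.
PATTERN = np.linspace(-1.0, 1.0, 64).reshape(8, 8)
SPREAD = 0.2

def predict_noise(x, keep):
    """Hypothetical stand-in for the trained network.

    For this toy data the best possible noise guess has a closed form;
    a real model would use a neural network here instead.
    """
    signal, static = np.sqrt(keep), np.sqrt(1.0 - keep)
    return static * (x - signal * PATTERN) / (keep * SPREAD**2 + (1.0 - keep))

def generate(seed, num_steps=50):
    """Start from pure static and remove a little noise at each step."""
    rng = np.random.default_rng(seed)
    keeps = np.linspace(0.001, 1.0, num_steps + 1)  # noisiest level first, clean last
    x = rng.standard_normal(PATTERN.shape)          # the "block of noise"
    for keep_now, keep_next in zip(keeps[:-1], keeps[1:]):
        noise_guess = predict_noise(x, keep_now)
        # Current best guess at the finished image, given the noise guess.
        clean_guess = (x - np.sqrt(1.0 - keep_now) * noise_guess) / np.sqrt(keep_now)
        # Step to the next, slightly less noisy level.
        x = np.sqrt(keep_next) * clean_guess + np.sqrt(1.0 - keep_next) * noise_guess
    return x

sample_a = generate(seed=1)
sample_b = generate(seed=2)
```

Both samples land close to `PATTERN` without matching it or each other, which mirrors the sculptor idea: different starting noise, different result.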

Forward adds noise (training); reverse removes it, step by step (generation).