Lesson 9 · In the wild

Where generative AI is used

Generative AI now powers image tools (Midjourney, DALL·E, Imagen), design and marketing work, video and audio creation, and multimodal assistants (GPT-4o, Claude, Gemini) that can interpret images and text together. The same building blocks from this course sit under all of them.

Real-world uses

  • Image creation — Midjourney, DALL·E, Stable Diffusion, Imagen turn prompts into art, mockups, and product shots.
  • Design & marketing — logos, ad variations, storyboards, and concept art in seconds.
  • Video & audio — many text-to-video, voice, and music generators build on the same diffusion ideas.
  • Multimodal assistants — GPT-4o, Claude, Gemini answer questions about images, charts, and screenshots.

The same blocks, everywhere

Behind this variety is a small set of ideas you now know: a generative model that learns the patterns in its training data and samples new examples from them; diffusion, which turns random noise into an image by removing the noise step by step; CLIP-style shared embedding spaces that link words and pictures; and multimodal models that read text and images together. New tools mostly recombine these blocks.
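
As a reminder of how one of these blocks works, here is a minimal sketch of the CLIP-style idea: captions and images are mapped into the same vector space, and the closest image to a caption is found by cosine similarity. The embeddings, labels, and the `best_match` helper below are hypothetical, hand-made values for illustration only; a real system would get its vectors from trained text and image encoders.

```python
import numpy as np

def normalize(v):
    """Scale each row to unit length so dot products equal cosine similarities."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Hypothetical 3-dimensional embeddings (assumption: made up by hand,
# not produced by any real model).
image_labels = ["dog photo", "car photo", "bar chart"]
image_embeddings = normalize([
    [0.9, 0.1, 0.0],   # stands in for a photo of a dog
    [0.1, 0.9, 0.1],   # stands in for a photo of a car
    [0.0, 0.2, 0.9],   # stands in for a bar chart
])

captions = ["a puppy", "a chart of sales"]
text_embeddings = normalize([
    [0.8, 0.2, 0.1],    # stands in for the caption "a puppy"
    [0.1, 0.1, 0.95],   # stands in for the caption "a chart of sales"
])

def best_match(text_vecs, image_vecs):
    """Return, for each text vector, the index of the most similar image vector."""
    similarity = text_vecs @ image_vecs.T   # shape: (num_texts, num_images)
    return similarity.argmax(axis=1)

matches = best_match(text_embeddings, image_embeddings)
for caption, idx in zip(captions, matches):
    print(f"{caption!r} -> {image_labels[idx]}")
```

With these toy vectors, "a puppy" lands nearest the dog photo and "a chart of sales" nearest the bar chart. The same nearest-neighbor comparison is what lets a shared space connect words to pictures.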

You've finished the course

You can now explain generative AI end to end: what a generative model is, how GANs and VAEs make images, how diffusion turns noise into pictures step by step, how multimodal AI and CLIP link words with images, and how text-to-image ties it all together. Ready to go deeper? The links below continue the journey.

One toolkit — generative models, diffusion, CLIP — behind many everyday AI tools.
Explore the full question bank →