Generative models learn to sample from, score, or transform data distributions. They are the mathematical foundation behind language generation, image generation, latent-variable modeling, diffusion-based generation, and synthetic data.
Overview
The goal is to model: $p_\theta(x) \approx p_{\text{data}}(x)$, a parametric approximation to the unknown data distribution.
Different model families choose different paths: autoregressive factorization, latent-variable bounds, adversarial games, invertible transformations, or iterative denoising.
Prerequisites
- Probability, likelihood, KL divergence, and ELBO
- Neural networks and optimization
- Autoregressive sequence probability
- Basic Gaussian sampling and matrix operations
Companion Notebooks
| Notebook | Purpose |
|---|---|
| theory.ipynb | Demonstrates autoregressive likelihood, VAE reparameterization and KL, GAN losses, flow change of variables, diffusion noising, score updates, and FID intuition. |
| exercises.ipynb | Ten practice problems for generative-model objectives and diagnostics. |
Learning Objectives
After this section, you should be able to:
- Compare autoregressive models, VAEs, GANs, flows, diffusion models, and score models.
- Compute autoregressive log likelihood.
- Write and interpret the VAE ELBO and reparameterization trick.
- Explain GAN minimax training and mode collapse.
- Apply the change-of-variables formula for a simple flow.
- Simulate diffusion noising and denoising loss.
- Explain score matching and guidance.
- Evaluate generative models with likelihood, diversity, sample quality, and FID-style statistics.
Table of Contents
- Generative Modeling Goal
- 1.1 Data distribution
- 1.2 Model distribution
- 1.3 Sampling
- 1.4 Likelihood
- 1.5 Conditional generation
- Autoregressive Models
- 2.1 Chain rule
- 2.2 Teacher forcing
- 2.3 Sampling loop
- 2.4 Exact likelihood
- 2.5 Cost
- Variational Autoencoders
- 3.1 Latent variable model
- 3.2 Encoder
- 3.3 Decoder
- 3.4 ELBO
- 3.5 Reparameterization
- GANs
- 4.1 Generator
- 4.2 Discriminator
- 4.3 Minimax objective
- 4.4 Mode collapse
- 4.5 No direct likelihood
- Normalizing Flows
- 5.1 Invertible map
- 5.2 Change of variables
- 5.3 Exact likelihood
- 5.4 Architectural constraint
- 5.5 Sampling
- Diffusion Models
- 6.1 Forward noising
- 6.2 Closed-form noising
- 6.3 Denoising model
- 6.4 Training objective
- 6.5 Reverse sampling
- Score-Based View
- 7.1 Score
- 7.2 Denoising score matching
- 7.3 Langevin sampling
- 7.4 SDE view
- 7.5 Guidance
- Evaluation
- 8.1 Log likelihood
- 8.2 Sample quality
- 8.3 Diversity
- 8.4 FID intuition
- 8.5 Precision recall
- Applications and Tradeoffs
- 9.1 Text
- 9.2 Images
- 9.3 Representation learning
- 9.4 Data augmentation
- 9.5 Safety and misuse
- Diagnostics
- 10.1 Likelihood versus samples
- 10.2 Latent traversals
- 10.3 Mode coverage
- 10.4 Denoising curves
- 10.5 Ablations
Model Family Map
| Family | Likelihood | Sampling | Main strength | Main cost |
|---|---|---|---|---|
| Autoregressive | Exact | Sequential | Strong likelihood and text generation | Serial generation |
| VAE | Lower bound | Fast | Latent representation | Blurry samples if decoder weak |
| GAN | Usually unavailable | Fast | Sharp samples | Unstable training, mode collapse |
| Flow | Exact | Fast-ish | Exact density and invertibility | Architecture constraints |
| Diffusion | Variational/score view | Iterative | High sample quality and controllability | Many denoising steps |
1. Generative Modeling Goal
This part studies the goal of generative modeling: what it means to model, sample from, and evaluate a data distribution.

| Subtopic | Question | Formula |
|---|---|---|
| Data distribution | learn a model of how examples are generated | $x \sim p_{\text{data}}(x)$ |
| Model distribution | choose a parametric distribution family | $p_\theta(x) \approx p_{\text{data}}(x)$ |
| Sampling | generate new examples from the model | $x \sim p_\theta(x)$ |
| Likelihood | some models assign exact or approximate probabilities | $\log p_\theta(x)$ |
| Conditional generation | generate using labels, prompts, or context | $p_\theta(x \mid c)$ |
1.1 Data distribution
Main idea. Learn a model of how examples are generated.
Core relation: $x \sim p_{\text{data}}(x)$, an unknown distribution represented only through samples.
Generative models differ in what they make easy. Autoregressive models make likelihood straightforward but sampling serial. VAEs make latent variables explicit but optimize a bound. GANs can sample sharply but lack direct likelihood. Flows give exact likelihood but require invertible architectures. Diffusion models train by denoising and sample through iterative refinement.
Worked micro-example. In a Gaussian VAE encoder, $q_\phi(z \mid x) = \mathcal{N}\big(\mu_\phi(x), \operatorname{diag}(\sigma_\phi^2(x))\big)$. Instead of sampling $z$ as an opaque random variable, write $z = \mu_\phi(x) + \sigma_\phi(x) \odot \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, I)$. This keeps randomness outside the parameters and lets gradients flow through $\mu_\phi$ and $\sigma_\phi$.
Implementation check. Always say what is being optimized: exact likelihood, ELBO, adversarial loss, score matching, or denoising loss. Different objectives imply different diagnostics.
AI connection. Every practical generative system, from language models to image generators, starts by positing a model of how its training data was generated.
Common mistake. Do not compare generative models only by one metric. Likelihood, visual quality, diversity, controllability, sampling cost, and safety behavior can disagree.
1.2 Model distribution
Main idea. Choose a parametric distribution family.
Core relation: $p_\theta(x) \approx p_{\text{data}}(x)$
1.3 Sampling
Main idea. Generate new examples from the model.
Core relation: $x \sim p_\theta(x)$
1.4 Likelihood
Main idea. Some models assign exact or approximate probabilities.
Core relation: $\log p_\theta(x)$, exact for some families and bounded or unavailable for others.
1.5 Conditional generation
Main idea. Generate using labels, prompts, or context.
Core relation: $x \sim p_\theta(x \mid c)$ for a label, prompt, or other context $c$.
2. Autoregressive Models
This part studies autoregressive models, which factor a joint distribution into a product of conditionals.

| Subtopic | Question | Formula |
|---|---|---|
| Chain rule | factorize data into conditional predictions | $p_\theta(x) = \prod_{t=1}^{T} p_\theta(x_t \mid x_{<t})$ |
| Teacher forcing | train conditionals with true previous tokens | $\max_\theta \sum_t \log p_\theta(x_t \mid x_{<t})$ |
| Sampling loop | generate one variable at a time | $x_t \sim p_\theta(x_t \mid x_{<t})$ |
| Exact likelihood | log likelihood is a sum of conditional log probabilities | $\log p_\theta(x) = \sum_t \log p_\theta(x_t \mid x_{<t})$ |
| Cost | generation is sequential in the generated dimension | $O(T)$ serial steps |
2.1 Chain rule
Main idea. Factorize data into conditional predictions.
Core relation: $p_\theta(x) = \prod_{t=1}^{T} p_\theta(x_t \mid x_{<t})$
2.2 Teacher forcing
Main idea. Train conditionals with true previous tokens.
Core relation: maximize $\sum_{t=1}^{T} \log p_\theta(x_t \mid x_{<t})$, conditioning each term on the ground-truth prefix $x_{<t}$.
2.3 Sampling loop
Main idea. Generate one variable at a time.
Core relation: $x_t \sim p_\theta(x_t \mid x_{<t})$ for $t = 1, \dots, T$, feeding each sample back as context.
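The loop below is a minimal sketch of this ancestral sampling procedure in NumPy; `toy_conditional` is a hypothetical stand-in for a trained model's $p_\theta(x_t \mid x_{<t})$, not a real architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
V, T = 5, 8                        # vocabulary size, sequence length

def toy_conditional(prefix):
    """Stand-in for a learned p(x_t | x_<t): a softmax over V symbols
    that depends (arbitrarily) on the last token of the prefix."""
    last = prefix[-1] if prefix else 0
    logits = np.arange(V) * 0.1 + (last % V) * 0.05
    p = np.exp(logits - logits.max())
    return p / p.sum()

# Ancestral sampling: draw one token at a time from p(x_t | x_<t),
# then append it to the context for the next step.
tokens = []
for t in range(T):
    p = toy_conditional(tokens)
    tokens.append(int(rng.choice(V, p=p)))
print(tokens)
```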
2.4 Exact likelihood
Main idea. Log likelihood is a sum of conditional log probabilities.
Core relation: $\log p_\theta(x) = \sum_{t=1}^{T} \log p_\theta(x_t \mid x_{<t})$
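A small sketch of the exact log-likelihood computation under teacher forcing: each conditional is evaluated on the true prefix. The `uniform_conditional` here is a placeholder for a trained model, so the printed value is just $4 \log(1/5)$.

```python
import numpy as np

def uniform_conditional(prefix, V=5):
    # Hypothetical stand-in for a trained model's p(x_t | x_<t).
    return np.full(V, 1.0 / V)

def sequence_log_likelihood(tokens, conditional):
    """log p(x) = sum_t log p(x_t | x_<t), with each conditional
    evaluated on the ground-truth prefix (teacher forcing)."""
    return sum(np.log(conditional(tokens[:t])[x_t])
               for t, x_t in enumerate(tokens))

print(sequence_log_likelihood([1, 3, 2, 0], uniform_conditional))  # 4 * log(1/5)
```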
2.5 Cost
Main idea. Generation is sequential in the generated dimension.
Core relation: $O(T)$ serial steps to generate $T$ variables.
3. Variational Autoencoders
This part studies variational autoencoders, which pair a latent-variable generator with an approximate inference network.

| Subtopic | Question | Formula |
|---|---|---|
| Latent variable model | generate data from a latent code | $p_\theta(x) = \int p_\theta(x \mid z)\, p(z)\, dz$ |
| Encoder | approximate posterior over latent variables | $q_\phi(z \mid x)$ |
| Decoder | map latent samples to data distribution | $p_\theta(x \mid z)$ |
| ELBO | optimize a tractable lower bound | $\mathbb{E}_{q_\phi}[\log p_\theta(x \mid z)] - \mathrm{KL}(q_\phi \,\Vert\, p)$ |
| Reparameterization | sample differentiably from a Gaussian posterior | $z = \mu + \sigma \odot \varepsilon$ |
3.1 Latent variable model
Main idea. Generate data from a latent code.
Core relation: $p_\theta(x) = \int p_\theta(x \mid z)\, p(z)\, dz$ with prior $p(z)$, typically $\mathcal{N}(0, I)$.
3.2 Encoder
Main idea. Approximate posterior over latent variables.
Core relation: $q_\phi(z \mid x)$, a tractable approximation to the true posterior $p_\theta(z \mid x)$.
3.3 Decoder
Main idea. Map latent samples to data distribution.
Core relation: $p_\theta(x \mid z)$
3.4 ELBO
Main idea. Optimize a tractable lower bound.
Core relation: $\log p_\theta(x) \ge \mathbb{E}_{q_\phi(z \mid x)}[\log p_\theta(x \mid z)] - \mathrm{KL}\big(q_\phi(z \mid x) \,\Vert\, p(z)\big)$
AI connection. This is the bridge from probabilistic latent variables to trainable variational autoencoders.
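To make the bound concrete, here is a one-sample Monte Carlo ELBO estimate. It assumes a Gaussian encoder with diagonal covariance and a standard normal prior (so the KL term is closed form), plus a hypothetical fixed-variance Gaussian decoder `decode`; a real model would parameterize all three with networks.

```python
import numpy as np

def gaussian_kl(mu, sigma):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), closed form per dimension."""
    return 0.5 * np.sum(sigma**2 + mu**2 - 1.0 - 2.0 * np.log(sigma))

def elbo_estimate(x, mu, sigma, decode_logprob, rng):
    """One-sample Monte Carlo ELBO:
    E_q[log p(x|z)] - KL(q(z|x) || p(z)), with z reparameterized."""
    eps = rng.standard_normal(mu.shape)
    z = mu + sigma * eps                     # reparameterized sample
    return decode_logprob(x, z) - gaussian_kl(mu, sigma)

# Hypothetical decoder: unit-variance Gaussian likelihood around a linear map.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 2))
decode = lambda x, z: -0.5 * np.sum((x - W @ z) ** 2)  # log N(x; Wz, I) + const
x = rng.standard_normal(4)
print(elbo_estimate(x, mu=np.zeros(2), sigma=np.ones(2),
                    decode_logprob=decode, rng=rng))
```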
3.5 Reparameterization
Main idea. Sample differentiably from a Gaussian posterior.
Core relation: $z = \mu_\phi(x) + \sigma_\phi(x) \odot \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, I)$.
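A minimal numeric check of why the trick works: for $z = \mu + \sigma \varepsilon$, the pathwise gradient satisfies $\partial_\mu \, \mathbb{E}[f(z)] = \mathbb{E}[f'(z)]$, which can be estimated by sampling. The toy objective $f(z) = z^2$ is chosen so the analytic answer, $2\mu$, is known.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 0.5, 1.2
# Toy objective whose expectation we want to differentiate w.r.t. mu.
# f(z) = z^2, so f'(z) = 2z and the analytic gradient is 2*mu = 1.0.

# Pathwise (reparameterized) gradient: d/dmu E[f(mu + sigma*eps)] = E[f'(z)].
eps = rng.standard_normal(100_000)
z = mu + sigma * eps
grad_mu = np.mean(2 * z)
print(grad_mu)   # close to 1.0
```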
4. GANs
This part studies GANs, which replace likelihood with an adversarial training signal.

| Subtopic | Question | Formula |
|---|---|---|
| Generator | map noise to synthetic samples | $x = G(z)$, $z \sim p(z)$ |
| Discriminator | score whether samples look real | $D(x) \in (0, 1)$ |
| Minimax objective | train generator and discriminator adversarially | $\min_G \max_D V(D, G)$ |
| Mode collapse | generator may cover only part of the data distribution | $p_G$ misses modes |
| No direct likelihood | standard GANs sample well but do not provide easy likelihoods | $\log p_G(x)$ unavailable |
4.1 Generator
Main idea. Map noise to synthetic samples.
Core relation: $x = G(z)$ with $z \sim p(z)$, a simple noise distribution.
4.2 Discriminator
Main idea. Score whether samples look real.
Core relation: $D(x) \in (0, 1)$, an estimate of the probability that $x$ is real.
4.3 Minimax objective
Main idea. Train generator and discriminator adversarially.
Core relation: $\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p(z)}[\log(1 - D(G(z)))]$
AI connection. GANs trade likelihood for an adversarial learning signal that can produce sharp samples.
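A sketch of the losses as they are usually implemented, assuming discriminator outputs already squashed into $(0, 1)$. Note it uses the common non-saturating generator loss rather than the literal minimax form, which gives weak gradients early in training.

```python
import numpy as np

def bce_gan_losses(d_real, d_fake):
    """GAN losses from discriminator outputs in (0, 1).
    d_real: D(x) on real data; d_fake: D(G(z)) on generated data."""
    eps = 1e-12  # numerical guard against log(0)
    d_loss = (-np.mean(np.log(d_real + eps))
              - np.mean(np.log(1.0 - d_fake + eps)))
    g_loss = -np.mean(np.log(d_fake + eps))   # non-saturating generator loss
    return d_loss, g_loss

d_real = np.array([0.9, 0.8, 0.95])   # hypothetical discriminator outputs
d_fake = np.array([0.2, 0.3, 0.1])
print(bce_gan_losses(d_real, d_fake))
```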
4.4 Mode collapse
Main idea. Generator may cover only part of the data distribution.
Core relation: $p_G$ misses modes of $p_{\text{data}}$ while still producing plausible samples.
4.5 No direct likelihood
Main idea. Standard GANs sample well but do not provide easy likelihoods.
Core relation: $\log p_G(x)$ is unavailable in closed form.
5. Normalizing Flows
This part studies normalizing flows, which build densities from invertible transformations.

| Subtopic | Question | Formula |
|---|---|---|
| Invertible map | transform simple noise into data with an invertible function | $x = f(z)$, $z = f^{-1}(x)$ |
| Change of variables | density uses Jacobian determinant | $\log p_X(x) = \log p_Z(z) + \log \lvert \det J_{f^{-1}}(x) \rvert$ |
| Exact likelihood | flows can train by maximum likelihood | $\max_\theta \sum_i \log p_X(x_i)$ |
| Architectural constraint | layers must be invertible and have tractable Jacobians | $\det J$ tractable |
| Sampling | sample $z$ then apply $f$ | $z \sim p_Z$, $x = f(z)$ |
5.1 Invertible map
Main idea. Transform simple noise into data with an invertible function.
Core relation: $x = f(z)$ with $f$ invertible, so $z = f^{-1}(x)$ is always recoverable.
5.2 Change of variables
Main idea. Density uses the Jacobian determinant.
Core relation: $\log p_X(x) = \log p_Z(f^{-1}(x)) + \log \lvert \det J_{f^{-1}}(x) \rvert$
AI connection. Flows are powerful because they keep exact likelihood through invertible maps.
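A minimal flow to exercise the formula, assuming an elementwise affine map $x = a \odot z + b$ so the Jacobian is diagonal and its log-determinant is just $\sum_i \log \lvert a_i \rvert$; real flows stack many such constrained layers.

```python
import numpy as np

# Minimal flow: x = f(z) = a * z + b elementwise, with z ~ N(0, I).
a = np.array([2.0, 0.5])
b = np.array([1.0, -1.0])

def log_prob_x(x):
    """Change of variables: log p_X(x) = log p_Z(f^{-1}(x)) + log|det J_{f^-1}|.
    For the elementwise affine map, log|det J_{f^-1}| = -sum(log|a|)."""
    z = (x - b) / a
    log_pz = -0.5 * np.sum(z**2) - 0.5 * len(z) * np.log(2 * np.pi)
    return log_pz - np.sum(np.log(np.abs(a)))

rng = np.random.default_rng(0)
z = rng.standard_normal(2)
x = a * z + b                 # sampling direction: draw z, then apply f
print(log_prob_x(x))          # exact density of the sample under the flow
```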
5.3 Exact likelihood
Main idea. Flows can train by maximum likelihood.
Core relation: $\max_\theta \frac{1}{N} \sum_{i=1}^{N} \log p_X(x_i)$
5.4 Architectural constraint
Main idea. Layers must be invertible and have tractable Jacobians.
Core relation: $\det J$ must be tractable, typically via triangular or otherwise structured Jacobians.
5.5 Sampling
Main idea. Sample $z$, then apply $f$.
Core relation: $z \sim p_Z$, $x = f(z)$.
6. Diffusion Models
This part studies diffusion models, which generate by learning to reverse a gradual noising process.

| Subtopic | Question | Formula |
|---|---|---|
| Forward noising | gradually corrupt data with Gaussian noise | $q(x_t \mid x_{t-1}) = \mathcal{N}(\sqrt{1 - \beta_t}\, x_{t-1}, \beta_t I)$ |
| Closed-form noising | sample noisy $x_t$ directly from $x_0$ | $x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \varepsilon$ |
| Denoising model | learn to predict noise or clean data | $\varepsilon_\theta(x_t, t)$ |
| Training objective | common DDPM loss predicts added noise | $\mathbb{E}\big[\lVert \varepsilon - \varepsilon_\theta(x_t, t) \rVert^2\big]$ |
| Reverse sampling | start from noise and denoise step by step | $x_{t-1} \sim p_\theta(x_{t-1} \mid x_t)$ |
6.1 Forward noising
Main idea. Gradually corrupt data with Gaussian noise.
Core relation: $q(x_t \mid x_{t-1}) = \mathcal{N}\big(\sqrt{1 - \beta_t}\, x_{t-1},\; \beta_t I\big)$ with a variance schedule $\{\beta_t\}$.
6.2 Closed-form noising
Main idea. Sample noisy $x_t$ directly from $x_0$.
Core relation: $x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, I)$ and $\bar{\alpha}_t = \prod_{s \le t} (1 - \beta_s)$.
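A sketch of closed-form noising under an assumed linear $\beta_t$ schedule (one common choice); the printed signal scale $\sqrt{\bar{\alpha}_t}$ shows the data being progressively destroyed as $t$ grows.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)        # linear schedule (assumed)
alpha_bar = np.cumprod(1.0 - betas)       # \bar{alpha}_t

def noise_sample(x0, t, rng):
    """Closed-form forward noising:
    x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps, eps ~ N(0, I)."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

x0 = rng.standard_normal(4)
for t in (0, 499, 999):
    xt, _ = noise_sample(x0, t, rng)
    print(t, np.round(np.sqrt(alpha_bar[t]), 3))   # signal scale shrinks with t
```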
6.3 Denoising model
Main idea. Learn to predict noise or clean data.
Core relation: $\varepsilon_\theta(x_t, t)$, or equivalently a prediction of the clean sample $\hat{x}_0(x_t, t)$.
6.4 Training objective
Main idea. The common DDPM loss predicts the added noise.
Core relation: $\mathcal{L} = \mathbb{E}_{x_0, t, \varepsilon}\big[\lVert \varepsilon - \varepsilon_\theta(x_t, t) \rVert^2\big]$
AI connection. Diffusion training often becomes a supervised denoising problem.
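A sketch of the training objective as a Monte Carlo estimate over random timesteps. `eps_model` is a hypothetical noise predictor; here an untrained one that outputs zeros, so the loss is roughly the data dimension ($\mathbb{E}\lVert \varepsilon \rVert^2$).

```python
import numpy as np

def ddpm_loss(x0_batch, eps_model, alpha_bar, rng):
    """Monte Carlo estimate of E_{x0, t, eps}[ ||eps - eps_theta(x_t, t)||^2 ]."""
    T = len(alpha_bar)
    total = 0.0
    for x0 in x0_batch:
        t = rng.integers(T)                       # uniform random timestep
        eps = rng.standard_normal(x0.shape)
        xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1 - alpha_bar[t]) * eps
        total += np.sum((eps - eps_model(xt, t)) ** 2)
    return total / len(x0_batch)

rng = np.random.default_rng(0)
alpha_bar = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 1000))
batch = rng.standard_normal((8, 4))
# Hypothetical (untrained) noise predictor that always outputs zero:
print(ddpm_loss(batch, lambda xt, t: np.zeros_like(xt), alpha_bar, rng))
```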
6.5 Reverse sampling
Main idea. Start from noise and denoise step by step.
Core relation: start from $x_T \sim \mathcal{N}(0, I)$ and sample $x_{t-1} \sim p_\theta(x_{t-1} \mid x_t)$ for $t = T, \dots, 1$.
7. Score-Based View
This part studies the score-based view, which models the gradient of the log density instead of the density itself.

| Subtopic | Question | Formula |
|---|---|---|
| Score | gradient of log density with respect to data | $s(x) = \nabla_x \log p(x)$ |
| Denoising score matching | learn scores from noisy samples | $\mathbb{E}\big[\lVert s_\theta(\tilde{x}) - \nabla_{\tilde{x}} \log q(\tilde{x} \mid x) \rVert^2\big]$ |
| Langevin sampling | move samples toward high-density regions plus noise | $x_{k+1} = x_k + \tfrac{\eta}{2}\, s_\theta(x_k) + \sqrt{\eta}\, \xi_k$ |
| SDE view | continuous-time noising and denoising processes | $dx = f(x, t)\, dt + g(t)\, dw$ |
| Guidance | condition generation by modifying scores or logits | $\nabla_x \log p(x) + \nabla_x \log p(c \mid x)$ |
7.1 Score
Main idea. Gradient of log density with respect to data.
Core relation: $s(x) = \nabla_x \log p(x)$
7.2 Denoising score matching
Main idea. Learn scores from noisy samples.
Core relation: $\mathbb{E}_{x, \tilde{x}}\big[\lVert s_\theta(\tilde{x}) - \nabla_{\tilde{x}} \log q(\tilde{x} \mid x) \rVert^2\big]$, where $\tilde{x}$ is a noised copy of $x$.
7.3 Langevin sampling
Main idea. Move samples toward high-density regions plus noise.
Core relation: $x_{k+1} = x_k + \tfrac{\eta}{2}\, s_\theta(x_k) + \sqrt{\eta}\, \xi_k$ with $\xi_k \sim \mathcal{N}(0, I)$ and step size $\eta$.
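A minimal Langevin sampler, assuming a standard 2D Gaussian target so the true score $s(x) = -x$ is available in closed form; a learned $s_\theta$ would replace it in practice. The chain starts far from the mode and still recovers approximately the right mean and spread.

```python
import numpy as np

# Target: standard 2D Gaussian, so the exact score is s(x) = -x.
score = lambda x: -x

def langevin_sample(steps=1000, eta=0.01, rng=None):
    """x_{k+1} = x_k + (eta/2) * score(x_k) + sqrt(eta) * noise."""
    rng = rng or np.random.default_rng(0)
    x = rng.standard_normal(2) * 5.0           # start far from the mode
    for _ in range(steps):
        x = x + 0.5 * eta * score(x) + np.sqrt(eta) * rng.standard_normal(2)
    return x

samples = np.array([langevin_sample(rng=np.random.default_rng(i))
                    for i in range(500)])
print(samples.mean(axis=0), samples.std(axis=0))   # approx mean 0, std 1
```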
7.4 SDE view
Main idea. Continuous-time noising and denoising processes.
Core relation: $dx = f(x, t)\, dt + g(t)\, dw$, with generation running a corresponding reverse-time SDE.
7.5 Guidance
Main idea. Condition generation by modifying scores or logits.
Core relation: $\nabla_x \log p(x \mid c) = \nabla_x \log p(x) + \nabla_x \log p(c \mid x)$
AI connection. Guidance is one reason conditional diffusion models can trade diversity for prompt adherence.
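A sketch of classifier-free guidance applied to noise predictions, one common way the score-modification idea is implemented in diffusion samplers. The two predictions here are made-up vectors, not model outputs.

```python
import numpy as np

def guided_noise_prediction(eps_cond, eps_uncond, w):
    """Classifier-free guidance on the noise prediction:
    eps = eps_uncond + w * (eps_cond - eps_uncond).
    w = 1 recovers the conditional model; w > 1 trades diversity
    for stronger adherence to the condition."""
    return eps_uncond + w * (eps_cond - eps_uncond)

eps_c = np.array([0.5, -0.2])   # hypothetical conditional prediction
eps_u = np.array([0.1,  0.0])   # hypothetical unconditional prediction
for w in (0.0, 1.0, 3.0):
    print(w, guided_noise_prediction(eps_c, eps_u, w))
```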
8. Evaluation
This part studies evaluation: how to measure whether a generative model fits, covers, and usefully reproduces the data distribution.

| Subtopic | Question | Formula |
|---|---|---|
| Log likelihood | evaluate density when tractable | $\mathbb{E}_{x \sim p_{\text{data}}}[\log p_\theta(x)]$ |
| Sample quality | visual or task-specific quality of generated examples | |
| Diversity | model should cover data modes | |
| FID intuition | compare feature means and covariances | $\lVert \mu_r - \mu_g \rVert^2 + \mathrm{Tr}\big(\Sigma_r + \Sigma_g - 2 (\Sigma_r \Sigma_g)^{1/2}\big)$ |
| Precision recall | separate sample fidelity from mode coverage | |
8.1 Log likelihood
Main idea. Evaluate density when tractable.
Core relation: $\mathbb{E}_{x \sim p_{\text{data}}}[\log p_\theta(x)]$, estimated on held-out data.
8.2 Sample quality
Main idea. Visual or task-specific quality of generated examples.
Core relation: no closed form; quality is judged by humans or by task-specific proxies.
8.3 Diversity
Main idea. Model should cover data modes.
Core relation: samples should cover the modes of $p_{\text{data}}$, not only its most frequent regions.
8.4 FID intuition
Main idea. Compare feature means and covariances.
Core relation: $\mathrm{FID} = \lVert \mu_r - \mu_g \rVert^2 + \mathrm{Tr}\big(\Sigma_r + \Sigma_g - 2 (\Sigma_r \Sigma_g)^{1/2}\big)$, computed on learned features.
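A small FID computation following the formula above. It assumes SciPy for the matrix square root, and uses random vectors in place of the Inception-style network features used in practice, so the numbers only exercise the formula.

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(feats_real, feats_gen):
    """FID = ||mu_r - mu_g||^2 + Tr(S_r + S_g - 2 (S_r S_g)^{1/2}),
    computed on feature vectors (rows = samples)."""
    mu_r, mu_g = feats_real.mean(0), feats_gen.mean(0)
    s_r = np.cov(feats_real, rowvar=False)
    s_g = np.cov(feats_gen, rowvar=False)
    covmean = sqrtm(s_r @ s_g)
    if np.iscomplexobj(covmean):        # drop tiny numerical imaginary parts
        covmean = covmean.real
    return np.sum((mu_r - mu_g) ** 2) + np.trace(s_r + s_g - 2.0 * covmean)

rng = np.random.default_rng(0)
real = rng.standard_normal((2000, 8))
gen = rng.standard_normal((2000, 8)) * 1.3 + 0.5   # shifted and rescaled
print(fid(real, real[:1000]), fid(real, gen))       # near zero vs clearly larger
```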
8.5 Precision recall
Main idea. Separate sample fidelity from mode coverage.
Core relation: precision measures sample fidelity; recall measures mode coverage.
9. Applications and Tradeoffs
This part studies applications and tradeoffs: where each model family is used and what it costs.

| Subtopic | Question | Formula |
|---|---|---|
| Text | autoregressive LMs dominate discrete sequence generation | $p_\theta(x_t \mid x_{<t})$ |
| Images | diffusion and autoregressive models are common high-quality image generators | |
| Representation learning | VAEs learn latent spaces | $q_\phi(z \mid x)$ |
| Data augmentation | synthetic samples can help or hurt depending on quality | |
| Safety and misuse | generation systems need provenance, filtering, and evaluation | |
9.1 Text
Main idea. Autoregressive LMs dominate discrete sequence generation.
Core relation: $p_\theta(x_t \mid x_{<t})$ over token sequences.
9.2 Images
Main idea. Diffusion and autoregressive models are common high-quality image generators.
Core relation: iterative denoising of $x_t$ for diffusion; token-by-token factorization for autoregressive image models.
9.3 Representation learning
Main idea. VAEs learn latent spaces.
Core relation: $z \sim q_\phi(z \mid x)$ serves as a compact representation of $x$.
9.4 Data augmentation
Main idea. Synthetic samples can help or hurt depending on quality.
Core relation: train on real data plus $x \sim p_\theta$; the benefit depends on how closely $p_\theta$ matches $p_{\text{data}}$.
9.5 Safety and misuse
Main idea. Generation systems need provenance, filtering, and evaluation.
Core relation: none; this is a systems concern spanning provenance, filtering, and misuse evaluation.
10. Diagnostics
This part covers diagnostics: checks that reveal whether a generative model fits the data, covers its modes, and samples well.
| Subtopic | Question | Formula |
|---|---|---|
| Likelihood versus samples | good likelihood and good samples do not always align | $\log p_\theta(x)$ versus visual quality |
| Latent traversals | inspect smoothness and meaning of latent directions | $x(t) = g_\theta(z + t\,v)$ |
| Mode coverage | check diversity and rare classes | $\hat{p}_{\text{model}}(y)$ versus $\hat{p}_{\text{data}}(y)$ |
| Denoising curves | track loss by timestep | $L(t) = \mathbb{E}\,\lVert \epsilon - \epsilon_\theta(x_t, t) \rVert^2$ |
| Ablations | compare architecture, objective, guidance, sampling steps, and conditioning | one factor at a time |
10.1 Likelihood versus samples
Main idea. Good likelihood and good samples do not always align.
Core relation: $\log p_\theta(x)$ versus visual quality. A model can assign high likelihood yet produce poor samples, and vice versa, so report both.
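Likelihoods are usually reported in bits per dimension so that models at different resolutions are comparable. A minimal conversion, assuming a held-out NLL in nats (the figure below is illustrative):

```python
import numpy as np

nll_nats = 1450.0          # assumed held-out negative log likelihood, in nats
dims = 32 * 32 * 3         # e.g. one 32x32 RGB image

bpd = nll_nats / (dims * np.log(2.0))
print(f"bits per dim: {bpd:.3f}")   # lower is better; judge sample quality separately
```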
10.2 Latent traversals
Main idea. Inspect smoothness and meaning of latent directions by decoding points along a line in latent space.
Core relation: decode $x(t) = g_\theta(z + t\,v)$ for a latent direction $v$ and a sweep of $t$.
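A traversal just sweeps one latent coordinate and decodes. The sketch below assumes NumPy with a hypothetical frozen `decode` stand-in for a trained decoder; with a real model you would render the outputs and look for smooth, interpretable change.

```python
import numpy as np

rng = np.random.default_rng(0)
W_dec = rng.standard_normal((8, 4))   # frozen stand-in for a trained decoder

def decode(z):
    # Stand-in for g_theta(z), mapping an 8-dim latent to 4 output values.
    return np.tanh(z @ W_dec)

z0 = rng.standard_normal(8)           # base latent
v = np.eye(8)[0]                      # traverse the first latent coordinate
for t in np.linspace(-3, 3, 7):
    print(f"t={t:+.1f} -> {np.round(decode(z0 + t * v), 2)}")
```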
10.3 Mode coverage
Main idea. Check that generated samples cover every mode of the data, including rare classes.
Core relation: compare class frequencies $\hat{p}_{\text{model}}(y)$ against $\hat{p}_{\text{data}}(y)$.
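Class histograms make mode dropping visible. The sketch below assumes NumPy and synthetic labels; in practice the labels come from a classifier run over real and generated samples.

```python
import numpy as np

rng = np.random.default_rng(0)
# Assumed class labels from a classifier on real and generated sets;
# the model here under-covers class 9.
labels_data = rng.integers(0, 10, 5000)
labels_model = rng.choice(10, 5000, p=np.r_[np.full(9, 0.108), 0.028])

p_data = np.bincount(labels_data, minlength=10) / len(labels_data)
p_model = np.bincount(labels_model, minlength=10) / len(labels_model)

tv = 0.5 * np.abs(p_data - p_model).sum()   # total variation between histograms
print(np.round(p_model, 3))
print(f"TV distance: {tv:.3f}")             # a near-empty class signals mode dropping
```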
10.4 Denoising curves
Main idea. Track the denoising loss as a function of timestep; spikes localize the noise levels where the model struggles.
Core relation: $L(t) = \mathbb{E}\,\lVert \epsilon - \epsilon_\theta(x_t, t) \rVert^2$, evaluated separately at each $t$.
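A per-timestep loss curve takes only a few lines given the closed-form noising $x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon$. The sketch below assumes NumPy; `predict_eps` is a hypothetical stand-in for a trained noise predictor.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)   # closed-form noising coefficients

def predict_eps(x_t, t):
    # Stand-in noise predictor; a real model is a network eps_theta(x_t, t).
    return np.sqrt(1.0 - alpha_bar[t]) * x_t

x0 = rng.standard_normal((256, 8))
for t in [0, 25, 50, 75, 99]:
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    mse = np.mean((eps - predict_eps(x_t, t)) ** 2)
    print(f"t={t:3d}  denoising MSE: {mse:.3f}")   # spikes localize bad noise levels
```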
10.5 Ablations
Main idea. Compare architecture, objective, guidance, sampling steps, and conditioning.
Practice Exercises
- Compute autoregressive log likelihood.
- Compute a VAE Gaussian KL to a standard normal prior.
- Apply the reparameterization trick.
- Compute GAN discriminator and generator losses.
- Apply a 1D flow change-of-variables formula.
- Simulate one diffusion noising step.
- Compute a denoising MSE loss.
- Take one score-based Langevin update.
- Compute a simplified FID-style distance (see the sketch after this list).
- Write a generative-model debugging checklist.
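For the FID-style exercise, a simplified version fits diagonal Gaussians to two feature sets and applies $d^2 = \lVert\mu_1-\mu_2\rVert^2 + \mathrm{tr}\big(\Sigma_1+\Sigma_2-2(\Sigma_1\Sigma_2)^{1/2}\big)$. The sketch below assumes NumPy and random stand-in features; real FID uses Inception embeddings and full covariances.

```python
import numpy as np

rng = np.random.default_rng(0)
feats_real = rng.standard_normal((1000, 16))              # assumed feature embeddings
feats_fake = 1.2 * rng.standard_normal((1000, 16)) + 0.1  # shifted, wider "model" features

mu1, mu2 = feats_real.mean(0), feats_fake.mean(0)
var1, var2 = feats_real.var(0), feats_fake.var(0)   # diagonal-covariance assumption

# For diagonal covariances the matrix square root is elementwise.
fid = np.sum((mu1 - mu2) ** 2) + np.sum(var1 + var2 - 2.0 * np.sqrt(var1 * var2))
print(f"simplified FID: {fid:.3f}")   # 0 when the two Gaussian fits coincide
```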
Why This Matters for AI
Modern AI is largely generative: LLMs generate text, diffusion models generate images, VAEs and flows model latent structure, and synthetic data systems generate training examples. Understanding the objective behind each model prevents shallow comparisons.
Bridge to CNN and Convolution Math
Image generators often use convolutional or attention-based backbones. The next section studies convolution math, a key building block for many vision generators and discriminators.
References
- Diederik P. Kingma and Max Welling, "Auto-Encoding Variational Bayes", 2013: https://arxiv.org/abs/1312.6114
- Ian Goodfellow et al., "Generative Adversarial Nets", 2014: https://papers.nips.cc/paper/5423-generative-adversarial-nets
- Jonathan Ho, Ajay Jain, and Pieter Abbeel, "Denoising Diffusion Probabilistic Models", 2020: https://arxiv.org/abs/2006.11239
- Yang Song et al., "Score-Based Generative Modeling through Stochastic Differential Equations", 2020: https://arxiv.org/abs/2011.13456
- Diederik P. Kingma and Prafulla Dhariwal, "Glow: Generative Flow with Invertible 1x1 Convolutions", 2018: https://arxiv.org/abs/1807.03039