\text{Diffusion Models}

\textbf{Naresh Kumar Devulapally}

\text{CSE 4/573: Computer Vision and Image Processing}

\text{Naresh Kumar Devulapally}

\text{CSE 4/573: CVIP, Summer 2025}

\text{July 22, 24, 29, 2025}

\text{Lecture 15, 16, 17: July 22, 24, 29, 2025}

CVIP 2.0

\( \text{Agenda of this Lecture:}\)

Recap of the Generative AI model architectures.
The bigger picture in all generative models.
The bottleneck in Variational AutoEncoders.
Diffusion Models (Recent Models).
Forward Diffusion Process
Reverse Diffusion Process
Training Architecture
Coding Example

\text{July 10, 2025}

\text{VAEs - Recap}

\( P(z \mid x) \): Posterior

\( P(x \mid z) \): Generative Model

\text{Conditional Variational AutoEncoders}

\( P(z \mid x) \): Posterior

\( P(x \mid z) \): Generative Model

\text{Generative Adversarial Models}

\text{VAEs to Diffusion Models}

Data reconstruction using VAEs

\text{Diffusion Models}

Forward Process
Reverse Process

\text{Diffusion Models - Part 2}

\text{Lecture 16: July 24, 2025}

Expectation

Jensen's Inequality

KL Divergence

VAE Loss

Notation

\( x_0 \): Original data sample (image, audio, etc.)
\( x_t \): Noised version of \( x_0 \) at timestep \( t \)
\( x_T \): Final noise, ideally standard Gaussian
\( \beta_t \): Variance schedule, determines noise magnitude at step \( t \)
\( \alpha_t = 1 - \beta_t,\quad \bar{\alpha}_t = \prod_{s=1}^t \alpha_s \)
Noise is added as:

\[ x_t = \sqrt{\bar{\alpha}_t} \, x_0 + \sqrt{1 - \bar{\alpha}_t} \, \epsilon,\quad \epsilon \sim \mathcal{N}(0, I) \]
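
The notation above can be turned into a few lines of code. The following is a minimal sketch in plain Python, not the lecture's own coding example. It assumes a linear variance schedule with illustrative endpoints (`beta_start`, `beta_end`), and the helper names `linear_beta_schedule`, `cumulative_alpha_bars`, and `q_sample` are hypothetical.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Assumed linear schedule beta_1..beta_T; the endpoint values are illustrative."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alpha_bars(betas):
    """Return alpha_bar_t = prod_{s=1}^t (1 - beta_s) for t = 1..T."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta  # alpha_s = 1 - beta_s
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t directly from x_0 with the closed form; t is 1-indexed."""
    a_bar = alpha_bars[t - 1]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]  # epsilon ~ N(0, I)
    x_t = [
        math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * e
        for x, e in zip(x0, eps)
    ]
    return x_t, eps


rng = random.Random(0)
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alpha_bars(betas)

x0 = [0.5, -1.0, 0.25, 0.0]  # a toy 4-value "image"
x_t, eps = q_sample(x0, t=500, alpha_bars=alpha_bars, rng=rng)
print("alpha_bar at t=500:", alpha_bars[499])
print("x_t:", x_t)
```

Because \( \bar{\alpha}_t \) shrinks toward zero as \( t \) grows, the \( x_0 \) term fades and \( x_T \) is close to pure Gaussian noise under this assumed schedule.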

What does \( \beta_t \) do?
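
One way to answer this question numerically is to track how much of \( x_0 \) survives after \( t \) steps. The sketch below uses only the definitions \( \alpha_t = 1 - \beta_t \) and \( \bar{\alpha}_t = \prod_{s=1}^t \alpha_s \). The two constant schedules and the helper name `signal_and_noise_scale` are assumptions chosen for illustration, not values from the lecture.

```python
import math


def signal_and_noise_scale(betas, t):
    """Return (sqrt(alpha_bar_t), sqrt(1 - alpha_bar_t)) for a 1-indexed step t."""
    a_bar = math.prod(1.0 - b for b in betas[:t])
    return math.sqrt(a_bar), math.sqrt(1.0 - a_bar)


# Two hypothetical constant schedules: gentle noise vs. aggressive noise.
small_betas = [0.001] * 100
large_betas = [0.05] * 100

for name, betas in (("beta=0.001", small_betas), ("beta=0.05", large_betas)):
    for t in (1, 10, 100):
        signal, noise = signal_and_noise_scale(betas, t)
        print(f"{name} t={t:3d} signal={signal:.4f} noise={noise:.4f}")

sig_small, noise_small = signal_and_noise_scale(small_betas, 100)
sig_large, noise_large = signal_and_noise_scale(large_betas, 100)
```

A larger \( \beta_t \) shrinks the coefficient on \( x_0 \) faster, so the sample reaches near-pure noise in fewer steps. The squared coefficients always sum to one, which keeps the overall scale of \( x_t \) controlled.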

Forward and Reverse Processes

Forward Process

Reverse Process

Diffusion Loss

Calvin Luo's Diffusion Tutorial

Reverse Process Update
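
The slide content for this step did not survive extraction, so the following is only a sketch under a stated assumption: that the update is the standard DDPM ancestral sampling step, \( x_{t-1} = \frac{1}{\sqrt{\alpha_t}} \left( x_t - \frac{\beta_t}{\sqrt{1 - \bar{\alpha}_t}} \, \epsilon_\theta(x_t, t) \right) + \sigma_t z \) with \( \sigma_t^2 = \beta_t \) and no noise added at \( t = 1 \). The names `reverse_step` and `predict_noise` are hypothetical, and an "oracle" that returns the true noise stands in for the trained network.

```python
import math
import random


def reverse_step(x_t, t, betas, alpha_bars, predict_noise, rng):
    """One assumed DDPM-style update x_t -> x_{t-1}; t is 1-indexed."""
    beta_t = betas[t - 1]
    alpha_t = 1.0 - beta_t
    a_bar_t = alpha_bars[t - 1]
    eps_hat = predict_noise(x_t, t)  # stand-in for the network epsilon_theta
    mean = [
        (x - beta_t / math.sqrt(1.0 - a_bar_t) * e) / math.sqrt(alpha_t)
        for x, e in zip(x_t, eps_hat)
    ]
    if t == 1:
        return mean  # final step: no extra noise is added
    sigma_t = math.sqrt(beta_t)  # assumed choice sigma_t^2 = beta_t
    return [m + sigma_t * rng.gauss(0.0, 1.0) for m in mean]


# Toy two-step schedule (illustrative values only).
betas = [0.1, 0.2]
alpha_bars = [0.9, 0.9 * 0.8]

x0 = [1.0, -2.0]
true_eps = [0.3, -0.7]
# Forward-noise x0 to t = 1 with the closed form from the Notation slide.
x1 = [
    math.sqrt(alpha_bars[0]) * x + math.sqrt(1.0 - alpha_bars[0]) * e
    for x, e in zip(x0, true_eps)
]


def oracle(x_t, t):
    """Pretend-perfect noise predictor for this toy example."""
    return true_eps


x0_hat = reverse_step(x1, 1, betas, alpha_bars, oracle, random.Random(0))
print("recovered x0:", x0_hat)
```

With a perfect noise prediction the final step recovers \( x_0 \) up to floating-point error. A trained model only approximates this, which is why sampling runs through many small steps.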

Summary

References

\text{Diffusion Models - Part 3}

\text{Lecture 17: July 29, 2025}

\( \text{Agenda of this Lecture:}\)

Recap of the VAE Architecture
Recap of the Pixel Level Diffusion Model
Conditional Diffusion Model
Classifier Guidance vs. Classifier-Free Guidance
Why Latent Diffusion Models?
Latent Diffusion Models (LDMs) explained
Cross Attention in LDMs.
Diffusion Models for various Computer Vision tasks.
Tips to complete capstone project milestone 2.
Information about Guest Talk on July 31, 2025.

\text{VAEs vs. Diffusion Models}

Gaussian Variable

\( \mathcal{L}_{\text{VAE}} = \text{Reconstruction} + \text{Prior Matching} \)

\( \mathcal{L}_{\text{Diff}} = \text{Reconstruction} + \text{Prior Matching} + \text{Noise Matching} \)

\( \mathcal{L}_{\text{Diffusion-Training}} = \text{Noise Matching} \)
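
A training step built on this objective can be sketched as follows. The sketch assumes that "Noise Matching" refers to the simplified noise-prediction loss, \( \mathbb{E}\left[ \lVert \epsilon - \epsilon_\theta(x_t, t) \rVert^2 \right] \), with \( x_t \) produced by the closed-form noising equation from the Notation slide. The function `noise_matching_loss` and the two toy predictors are hypothetical stand-ins for a real network.

```python
import math
import random


def noise_matching_loss(predict_noise, x0, a_bar_t, t, rng):
    """Mean squared error between the sampled noise and the predicted noise."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]  # target noise
    x_t = [
        math.sqrt(a_bar_t) * x + math.sqrt(1.0 - a_bar_t) * e
        for x, e in zip(x0, eps)
    ]
    eps_hat = predict_noise(x_t, t)
    return sum((e - h) ** 2 for e, h in zip(eps, eps_hat)) / len(x0)


x0 = [0.5, -1.0, 0.25, 0.0]
a_bar_t = 0.5  # illustrative value of alpha_bar_t at some step t
t = 10


def zero_predictor(x_t, t):
    """Untrained baseline: always predicts zero noise."""
    return [0.0 for _ in x_t]


def oracle_predictor(x_t, t):
    """Cheats by using x0 to solve the noising equation for epsilon."""
    return [
        (xt - math.sqrt(a_bar_t) * x) / math.sqrt(1.0 - a_bar_t)
        for xt, x in zip(x_t, x0)
    ]


baseline_loss = noise_matching_loss(zero_predictor, x0, a_bar_t, t, random.Random(0))
oracle_loss = noise_matching_loss(oracle_predictor, x0, a_bar_t, t, random.Random(0))
print("zero predictor loss:", baseline_loss)
print("oracle predictor loss:", oracle_loss)
```

In practice `predict_noise` would be a network such as a UNet, with \( t \) and \( x_0 \) sampled afresh at every step. The oracle only illustrates that the loss reaches zero when the noise is predicted exactly.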

\text{Diffusion Models}

Unconditional Image Generation

\text{Conditional Image Generation in Diffusion Models}

Classifier Guidance

\text{What is the Diffusion Model?}

UNet

\text{Latent Diffusion Models}

Cross Attention Maps

Cross Attention Maps for Editing
