\text{Diffusion Models}
\textbf{Naresh Kumar Devulapally}
\text{CSE 4/573: Computer Vision and Image Processing}
\text{Naresh Kumar Devulapally}
\text{CSE 4/573: CVIP, Summer 2025}
\text{July 22, 24, 29, 2025}
\text{Lecture 15, 16, 17: July 22, 24, 29, 2025}

CVIP 2.0

\( \text{Agenda of this Lecture:}\)

  • Recap of the Generative AI model architectures.
  • The bigger picture in all generative models.
  • The bottleneck in Variational AutoEncoders.
  • Diffusion Models (Recent Models).
  • Forward Diffusion Process
  • Reverse Diffusion Process
  • Training Architecture
  • Coding Example

\text{July 10, 2025}
\text{VAEs - Recap}
  • \( P(z \mid x) \): Posterior
  • \( P(x \mid z) \): Generative Model

\text{Conditional Variational AutoEncoders}
  • \( P(z \mid x) \): Posterior
  • \( P(x \mid z) \): Generative Model

\text{Generative Adversarial Models}
\text{VAEs to Diffusion Models}

Data reconstruction using VAEs

\text{Diffusion Models}
  • Forward Process
  • Reverse Process

\text{Diffusion Models - Part 2}
\text{Lecture 16: July 24, 2025}

Expectation

Jensen's Inequality
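
As a reminder of how this inequality is typically used here: for a concave function such as \( \log \), Jensen's inequality gives \( \log \mathbb{E}[X] \geq \mathbb{E}[\log X] \). The sketch below applies it to the data log-likelihood. The encoder distribution \( q_\phi(z \mid x) \) and the prior \( P(z) \) are notation assumed for this illustration.

```latex
\log P(x)
  = \log \mathbb{E}_{q_\phi(z \mid x)}\left[ \frac{P(x, z)}{q_\phi(z \mid x)} \right]
  \geq \mathbb{E}_{q_\phi(z \mid x)}\left[ \log \frac{P(x, z)}{q_\phi(z \mid x)} \right]
  = \mathbb{E}_{q_\phi(z \mid x)}\left[ \log P(x \mid z) \right]
    - D_{\mathrm{KL}}\left( q_\phi(z \mid x) \,\|\, P(z) \right)
```

The first term rewards reconstruction of \( x \) from \( z \), and the second keeps the posterior close to the prior.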

KL Divergence
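
The KL divergence is defined as \( D_{\mathrm{KL}}(q \,\|\, p) = \mathbb{E}_{q}[\log q(z) - \log p(z)] \). A minimal sketch of the closed form that arises when \( q \) is a diagonal Gaussian and \( p \) is a standard Gaussian, as is common for VAEs. The function name and the `log_var` parameterization are assumptions made for this illustration.

```python
import numpy as np

def kl_diag_gaussian_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over dimensions."""
    mu = np.asarray(mu, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    # Closed form: 0.5 * sum( sigma^2 + mu^2 - 1 - log sigma^2 )
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

# Identical distributions give zero divergence.
kl_same = kl_diag_gaussian_to_standard_normal([0.0, 0.0], [0.0, 0.0])

# Shifting one mean by 1 (unit variance) costs 0.5 nats.
kl_shifted = kl_diag_gaussian_to_standard_normal([1.0, 0.0], [0.0, 0.0])
```

The divergence is zero only when the two distributions match, and it is not symmetric in its arguments.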

VAE Loss

Notation

  • \( x_0 \) : Original data sample (image, audio, etc.)
  • \( x_t \): Noised version of \( x_0 \) at timestep \( t \)
  • \( x_T \): Final noise, ideally standard Gaussian
  • \( \beta_t \): Variance schedule, determines noise magnitude at step \( t \)
  • \( \alpha_t = 1 - \beta_t,\quad \bar{\alpha}_t = \prod_{s=1}^t \alpha_s \)
  • Noise is added as: \( x_t = \sqrt{\bar{\alpha}_t} \, x_0 + \sqrt{1 - \bar{\alpha}_t} \, \epsilon,\quad \epsilon \sim \mathcal{N}(0, I) \)
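
The notation above can be sketched in NumPy as follows. This is a minimal illustration, not the lecture's own code: the linear \( \beta_t \) schedule from 1e-4 to 0.02 over 1000 steps, the helper names `make_alpha_bar` and `q_sample`, and the 8x8 toy "image" are all assumptions.

```python
import numpy as np

def make_alpha_bar(betas):
    """Cumulative product of alpha_s = 1 - beta_s for s = 1..t."""
    return np.cumprod(1.0 - betas)

def q_sample(x0, t, alpha_bar, rng):
    """Draw x_t directly from x_0; t is 1-indexed as in the notation."""
    eps = rng.standard_normal(x0.shape)          # eps ~ N(0, I)
    a_bar = alpha_bar[t - 1]
    x_t = np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * eps
    return x_t, eps

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)            # assumed linear schedule
alpha_bar = make_alpha_bar(betas)

x0 = rng.uniform(-1.0, 1.0, size=(8, 8))         # toy data sample
x_early, _ = q_sample(x0, 1, alpha_bar, rng)     # almost identical to x0
x_late, _ = q_sample(x0, 1000, alpha_bar, rng)   # close to pure Gaussian noise
```

Because \( x_t \) depends only on \( x_0 \) and a single noise draw, any timestep can be sampled without simulating the intermediate steps.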

What does \( \beta_t \) do?
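
One way to read \( \beta_t \): it is the variance of the fresh noise injected at step \( t \), so the accumulated signal scale \( \sqrt{\bar{\alpha}_t} \) shrinks while the accumulated noise scale \( \sqrt{1 - \bar{\alpha}_t} \) grows. The sketch below prints both under an assumed linear schedule; the schedule endpoints and the number of steps are not taken from the slides.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)        # assumed linear variance schedule
alpha_bar = np.cumprod(1.0 - betas)

signal_scale = np.sqrt(alpha_bar)         # multiplies x_0 in x_t
noise_std = np.sqrt(1.0 - alpha_bar)      # multiplies epsilon in x_t

for t in (1, 250, 500, 1000):
    print(f"t={t:4d}  signal={signal_scale[t - 1]:.4f}  noise_std={noise_std[t - 1]:.4f}")
```

Larger \( \beta_t \) values destroy the signal in fewer steps; smaller values make each step closer to the identity, which is what makes the reverse step approximately Gaussian.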

Forward and Reverse Processes

Forward Process
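
A worked sketch of the forward process in the notation defined earlier (\( \alpha_t = 1 - \beta_t \), \( \bar{\alpha}_t = \prod_{s=1}^t \alpha_s \)). The Markov factorization below is the standard DDPM formulation and is assumed to match the slide.

```latex
q(x_t \mid x_{t-1}) = \mathcal{N}\left( x_t;\ \sqrt{\alpha_t}\, x_{t-1},\ \beta_t I \right),
\qquad
q(x_{1:T} \mid x_0) = \prod_{t=1}^{T} q(x_t \mid x_{t-1})

% Unrolling two steps, with independent \epsilon_{t-1}, \epsilon_{t-2} \sim \mathcal{N}(0, I):
x_t = \sqrt{\alpha_t}\, x_{t-1} + \sqrt{1 - \alpha_t}\, \epsilon_{t-1}
    = \sqrt{\alpha_t \alpha_{t-1}}\, x_{t-2} + \sqrt{1 - \alpha_t \alpha_{t-1}}\, \epsilon

% Repeating down to x_0 gives the closed form:
q(x_t \mid x_0) = \mathcal{N}\left( x_t;\ \sqrt{\bar{\alpha}_t}\, x_0,\ (1 - \bar{\alpha}_t) I \right)
```

The second line uses the fact that the sum of two independent zero-mean Gaussians is Gaussian with the variances added: \( \alpha_t (1 - \alpha_{t-1}) + (1 - \alpha_t) = 1 - \alpha_t \alpha_{t-1} \).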

Reverse Process

Diffusion Loss

Calvin Luo's Diffusion Tutorial
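
A minimal sketch of the simplified noise-matching objective that is commonly used to train diffusion models: sample a timestep, noise \( x_0 \), and penalize the squared error between the true noise and the predicted noise. The names `noise_matching_loss` and `zero_predictor` are hypothetical, the schedule is an assumed linear one, and the predictor is only a stand-in for a trained network.

```python
import numpy as np

def noise_matching_loss(eps_model, x0, alpha_bar, rng):
    """One Monte Carlo sample of E_{t, eps} || eps - eps_model(x_t, t) ||^2."""
    T = len(alpha_bar)
    t = int(rng.integers(1, T + 1))               # uniform timestep in {1, ..., T}
    eps = rng.standard_normal(x0.shape)           # true noise
    a_bar = alpha_bar[t - 1]
    x_t = np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * eps
    eps_hat = eps_model(x_t, t)                   # predicted noise
    return float(np.mean((eps - eps_hat) ** 2))

def zero_predictor(x_t, t):
    # Hypothetical stand-in for a trained network: always predicts zero noise.
    return np.zeros_like(x_t)

rng = np.random.default_rng(0)
alpha_bar = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 1000))  # assumed schedule
x0 = rng.uniform(-1.0, 1.0, size=(16, 16))
loss = noise_matching_loss(zero_predictor, x0, alpha_bar, rng)
```

With the zero predictor the loss is roughly the variance of the noise (about 1); a predictor that recovers \( \epsilon \) exactly would drive it to 0.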

Reverse Process Update
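
A hedged sketch of one reverse (denoising) update in the DDPM style: \( x_{t-1} = \frac{1}{\sqrt{\alpha_t}} \left( x_t - \frac{\beta_t}{\sqrt{1 - \bar{\alpha}_t}} \, \hat{\epsilon} \right) + \sigma_t z \). Here \( \hat{\epsilon} \) is the network's noise prediction, and \( \sigma_t^2 = \beta_t \) is one common choice that is assumed rather than taken from the slide. The function name `reverse_step` is hypothetical.

```python
import numpy as np

def reverse_step(x_t, t, eps_hat, betas, alpha_bar, rng):
    """One ancestral sampling step x_t -> x_{t-1}; t is 1-indexed."""
    beta_t = betas[t - 1]
    alpha_t = 1.0 - beta_t
    mean = (x_t - beta_t / np.sqrt(1.0 - alpha_bar[t - 1]) * eps_hat) / np.sqrt(alpha_t)
    if t == 1:
        return mean                         # no noise is added on the final step
    z = rng.standard_normal(x_t.shape)
    return mean + np.sqrt(beta_t) * z       # assumes sigma_t^2 = beta_t

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)       # assumed linear schedule
alpha_bar = np.cumprod(1.0 - betas)

# Sanity check: noise x_0 by one step, then undo it with the true noise.
x0 = rng.uniform(-1.0, 1.0, size=(4, 4))
eps = rng.standard_normal(x0.shape)
x1 = np.sqrt(alpha_bar[0]) * x0 + np.sqrt(1.0 - alpha_bar[0]) * eps
x0_recovered = reverse_step(x1, 1, eps, betas, alpha_bar, rng)
```

Sampling starts from \( x_T \sim \mathcal{N}(0, I) \) and applies this update for \( t = T, \dots, 1 \), with the noise prediction coming from the trained model at each step.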

Summary

References

\text{Diffusion Models - Part 3}
\text{Lecture 17: July 29, 2025}

\( \text{Agenda of this Lecture:}\)

  • Recap of the VAE Architecture
  • Recap of the Pixel Level Diffusion Model
  • Conditional Diffusion Model
  • Classifier Guidance vs. Classifier-Free Guidance
  • Why Latent Diffusion Models?
  • Latent Diffusion Models (LDMs) explained
  • Cross Attention in LDMs
  • Diffusion Models for various Computer Vision tasks
  • Tips to complete capstone project milestone 2
  • Information about the Guest Talk on July 31, 2025

\text{VAEs vs. Diffusion Models}

Gaussian Variable

\( \mathcal{L}_{\text{VAE}}  = \text{Reconstruction} + \text{Prior Matching} \)

\( \mathcal{L}_{\text{Diff}}  = \text{Reconstruction} + \text{Prior Matching} + \text{Noise Matching} \)

\( \mathcal{L}_{\text{Diffusion-Training}}  = \text{Noise Matching} \)

Unconditional Image Generation

\text{Conditional Image Generation in Diffusion Models}

Classifier Guidance
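
A minimal sketch contrasting the two guidance styles named in the agenda, written in terms of the predicted noise. Classifier guidance shifts the unconditional prediction using the gradient of a separate classifier's log-probability; classifier-free guidance extrapolates from the unconditional toward the conditional prediction of a single model. The function names, the `scale` parameter, and the toy vectors are assumptions made for illustration.

```python
import numpy as np

def classifier_guided_eps(eps_uncond, grad_log_p_y, a_bar_t, scale):
    """Shift the noise prediction along the classifier gradient grad_x log p(y | x_t)."""
    return eps_uncond - scale * np.sqrt(1.0 - a_bar_t) * grad_log_p_y

def classifier_free_eps(eps_uncond, eps_cond, scale):
    """Extrapolate from the unconditional toward the conditional prediction."""
    return eps_uncond + scale * (eps_cond - eps_uncond)

# Toy noise predictions for a 2-dimensional x_t.
eps_uncond = np.array([0.2, -0.1])
eps_cond = np.array([0.4, 0.1])

guided = classifier_free_eps(eps_uncond, eps_cond, scale=3.0)
```

In this parameterization, `scale=1.0` in `classifier_free_eps` returns the plain conditional prediction, and larger values trade sample diversity for stronger adherence to the condition.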

\text{What is the Diffusion Model?}

UNet

\text{Latent Diffusion Models}

Cross Attention Maps
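
A minimal NumPy sketch of how a cross-attention map can arise: queries come from the flattened spatial features of the latent, keys and values come from the text token embeddings, and each row of the softmax weights says how much a spatial location attends to each token. All shapes, the random projection matrices, and the function names are assumptions made for illustration; a real model learns the projections.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)    # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(latent_feats, text_emb, W_q, W_k, W_v):
    Q = latent_feats @ W_q                     # (num_pixels, d)
    K = text_emb @ W_k                         # (num_tokens, d)
    V = text_emb @ W_v                         # (num_tokens, d)
    attn = softmax(Q @ K.T / np.sqrt(Q.shape[-1]), axis=-1)  # (num_pixels, num_tokens)
    return attn @ V, attn

rng = np.random.default_rng(0)
h, w, c, n_tok, d_text, d = 8, 8, 32, 5, 16, 24
latent_feats = rng.standard_normal((h * w, c))       # flattened spatial features
text_emb = rng.standard_normal((n_tok, d_text))      # one embedding per text token
W_q = rng.standard_normal((c, d)) / np.sqrt(c)
W_k = rng.standard_normal((d_text, d)) / np.sqrt(d_text)
W_v = rng.standard_normal((d_text, d)) / np.sqrt(d_text)

out, attn = cross_attention(latent_feats, text_emb, W_q, W_k, W_v)
token_map = attn[:, 2].reshape(h, w)                 # spatial map for token index 2
```

Reshaping one column of `attn` back to the spatial grid gives a per-token map; such maps show which image regions respond to which words.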

Cross Attention Maps for Editing
