Convolutional Neural Networks
Shen Shen
March 22, 2022
Recap: fully-connected networks
- layered structure
- fully-connected neurons
- dot-product
- nonlinear activation
- back-propagation
- gradient descent
- code one up in Python
[image credit: 3b1b]
Convolutional neural networks
- Why do we need a special network for images?
- Why is CNN (the) special network for images?
Why do we need a specialized net for images?
[video credit: 3b1b]
Q: Why do we need a specialized network?
A 426-by-426 grayscale image already has ~181k pixels.
Use the same small network? We'd need to learn ~3M parameters.
Imagine even higher-resolution images, or more complex tasks...
A: Vanilla fully-connected nets don't scale well to (interesting) images.
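As a sanity check on the ~3M figure, here is a minimal parameter-count sketch in Python; the 16-unit first hidden layer is an illustrative assumption (in the spirit of the 3b1b network), not something fixed by the slide.

```python
# Parameter count for one fully-connected layer on a 426-by-426 grayscale image.
# The 16 hidden units below are an assumed, illustrative layer width.
height, width = 426, 426
n_inputs = height * width          # 181,476 pixels, flattened into one vector
n_hidden = 16                      # assumed first-layer width

weights = n_inputs * n_hidden      # one weight per (pixel, hidden unit) pair
biases = n_hidden
print(f"first layer alone: {weights + biases:,} parameters")  # ~2.9M
```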
Why do we think [image of a handwritten digit] is 9?
- Visual hierarchy
- Spatial locality
- Translational invariance
CNN cleverly exploits
- Visual hierarchy
- Spatial locality
- Translational invariance
to handle images efficiently via
- (deep) layers
- convolution
- pooling
Let's move on to the board to see what a 1D convolution looks like ...
Convolutional layers might sound foreign, but...
(and there are even deeper 'similarities', but more on that later)
input image: [0, 1, 0, 1, 1]
filter: [-1, 1]
output image: [1, -1, 1, 0]
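A minimal Python sketch of this 1D convolution (a sliding dot-product over the input; the helper name conv1d is just for illustration):

```python
import numpy as np

# CNN-style 1D 'convolution': slide the filter along the input and
# take a dot-product at each position (no filter flipping).
def conv1d(x, w):
    k = len(w)
    return np.array([np.dot(x[i:i + k], w) for i in range(len(x) - k + 1)])

x = np.array([0, 1, 0, 1, 1])   # input image
w = np.array([-1, 1])           # filter
print(conv1d(x, w))             # -> [ 1 -1  1  0], the output image above
```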
Hyper-parameters
- Zero-padding
- Receptive field ('window' size, e.g., a filter of a different size such as [-1, 1, -1, 1])
- Stride (e.g. of size 2)
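A sketch of how these hyper-parameters enter the same 1D convolution; the padding amount and stride below are just example values, not tied to the slide's figure.

```python
import numpy as np

def conv1d(x, w, padding=0, stride=1):
    x = np.pad(x, padding)        # zero-padding on both ends
    k = len(w)                    # receptive field ('window' size)
    return np.array([np.dot(x[i:i + k], w)
                     for i in range(0, len(x) - k + 1, stride)])

x = np.array([0, 1, 0, 1, 1])
w = np.array([-1, 1])
print(conv1d(x, w))              # [ 1 -1  1  0]  (no padding, stride 1)
print(conv1d(x, w, padding=1))   # length-6 output: [ 0  1 -1  1  0 -1]
print(conv1d(x, w, stride=2))    # length-2 output: [1 1]
```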
(figure: a longer example 1D input image, [0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1])
(same 1D example: input image [0, 1, 0, 1, 1], filter [-1, 1], output image [1, -1, 1, 0])
- 'look' locally
- parameter sharing
convolve [0, 1, 0, 1, 1] with [-1, 1]  =  [1, -1, 1, 0]
or, equivalently,
dot-product [0, 1, 0, 1, 1] with a larger weight matrix built from the filter (mostly zeros, with the same two weights repeated)  =  [1, -1, 1, 0]
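A minimal sketch of that equivalence: build a larger weight matrix whose rows are shifted copies of the filter (the construction below is the standard one, shown here as an assumed illustration), and check that the matrix-vector product reproduces the convolution.

```python
import numpy as np

x = np.array([0, 1, 0, 1, 1])    # input image
w = np.array([-1, 1])            # filter

# Each output entry 'looks' at a local window and reuses the same two
# weights, so the equivalent matrix is sparse with shared entries.
n, k = len(x), len(w)
W = np.zeros((n - k + 1, n))
for i in range(n - k + 1):
    W[i, i:i + k] = w

print(W)
# [[-1.  1.  0.  0.  0.]
#  [ 0. -1.  1.  0.  0.]
#  [ 0.  0. -1.  1.  0.]
#  [ 0.  0.  0. -1.  1.]]
print(W @ x)                     # -> [ 1. -1.  1.  0.], same as convolving
```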
2D Convolution
(figure: a small binary input image and a filter; convolving them gives the output image below)

output image:
0 | 1 | -1 | 0 | 0 |
1 | 1 | 0 | 1 | -1 |
1 | 2 | 1 | 0 | 0 |
0 | 2 | 1 | 1 | 0 |
0 | 1 | 1 | 0 | 0 |
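A minimal 2D sketch (no padding, stride 1); the image and filter values below are illustrative assumptions, not the exact ones from the slide's figure.

```python
import numpy as np

# CNN-style 2D 'convolution': slide the filter over the image and take
# an elementwise product-and-sum at each position.
def conv2d(image, filt):
    H, W = image.shape
    k1, k2 = filt.shape
    out = np.zeros((H - k1 + 1, W - k2 + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + k1, j:j + k2] * filt)
    return out

image = np.array([[0, 0, 0, 0],
                  [0, 1, 1, 0],
                  [0, 1, 1, 0],
                  [0, 0, 0, 0]])
filt = np.array([[1, -1],
                 [1, -1]])        # responds to vertical edges
print(conv2d(image, filt))
# [[-1.  0.  1.]
#  [-2.  0.  2.]
#  [-1.  0.  1.]]
```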
A tender intro to tensor:
(figure: an RGB image as a tensor with red, green, and blue channels: image height × image width × image channels)
[image credit: tensorflow]
(figure: input tensor, filters, output tensor)
[image credit: medium]
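A shape-only sketch of how a convolutional layer maps an input tensor through a bank of filters; all sizes below are illustrative assumptions.

```python
import numpy as np

H, W, C_in = 32, 32, 3            # image height, width, channels (e.g. RGB)
k, C_out = 5, 8                   # filter size and number of filters

input_tensor = np.zeros((H, W, C_in))
filters = np.zeros((k, k, C_in, C_out))   # each filter spans all input channels

H_out, W_out = H - k + 1, W - k + 1       # no padding, stride 1
output_tensor = np.zeros((H_out, W_out, C_out))
print(input_tensor.shape, filters.shape, output_tensor.shape)
# (32, 32, 3) (5, 5, 3, 8) (28, 28, 8)
```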
Parting thoughts
- On-going research
- Other network architectures
- Interpretation of features learned
- Adversarial attacks and robustness
voter | sex | age | income | education | ... | voted |
---|---|---|---|---|---|---|
1 | F | 30 | 70k | post-grad | ... | Bernie |
2 | M | 75 | 80k | college | ... | Biden |
...
[HW credit: Princeton ORF363]
don't abuse convolution... (a tabular dataset like this has no spatial locality or translation invariance for a filter to exploit)
Thank you.
Questions?