PERCEPTRONS BY HAND

Andrew Beam, PhD

Department of Epidemiology

HSPH

twitter: @AndrewLBeam

PERCEPTRONS

Let's say we'd like to have a single neuron learn a function

[Diagram: a single neuron with inputs X_1 and X_2, weighted by w_1 and w_2]

Observations:

X1	X2	y
0	0	0
0	1	1
1	0	1
1	1	1

PERCEPTRONS

How do we make a prediction for each observation?

Observations:

X1	X2	y
0	0	0
0	1	1
1	0	1
1	1	1

Assume we have the following values:

w1	w2	b
1	-1	-0.5

Predictions

For the first observation:

Assume we have the following values

w1	w2	b
1	-1	-0.5

X_1 = 0, X_2 = 0, y = 0

First compute the weighted sum:

h = w_1*X_1 + w_2*X_2 + b

h = 1*0 + -1*0 + -0.5

h = -0.5

Transform to probability:

p = \frac{1}{1+\exp(-h)}

p = \frac{1}{1+\exp(-(-0.5))} = \frac{1}{1+\exp(0.5)}

p = 0.38

Round to get prediction:

\hat{y} = round(p)

\hat{y} = 0
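The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the function name `predict` is made up here, and rounding is written as a threshold at 0.5 (an assumption, since Python's built-in `round` sends exactly 0.5 to 0).

```python
import math

def predict(x1, x2, w1, w2, b):
    """Return (h, p, y_hat) for a single observation (hypothetical helper)."""
    h = w1 * x1 + w2 * x2 + b           # weighted sum
    p = 1.0 / (1.0 + math.exp(-h))      # transform to probability (sigmoid)
    y_hat = 1 if p >= 0.5 else 0        # "round" p; threshold at 0.5 is an assumption
    return h, p, y_hat

# First observation: X_1 = 0, X_2 = 0, with w1 = 1, w2 = -1, b = -0.5
h, p, y_hat = predict(0, 0, w1=1.0, w2=-1.0, b=-0.5)
print(h, round(p, 2), y_hat)  # -0.5 0.38 0
```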

Predictions

Putting it all together:

h = w_1*X_1 + w_2*X_2 + b

p = \frac{1}{1+\exp(-h)}

\hat{y} = round(p)

Assume we have the following values

w1	w2	b
1	-1	-0.5

Fill out this table:

X1	X2	y	h	p	\hat{y}
0	0	0	-0.5	0.38	0
0	1	1
1	0	1
1	1	1

Predictions

The completed table, using the same formulas and the same values of w1, w2, and b:

X1	X2	y	h	p	\hat{y}
0	0	0	-0.5	0.38	0
0	1	1	-1.5	0.18	0
1	0	1	0.5	0.62	1
1	1	1	-0.5	0.38	0
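To check the table by machine, the same computation can be looped over all four observations. A small sketch under the same assumptions as the slides (w1 = 1, w2 = -1, b = -0.5); the variable names and the `accuracy` summary are illustrative additions, not part of the deck.

```python
import math

def sigmoid(h):
    return 1.0 / (1.0 + math.exp(-h))

observations = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 1)]  # (X1, X2, y)
w1, w2, b = 1.0, -1.0, -0.5

rows = []
for x1, x2, y in observations:
    h = w1 * x1 + w2 * x2 + b
    p = sigmoid(h)
    y_hat = 1 if p >= 0.5 else 0   # threshold at 0.5 (assumed rounding rule)
    rows.append((x1, x2, y, h, round(p, 2), y_hat))

print("X1\tX2\ty\th\tp\ty_hat")
for row in rows:
    print("\t".join(str(v) for v in row))

# Fraction of observations where the prediction matches y
accuracy = sum(r[2] == r[5] for r in rows) / len(rows)
print("accuracy:", accuracy)
```

Only two of the four predictions match y, which is what motivates the next slides.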

Room for Improvement

Our neural net isn't so great... how do we make it better?

What do I even mean by better?

Room for Improvement

Let's define how we want to measure the network's performance.

There are many ways, but let's use squared-error:

(y - p)^2

Now we need to find values for w_1, w_2, b that make this error as small as possible.
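As a rough illustration of "measuring performance", the squared error can be added up over the four observations for any choice of w_1, w_2, b. Summing over observations is an assumption made here (the slide shows the error for a single observation), and `total_squared_error` is a hypothetical helper name.

```python
import math

def sigmoid(h):
    return 1.0 / (1.0 + math.exp(-h))

def total_squared_error(w1, w2, b, data):
    """Sum of (y - p)^2 over all observations (summing is an assumption)."""
    total = 0.0
    for x1, x2, y in data:
        p = sigmoid(w1 * x1 + w2 * x2 + b)
        total += (y - p) ** 2
    return total

data = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 1)]  # (X1, X2, y)

loss = total_squared_error(1.0, -1.0, -0.5, data)         # values from the slides
loss_flipped = total_squared_error(1.0, 1.0, -0.5, data)  # same, but w2 = +1
print(loss, loss_flipped)
```

Flipping the sign of w_2 gives a smaller error, so "better" here simply means a lower number from this function.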

ALL OF ML IN ONE SLIDE

Our task is learning values for w_1, w_2, b such that the difference between the predicted and actual values is as small as possible.

Learning from Data

So, how do we find the "best" values for w_1, w_2, b?

Learning from Data

Recall (without PTSD) that the derivative of a function tells you how it is changing at any given location.

If the derivative is positive, it means it's going up.

If the derivative is negative, it means it's going down.
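One way to see this without calculus is to nudge a parameter slightly and watch the error. The sketch below estimates the derivative of the summed squared error with respect to w_2 by a central finite difference; the helper names, the step size `eps`, and the choice to hold w_1 = 1 and b = -0.5 fixed are all assumptions for illustration.

```python
import math

def sigmoid(h):
    return 1.0 / (1.0 + math.exp(-h))

DATA = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 1)]  # (X1, X2, y)

def loss_given_w2(w2, w1=1.0, b=-0.5):
    """Summed squared error as a function of w_2 only."""
    return sum((y - sigmoid(w1 * x1 + w2 * x2 + b)) ** 2 for x1, x2, y in DATA)

def numerical_derivative(f, x, eps=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

slope = numerical_derivative(loss_given_w2, -1.0)
print(slope)  # negative: the error goes down as w_2 increases from -1
```

A negative slope at w_2 = -1 says the error is "going down" in the direction of larger w_2, so increasing w_2 should help.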

Learning from Data

Simple strategy:

- Start with initial values for w_1, w_2, b

- Take partial derivatives of the loss function with respect to w_1, w_2, b

- Subtract the derivative (also called the gradient) from each of w_1, w_2, b

To the whiteboard!
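The whiteboard derivation is not part of the deck, but a plausible version follows from the chain rule applied to the definitions already given (squared error, sigmoid, weighted sum). The symbol L for the loss is introduced here for convenience.

```latex
L = (y - p)^2, \qquad
p = \frac{1}{1+\exp(-h)}, \qquad
h = w_1 X_1 + w_2 X_2 + b

\frac{\partial L}{\partial w_1}
  = \frac{\partial L}{\partial p}\cdot
    \frac{\partial p}{\partial h}\cdot
    \frac{\partial h}{\partial w_1}
  = -2\,(y - p)\cdot p\,(1-p)\cdot X_1
  = 2\,(p - y)\,p\,(1-p)\,X_1

\frac{\partial L}{\partial w_2} = 2\,(p - y)\,p\,(1-p)\,X_2, \qquad
\frac{\partial L}{\partial b} = 2\,(p - y)\,p\,(1-p)
```

The constant factor 2 only rescales the step, which is presumably why the learning rules on the next slide omit it (equivalently, they are the exact gradients of \frac{1}{2}(y - p)^2).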

THE BACKPROPAGATION ALGORITHM

Learning Rules for each Parameter

Gradients for w_1, w_2, and b:

gw_1 = (p - y)*(p*(1-p)*X_1)

gw_2 = (p - y)*(p*(1-p)*X_2)

g_b = (p - y)*(p*(1-p))

Updates for w_1, w_2, and b (each sum runs over the observations):

w^{new}_1 = w^{old}_1 - \sum gw_1

w^{new}_2 = w^{old}_2 - \sum gw_2

b^{new} = b^{old} - \sum g_b
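The learning rules above can be tried out directly. The following is a minimal sketch, not the course's reference code: `gradient_step` is a hypothetical helper, the `lr` multiplier is an added assumption (set to 1.0 so the update matches the slide exactly), and the number of iterations is arbitrary.

```python
import math

def sigmoid(h):
    return 1.0 / (1.0 + math.exp(-h))

def gradient_step(w1, w2, b, data, lr=1.0):
    """Apply one update of the learning rules, summing over all observations."""
    gw1 = gw2 = gb = 0.0
    for x1, x2, y in data:
        p = sigmoid(w1 * x1 + w2 * x2 + b)
        delta = (p - y) * p * (1 - p)   # shared factor in all three gradients
        gw1 += delta * x1
        gw2 += delta * x2
        gb += delta
    # lr is an assumed step-size knob; lr = 1.0 reproduces the slide's update
    return w1 - lr * gw1, w2 - lr * gw2, b - lr * gb

data = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 1)]  # (X1, X2, y)

# One step from the values used earlier in the deck
first = gradient_step(1.0, -1.0, -0.5, data)
print("after one step:", first)

# Keep repeating the same update
w1, w2, b = 1.0, -1.0, -0.5
for _ in range(5000):
    w1, w2, b = gradient_step(w1, w2, b, data)

predictions = [1 if sigmoid(w1 * x1 + w2 * x2 + b) >= 0.5 else 0
               for x1, x2, _ in data]
print("learned:", w1, w2, b)
print("predictions:", predictions)
```

After the first step, w_2 and b both move up from their starting values, which is the direction the wrong predictions in the table call for.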

BMI 707 / EPI 290: Perceptrons by hand (2022)

By beamandrew
