Long Short-Term Memory (LSTM)

Feed-Forward Neural Network

Layers: input layer, hidden layer, output layer

Recurrent Neural Network

Inputs, one per time step: Mon, Tues, Wed, Thurs, Fri

Outputs, one per time step: O1, O2, O3, O4, O5, followed by the final output

Final RNN hidden state: dominated by the most recent inputs, so the network has short-term memory!

Cause: the Vanishing Gradient Problem!
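The step-by-step picture above (Mon to Fri in, O1 to O5 out) can be sketched in a few lines. This is a minimal scalar illustration rather than the deck's own code: the weights `w_x`, `w_h`, the bias `b`, and the numeric encodings for the five days are all made-up assumptions.

```python
import math

def rnn_forward(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Run a scalar tanh RNN over a sequence and return every hidden state.

    w_x, w_h and b are hypothetical weights chosen for illustration only.
    """
    h = 0.0                      # initial hidden state
    states = []
    for x in inputs:
        # the new hidden state mixes the current input with the previous state
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states

# Hypothetical numeric encodings for Mon..Fri
week = [1.0, 0.5, -0.3, 0.8, 0.1]
outputs = rnn_forward(week)      # plays the role of O1..O5
final_state = outputs[-1]        # the final RNN hidden state
```

Only `final_state` is handed on to produce the final output, which is why information from early steps has to survive every intermediate update to matter.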

Forward Propagation → Error Estimation → Back Propagation → Repeat!
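The training cycle named above can be sketched on the smallest possible model. The example below is an illustrative assumption, not taken from the slides: a one-weight model `y = w * x`, mean squared error as the error estimate, and plain gradient descent with a hypothetical learning rate `lr`.

```python
def train(xs, ys, lr=0.1, epochs=50):
    """Fit y = w * x by gradient descent; return the learned w and the loss history."""
    w = 0.0
    losses = []
    for _ in range(epochs):                                   # Repeat!
        preds = [w * x for x in xs]                           # forward propagation
        errors = [p - y for p, y in zip(preds, ys)]
        loss = sum(e * e for e in errors) / len(xs)           # error estimation (MSE)
        # back propagation: derivative of the loss with respect to w
        grad = sum(2 * e * x for e, x in zip(errors, xs)) / len(xs)
        w -= lr * grad                                        # weight update
        losses.append(loss)
    return w, losses

# Toy data generated by y = 2 * x
w, losses = train([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

In a real network the back propagation step applies the chain rule through every layer (and, for an RNN, through every time step) instead of computing a single derivative.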

Back Propagation

The gradient (∇) shrinks with every step back through the sequence.

Vanishing Gradient Problem!
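To make the shrinking gradient concrete, here is a small sketch for a scalar tanh RNN. The weights `w_h`, `w_x` and the constant input `x` are hypothetical values picked for illustration. With a recurrent weight below 1 in magnitude, every step back multiplies the gradient by a factor smaller than 1.

```python
import math

def gradient_through_time(steps, w_h=0.8, w_x=0.5, x=0.5):
    """Return gradient magnitudes flowing back from the last hidden state.

    magnitudes[i] is |d h_T / d h_{T-1-i}|, the gradient after i + 1 steps back.
    """
    # forward pass, remembering each hidden state
    h, states = 0.0, []
    for _ in range(steps):
        h = math.tanh(w_x * x + w_h * h)
        states.append(h)
    # backward pass: the chain rule multiplies in one factor per time step
    grad, magnitudes = 1.0, []
    for h_t in reversed(states):
        grad *= w_h * (1.0 - h_t * h_t)   # d tanh(a)/da = 1 - tanh(a)^2
        magnitudes.append(abs(grad))
    return magnitudes

mags = gradient_through_time(20)
```

Here each factor is at most 0.8, so after 20 steps the gradient is below `0.8 ** 20` (about 0.012); the earliest inputs barely influence the weight update.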

LSTM Cell

1. forget gate: forget irrelevant information
2. input gate: add/update new information
3. update gate (usually called the output gate): pass updated information
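For reference, the three gates above correspond to the standard LSTM update equations. The weight matrices W, the biases b, and the candidate state written with a tilde are conventional notation assumed here, not symbols from the slides; [h_{t-1}, x_t] is vector concatenation and ⊙ is pointwise multiplication.

```latex
\[
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) \\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) \\
\tilde{c}_t &= \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
\]
```

The additive form of the c_t update is what lets information (and gradients) pass across many time steps without being squashed at each one.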

[Figure: LSTM cell diagram]

Legend:
forget gate, input gate, update gate
sigmoid, tanh
X: pointwise multiplication
+: pointwise addition
vector concatenation

Signals in the LSTM cell diagram, in the order the slides introduce them:

h_{t-1}, x_t, c_{t-1}: previous hidden state, current input, previous cell state
f_t: forget gate output
i_t: input gate output
c_t: new cell state
o_t: update gate output
h_t: new hidden state
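The walk through the cell diagram can be sketched as a single function. This is a minimal, dependency-free illustration using plain Python lists, under stated assumptions: the parameter names (`W_f`, `b_f`, and so on), the random initialisation in `init_params`, and the numeric inputs are all hypothetical and do not come from the slides.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def linear(W, b, v):
    """Return W @ v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + b_i for row, b_i in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM cell step; returns (h_t, c_t)."""
    z = h_prev + x_t   # vector concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in linear(params["W_f"], params["b_f"], z)]    # forget gate
    i_t = [sigmoid(a) for a in linear(params["W_i"], params["b_i"], z)]    # input gate
    g_t = [math.tanh(a) for a in linear(params["W_c"], params["b_c"], z)]  # candidate values
    # pointwise multiplication, then pointwise addition: the new cell state
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    # third gate (o_t), labelled "update gate" in the slides
    o_t = [sigmoid(a) for a in linear(params["W_o"], params["b_o"], z)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]                     # new hidden state
    return h_t, c_t

def init_params(n_in, n_hidden, seed=0):
    """Small random weights and zero biases; a hypothetical initialisation."""
    rng = random.Random(seed)
    params = {}
    for name in ("f", "i", "c", "o"):
        params["W_" + name] = [
            [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + n_in)]
            for _ in range(n_hidden)
        ]
        params["b_" + name] = [0.0] * n_hidden
    return params

params = init_params(n_in=1, n_hidden=3)
h, c = [0.0] * 3, [0.0] * 3
for x in [1.0, 0.5, -0.3, 0.8, 0.1]:   # hypothetical encodings for Mon..Fri
    h, c = lstm_step([x], h, c, params)
```

After the loop, `h` is the final hidden state and `c` the final cell state. A practical implementation would use a tensor library and learn the weights by back propagation; the lists here only keep the arithmetic visible.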

Long Short-Term Memory (LSTM)

By Farid Qamar
