Long Short-Term Memory

Kamal Nandan | Laurens ten Cate | Gianluca Sabbatucci | Richard Rohla | Veronique Wang | Alex Kyalo | Asier Sarrasua

Why not just a feed-forward network?

In a feed-forward network, the output at t has no relation to the output at t-1.

Sequence issues w/ Feed-forward NN

  • Classification of movie-scenes.

  • Understanding the context of a book paragraph.

  • Prediction of the next word in a sentence.

  • Time-series forecasting.

RNNs come to the rescue

RNNs

Short-term dependency:

E.g. The color of the sky is ___.
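
The recurrence that lets an RNN carry context from one step to the next can be sketched as follows. This is a minimal NumPy sketch of a vanilla RNN step, not the presenters' code; the names `W_x`, `W_h`, `b` and all sizes are assumptions chosen for illustration.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One vanilla RNN step: the new hidden state depends on the
    current input and on the previous hidden state."""
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

def rnn_forward(xs, W_x, W_h, b):
    """Run the recurrence over a whole sequence and return every hidden state."""
    h = np.zeros(W_h.shape[0])
    states = []
    for x_t in xs:
        h = rnn_step(x_t, h, W_x, W_h, b)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
input_dim, hidden_dim, seq_len = 3, 4, 5   # illustrative sizes (assumed)
W_x = rng.normal(scale=0.5, size=(hidden_dim, input_dim))
W_h = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
b = np.zeros(hidden_dim)
xs = rng.normal(size=(seq_len, input_dim))

states = rnn_forward(xs, W_x, W_h, b)
print(states.shape)  # one hidden state per time step
```

Unlike a feed-forward network, the state at step t here is a function of the state at step t-1, which is what makes short-term context available.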

Limitations of RNNs

Long-term dependency:
E.g. I am from India. I studied in Spain for 5 long years and then I moved to France for work.
.... I can speak fluent _____.

Back-propagation through time 

Vanishing Gradient issue

also: Exploding Gradient issue
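
A toy illustration of why gradients vanish or explode during back-propagation through time. The sketch assumes a scalar linear recurrence h_t = w * h_{t-1}, which is a deliberate simplification (a real RNN multiplies by Jacobian matrices and activation derivatives), but the chain rule produces the same repeated-product effect.

```python
def gradient_through_time(w, steps):
    """Gradient of h_T with respect to h_0 for h_t = w * h_{t-1}."""
    grad = 1.0
    for _ in range(steps):
        grad *= w   # chain rule: one factor of w per time step
    return grad

for w in (0.9, 1.1):
    print(w, gradient_through_time(w, 100))
# w = 0.9 gives a gradient of roughly 2.7e-5 (vanishing)
# w = 1.1 gives a gradient of roughly 1.4e4  (exploding)
```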

Solutions

Vanishing Gradient

- LSTM

- GRU

Exploding Gradient

- Gradient Clipping
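
Gradient clipping can be sketched as rescaling the gradients whenever their combined norm exceeds a threshold. The version below clips by global norm, which is one common variant; the function name `clip_by_global_norm` and the threshold value are assumptions for illustration.

```python
import numpy as np

def clip_by_global_norm(grads, max_norm):
    """Rescale all gradients together so their joint L2 norm is at most max_norm."""
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    scale = min(1.0, max_norm / (total_norm + 1e-12))  # never scale up
    return [g * scale for g in grads], total_norm

grads = [np.array([3.0, 4.0]), np.array([12.0])]   # joint norm is 13
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
norm_after = np.sqrt(sum(np.sum(g ** 2) for g in clipped))
print(norm_before, norm_after)
```

Because every gradient is multiplied by the same factor, the update direction is preserved and only its size is limited.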

Long Short-Term Memory

Hochreiter & Schmidhuber (1997)

The repeating module of a standard RNN has a single layer (e.g. one tanh layer)

The repeating module of an LSTM has four interacting layers

Cell State

The cell state allows the LSTM to remember information; what is kept, added, or removed is regulated through gates

Gates are a way to optionally let information through. Each gate is a sigmoid layer followed by an element-wise product.
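
A minimal sketch of the gating mechanism: a sigmoid squashes each value into the range 0 to 1, and the element-wise product then scales how much of each piece of information passes. The input values and pre-activations below are made up for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

information = np.array([2.0, -1.0, 0.5])
gate_logits = np.array([10.0, -10.0, 0.0])   # hypothetical pre-activations

gate = sigmoid(gate_logits)      # close to 1 (open), close to 0 (closed), 0.5 (half open)
passed = gate * information      # element-wise product
print(passed)
```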

Gate 1: Forget Gate

Forget Gate regulates what the network "forgets" from the cell state

Gate 2: Input Gate

- Input Gate decides what values to update (sigmoid layer)

- tanh layer creates new "candidate values" to be added to the cell-state (scaled by input gate)

Cell state interaction

The forget gate interacts with the cell state (ft * Ct-1)

The input gate interacts with the cell state by adding the scaled candidate values (it * C̃t)

The new cell state is the sum: Ct = ft * Ct-1 + it * C̃t
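
A small worked example of the cell-state update in the standard LSTM formulation, where the new cell state is the forget-gated old state plus the input-gated candidate values. The gate values are set by hand to the extremes 0 and 1 purely for illustration; in a trained network they come from sigmoid layers.

```python
import numpy as np

c_prev = np.array([5.0, 5.0])    # previous cell state C_{t-1}
f_t = np.array([1.0, 0.0])       # forget gate: keep first entry, erase second
i_t = np.array([0.0, 1.0])       # input gate: block first candidate, admit second
c_hat = np.array([0.3, -0.7])    # candidate values from the tanh layer

c_t = f_t * c_prev + i_t * c_hat
print(c_t)  # first entry kept at 5.0, second replaced by -0.7
```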

Gate 3: Output Gate

The output gate decides which parts of the cell state are pushed to the next cell in the sequence as the hidden state (ht = ot * tanh(Ct))
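
Putting the three gates together, one LSTM step can be sketched as below. This follows the standard formulation without peephole connections; the parameter names (`W_f`, `W_i`, `W_c`, `W_o` and matching biases), the random initialisation, and the sizes are assumptions for illustration rather than anything specified in the slides.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step: returns the new hidden state and the new cell state."""
    z = np.concatenate([h_prev, x_t])                    # [h_{t-1}, x_t]
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])     # forget gate
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])     # input gate
    c_hat = np.tanh(params["W_c"] @ z + params["b_c"])   # candidate values
    c_t = f_t * c_prev + i_t * c_hat                     # cell-state update
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])     # output gate
    h_t = o_t * np.tanh(c_t)                             # hidden state passed on
    return h_t, c_t

rng = np.random.default_rng(0)
input_dim, hidden_dim = 3, 4    # illustrative sizes (assumed)
params = {}
for name in ("f", "i", "c", "o"):
    params[f"W_{name}"] = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim + input_dim))
    params[f"b_{name}"] = np.zeros(hidden_dim)

h, c = np.zeros(hidden_dim), np.zeros(hidden_dim)
for x_t in rng.normal(size=(5, input_dim)):
    h, c = lstm_step(x_t, h, c, params)
print(h.shape, c.shape)
```

The four weight matrices correspond to the four layers of the LSTM module: three sigmoid gates and one tanh candidate layer.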

LSTM vs GRU

Speed vs Complexity

Testing on the IMDB dataset

1x 32-node SimpleRNN layer

1x 32-node LSTM layer

1x 32-node GRU layer
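
One way to make the complexity side of the comparison concrete is to count trainable parameters for a 32-unit layer of each type. The sketch below uses the textbook formulations (one dense block for a simple RNN, three for a GRU, four for an LSTM); the input dimension of 32 is an assumption, and some library implementations add extra bias terms, so reported counts may differ slightly.

```python
def recurrent_param_count(units, input_dim, n_blocks):
    """Weights plus biases for a recurrent layer made of n_blocks dense blocks,
    each mapping [h_{t-1}, x_t] to `units` outputs."""
    return n_blocks * (units * (units + input_dim) + units)

units, input_dim = 32, 32   # input_dim is an assumed embedding size
counts = {
    name: recurrent_param_count(units, input_dim, n_blocks)
    for name, n_blocks in [("SimpleRNN", 1), ("GRU", 3), ("LSTM", 4)]
}
print(counts)
```

Under these assumptions the GRU has three quarters of the LSTM's parameters, which is the usual argument for it being cheaper per step.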
