Long Short-Term Memory
Kamal Nandan | Laurens ten Cate | Gianluca Sabbatucci | Richard Rohla | Veronique Wang | Alex Kyalo | Asier Sarrasua
Why not just a feed-forward network?
In a feed-forward network, the output at t has no relation to the output at t-1.
Sequence issues with feed-forward NNs
- Classification of movie scenes.
- Understanding the context of a book paragraph.
- Prediction of the next word in a sentence.
- Time-series forecasting.
RNNs come to the rescue
RNNs
Short-term dependency: the relevant context is only a few words back, which an RNN handles well.
E.g. The color of the sky is ___.
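To make the recurrence concrete, here is a minimal sketch of a vanilla RNN step in NumPy. The weight names (W_x, W_h, b), the tanh activation and the sizes are assumptions chosen for illustration; they are not taken from the slides.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One vanilla RNN step: mix the current input with the previous hidden state."""
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

def run_rnn(xs, W_x, W_h, b):
    """Run the RNN over a sequence and return the hidden state at every time step."""
    h = np.zeros(W_h.shape[0])  # initial hidden state
    states = []
    for x_t in xs:
        h = rnn_step(x_t, h, W_x, W_h, b)  # h now depends on all inputs seen so far
        states.append(h)
    return np.stack(states)

# Illustrative sizes: 5 time steps, 3 input features, 4 hidden units.
rng = np.random.default_rng(0)
W_x = rng.normal(scale=0.5, size=(4, 3))
W_h = rng.normal(scale=0.5, size=(4, 4))
b = np.zeros(4)
xs = rng.normal(size=(5, 3))
states = run_rnn(xs, W_x, W_h, b)
print(states.shape)
```

Unlike a feed-forward network, the hidden state at step t is computed from the state at step t-1, so earlier inputs can influence later outputs.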
Limitations of RNNs
Long-term dependency: the relevant context lies far back in the sequence, and a plain RNN struggles to retain it.
E.g. I am from India. I studied in Spain for 5 long years and then I moved to France for work. ... I can speak fluent _____.
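Backpropagation through time
Vanishing gradient issue: gradients shrink toward zero as they are propagated back through many time steps, so early inputs stop influencing learning.
Also: exploding gradient issue, where the gradients grow uncontrollably instead.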
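A rough way to see both issues, assuming (as a simplification that is not from the slides) that backpropagation through time multiplies the gradient by the same scalar factor at every time step:

```python
def backprop_factor(per_step_factor, num_steps):
    """Gradient magnitude after being scaled by the same factor at every time step."""
    return abs(per_step_factor) ** num_steps

# A factor slightly below 1 shrinks the gradient; slightly above 1 blows it up.
vanishing = backprop_factor(0.9, 100)
exploding = backprop_factor(1.1, 100)
print(vanishing, exploding)
```

In a real RNN the per-step factor is a Jacobian matrix rather than a scalar, but the same compounding effect applies over long sequences.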
Solutions
Vanishing Gradient
- LSTM
- GRU
Exploding Gradient
- Gradient Clipping
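Gradient clipping can be sketched as rescaling a gradient whose norm exceeds a threshold. The function name clip_by_norm and the threshold of 1.0 are illustrative assumptions, not something the slides specify.

```python
import numpy as np

def clip_by_norm(grad, max_norm):
    """Rescale grad so its L2 norm is at most max_norm; the direction is unchanged."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        return grad * (max_norm / norm)
    return grad

g = np.array([3.0, 4.0])        # L2 norm is 5.0
clipped = clip_by_norm(g, 1.0)  # rescaled to norm 1.0
print(clipped)
```

Because only the length of the gradient changes, the update still points in the same direction but can no longer take an arbitrarily large step.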
Long Short-Term Memory
Hochreiter & Schmidhuber (1997)
A standard RNN's repeating module has a single-layer structure.
An LSTM's repeating module has a four-layer structure.
Cell State
The cell state allows the LSTM to remember information through regulated gates.
Gates are a way to optionally let information through: a sigmoid layer followed by an element-wise product.
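The gating idea can be illustrated with a sigmoid output multiplied element-wise into some information. The input values below are made up for the example.

```python
import numpy as np

def sigmoid(z):
    """Squash values into (0, 1): near 0 blocks information, near 1 lets it through."""
    return 1.0 / (1.0 + np.exp(-z))

gate = sigmoid(np.array([-10.0, 0.0, 10.0]))  # roughly closed, half open, open
information = np.array([5.0, 5.0, 5.0])
passed = gate * information                   # element-wise product
print(passed)
```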
Gate 1: Forget Gate
Forget Gate regulates what the network "forgets" from the cell state
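A minimal sketch of the forget gate, assuming the common formulation in which a sigmoid layer is applied to the concatenation of the previous hidden state and the current input. The names W_f and b_f and the sizes are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forget_gate(h_prev, x_t, W_f, b_f):
    """Return one value in (0, 1) per cell-state entry: 0 means forget, 1 means keep."""
    concat = np.concatenate([h_prev, x_t])
    return sigmoid(W_f @ concat + b_f)

hidden, inputs = 3, 2  # illustrative sizes
rng = np.random.default_rng(0)
W_f = rng.normal(size=(hidden, hidden + inputs))
b_f = np.ones(hidden)
f_t = forget_gate(np.zeros(hidden), rng.normal(size=inputs), W_f, b_f)
print(f_t)
```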
Gate 2: Input Gate
- Input Gate decides what values to update (sigmoid layer)
- tanh layer creates new "candidate values" to be added to the cell-state (scaled by input gate)
Cell state interaction
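The forget gate interacts with the cell state (ft * Ct-1)
The input gate interacts with the candidate values (it * C̃t, where C̃t comes from the tanh layer)
The new cell state is the sum of both: Ct = ft * Ct-1 + it * C̃t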
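The cell state update can be worked through with small hand-picked numbers. The gate and candidate values below are made up for illustration; in a real LSTM they come from the sigmoid and tanh layers.

```python
import numpy as np

c_prev = np.array([2.0, -1.0, 0.5])    # previous cell state
f_t = np.array([1.0, 0.0, 0.5])        # forget gate: keep, forget, keep half
i_t = np.array([0.0, 1.0, 0.5])        # input gate: ignore, accept, accept half
c_tilde = np.array([0.9, 0.9, -0.4])   # candidate values from the tanh layer

# The old state is scaled by the forget gate, the candidates by the input gate.
c_t = f_t * c_prev + i_t * c_tilde
print(c_t)
```

The first entry keeps its old value, the second is fully replaced by its candidate, and the third is a blend of both.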
Gate 3: Output Gate
The output gate decides which parts of the cell state are pushed to the next cell in the sequence as the hidden state (ht = ot * tanh(Ct))
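Putting the three gates together, a full LSTM step might look like the sketch below. It assumes the standard formulation with a concatenated [h, x] input; the parameter names, initialization and sizes are hypothetical and not taken from the slides.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step: returns the new hidden state and the new cell state."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])      # forget gate
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])      # input gate
    c_tilde = np.tanh(params["W_c"] @ z + params["b_c"])  # candidate values
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])      # output gate
    c_t = f_t * c_prev + i_t * c_tilde                    # update the cell state
    h_t = o_t * np.tanh(c_t)                              # gated view of the cell state
    return h_t, c_t

def init_params(hidden, inputs, seed=0):
    """Small random weights and zero biases for the four layers (illustrative only)."""
    rng = np.random.default_rng(seed)
    params = {}
    for name in ("f", "i", "c", "o"):
        params["W_" + name] = rng.normal(scale=0.1, size=(hidden, hidden + inputs))
        params["b_" + name] = np.zeros(hidden)
    return params

hidden, inputs = 4, 3
params = init_params(hidden, inputs)
h, c = np.zeros(hidden), np.zeros(hidden)
for x_t in np.random.default_rng(1).normal(size=(6, inputs)):
    h, c = lstm_step(x_t, h, c, params)
print(h, c)
```

The four weight matrices correspond to the four layers of the LSTM module: three sigmoid gates and one tanh layer.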
LSTM vs GRU
Speed vs Complexity
Testing on the IMDB dataset
- 1x 32-node SimpleRNN layer
- 1x 32-node LSTM layer
- 1x 32-node GRU layer
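One way to get a feel for the complexity side of the comparison is to count trainable parameters per layer. The sketch below uses the textbook formulations (one weight set for a simple RNN, four for an LSTM, three for a GRU). The input dimension of 32 is an assumption, since the slides do not state the input size, and specific library implementations may add extra bias terms, so exact counts can differ.

```python
def simple_rnn_params(units, input_dim):
    """Input weights + recurrent weights + bias for a single layer."""
    return units * (units + input_dim + 1)

def lstm_params(units, input_dim):
    """Four weight sets: forget gate, input gate, candidate layer, output gate."""
    return 4 * simple_rnn_params(units, input_dim)

def gru_params(units, input_dim):
    """Three weight sets: update gate, reset gate, candidate layer."""
    return 3 * simple_rnn_params(units, input_dim)

units, input_dim = 32, 32  # 32 units as in the slides; input_dim is assumed
counts = {
    "SimpleRNN": simple_rnn_params(units, input_dim),
    "LSTM": lstm_params(units, input_dim),
    "GRU": gru_params(units, input_dim),
}
print(counts)
```

Under these assumptions the GRU sits between the simple RNN and the LSTM in size, which is in line with the speed versus complexity trade-off.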