Introduction to LSTM
Chapter 1 | Recurrent NN
Chapter | LSTM
Use cases
Some Forgotten Some Remembered
Entire Neural Net Dedicated to Forgetting/Remembering
A selection that has its own neural network to filter out the memory used in the prediction
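The forgetting and selection ideas above can be sketched as one step of a tiny LSTM cell. This is a minimal illustration assuming the standard LSTM formulation with a single scalar state. The function name `lstm_step`, the parameter dictionary `p`, and all weight values are hypothetical and do not come from the slides.

```python
import math

def sigmoid(x):
    """Squash a value into the range (0, 1) so it can act as a gate."""
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, p):
    """One time step of a scalar LSTM cell (illustrative sketch).

    p maps a gate name to a hypothetical (input weight, recurrent weight, bias).
    """
    def gate(name, squash):
        w_x, w_h, b = p[name]
        return squash(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # how much of the old memory to keep
    i = gate("input", sigmoid)         # how much of the new candidate to store
    g = gate("candidate", math.tanh)   # proposed new memory content
    o = gate("output", sigmoid)        # selection: how much memory reaches the prediction

    c = f * c_prev + i * g             # some forgotten, some remembered
    h = o * math.tanh(c)               # filtered memory used as the output
    return h, c

# Made-up weights: every gate gets the same small values.
params = {name: (0.5, 0.5, 0.0) for name in ("forget", "input", "candidate", "output")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=0.7, p=params)
print(h, c)
```

Each gate here is a single weighted sum followed by a squashing function. In a real layer each gate is a small neural network over vectors, which is what the slides mean by a gate having "its own neural network".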
Ignore Irrelevance
Practical Example: Sentence Prediction
Jane saw Spot. <PREDICTION>
Jane saw Spot. Doug <PREDICTION>
LSTMs can look back many time steps and have proven successful at doing so.
Airline Industry Problem-
Predict the number
GO TO Interactive Course on https://diggit.no
Modules
import numpy
import matplotlib.pyplot as plt
import pandas
import math
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
Random Number Seed
# fix random seed for reproducibility
numpy.random.seed(7)
LSTMs are sensitive to the scale of the input data.
# normalize the dataset
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)
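To make the normalization step concrete, here is a minimal pure-Python sketch of min-max scaling to the range 0 to 1. It mirrors the formula, not the scikit-learn implementation. The function name `min_max_scale` and the passenger counts are made up for illustration, and mapping a constant series to the lower bound is this sketch's own choice.

```python
def min_max_scale(values, low=0.0, high=1.0):
    """Linearly map values so the smallest becomes `low` and the largest `high`."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        # Constant series: nothing to spread out, so return the lower bound.
        return [low for _ in values]
    return [low + (v - v_min) * (high - low) / span for v in values]

passengers = [100.0, 150.0, 200.0, 300.0]  # hypothetical monthly counts
scaled = min_max_scale(passengers)
print(scaled)  # [0.0, 0.25, 0.5, 1.0]
```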
67% training, 33% testing
# split into train and test sets
train_size = int(len(dataset) * 0.67)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size], dataset[train_size:len(dataset)]
print(len(train), len(test))
What was that colon operator?
>>> a=[1,2,3,4,5]
>>> a[1:3]
[2, 3]
>>> a[:3]
[1, 2, 3]
>>> a[2:]
[3, 4, 5]
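look_back is the number of previous time steps to use as input variables to predict the next time period.
# convert an array of values into a dataset matrix
def create_dataset(dataset, look_back=1):
    dataX, dataY = [], []
    for i in range(len(dataset)-look_back-1):
        # input window: look_back consecutive values starting at i
        a = dataset[i:(i+look_back), 0]
        dataX.append(a)
        # target: the value that follows the window
        dataY.append(dataset[i + look_back, 0])
    return numpy.array(dataX), numpy.array(dataY)
# build the input/output pairs used below (look_back=1 is assumed here)
look_back = 1
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)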
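A small worked example may help show what the windowing produces. The sketch below repeats the same loop on a plain Python list instead of a NumPy array. The name `make_windows` and the toy series are hypothetical.

```python
def make_windows(series, look_back=1):
    """Pair each window of `look_back` values with the value that follows it."""
    xs, ys = [], []
    # Same loop bound as create_dataset: len(series) - look_back - 1
    for i in range(len(series) - look_back - 1):
        xs.append(series[i:i + look_back])
        ys.append(series[i + look_back])
    return xs, ys

series = [10, 20, 30, 40, 50]

xs, ys = make_windows(series, look_back=1)
print(xs)  # [[10], [20], [30]]
print(ys)  # [20, 30, 40]

xs3, ys3 = make_windows(series, look_back=3)
print(xs3)  # [[10, 20, 30]]
print(ys3)  # [40]
```

Note that with this loop bound the last possible pair (here `[40]` followed by `50`) is not produced.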
The LSTM network expects the input data (X) to be provided with a specific array structure in the form of: [samples, time steps, features].
Currently, our data is in the form: [samples, features].
We can transform the prepared train and test input data into the expected structure using numpy.reshape() as follows:
# reshape input to be [samples, time steps, features]
trainX = numpy.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
testX = numpy.reshape(testX, (testX.shape[0], 1, testX.shape[1]))
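The effect of the reshape can be seen on a toy example. This sketch uses nested Python lists in place of NumPy arrays, so it illustrates the layout only. The helpers `shape` and `add_time_axis` are hypothetical names.

```python
def shape(nested):
    """Return the dimensions of a regular nested list, e.g. (3, 1, 1)."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        if not nested:
            break
        nested = nested[0]
    return tuple(dims)

def add_time_axis(rows):
    """Turn [samples, features] into [samples, 1, features]."""
    return [[list(row)] for row in rows]

x2d = [[0.1], [0.2], [0.3]]   # 3 samples, 1 feature each (look_back = 1)
x3d = add_time_axis(x2d)

print(shape(x2d))  # (3, 1)
print(shape(x3d))  # (3, 1, 1)
print(x3d)         # [[[0.1]], [[0.2]], [[0.3]]]
```

Each sample becomes a sequence of one time step whose feature vector holds the `look_back` values.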
The network has a visible layer with 1 input, a hidden layer with 4 LSTM blocks or neurons, and an output layer that makes a single value prediction.
# create and fit the LSTM network
model = Sequential()
model.add(LSTM(4, input_shape=(1, look_back)))
model.add(Dense(1))
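As a rough check on the size of this network, the trainable parameters can be counted by hand. The sketch assumes the standard LSTM formulation (four gate-like transforms, each with input weights, recurrent weights, and one bias per unit) and look_back = 1. The function names are hypothetical.

```python
def lstm_param_count(units, input_features):
    """Four transforms, each with input weights, recurrent weights, and biases."""
    return 4 * (units * (input_features + units) + units)

def dense_param_count(inputs, outputs):
    """One weight per input-output pair plus one bias per output."""
    return inputs * outputs + outputs

lstm_params = lstm_param_count(units=4, input_features=1)
dense_params = dense_param_count(inputs=4, outputs=1)

print(lstm_params)                 # 96
print(dense_params)                # 5
print(lstm_params + dense_params)  # 101
```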
The network is trained for 5 epochs and a batch size of 1 is used.
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=5, batch_size=1, verbose=2)
In the neural network terminology:
Example: if you have 1000 training examples, and your batch size is 500, then it will take 2 iterations to complete 1 epoch.
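The arithmetic in this example can be written as a one-line helper. This is a sketch that assumes a final, smaller batch still counts as one iteration. The name `iterations_per_epoch` is hypothetical.

```python
import math

def iterations_per_epoch(n_examples, batch_size):
    """Number of batches needed to see every training example once."""
    return math.ceil(n_examples / batch_size)

print(iterations_per_epoch(1000, 500))  # 2
print(iterations_per_epoch(1000, 300))  # 4 (three full batches plus one of 100)
print(iterations_per_epoch(1000, 1))    # 1000
```

With a batch size of 1, every training example is its own iteration, so the weights are updated once per example.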
Invert the predictions before calculating error scores to ensure that performance is reported in the same units as the original data.
# make predictions
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)
# invert predictions
trainPredict = scaler.inverse_transform(trainPredict)
trainY = scaler.inverse_transform([trainY])
testPredict = scaler.inverse_transform(testPredict)
testY = scaler.inverse_transform([testY])
# calculate root mean squared error
trainScore = math.sqrt(mean_squared_error(trainY[0], trainPredict[:,0]))
print('Train Score: %.2f RMSE' % (trainScore))
testScore = math.sqrt(mean_squared_error(testY[0], testPredict[:,0]))
print('Test Score: %.2f RMSE' % (testScore))
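To see why the inversion matters, the following pure-Python sketch computes the RMSE once on scaled values and once after mapping them back to the original units. All numbers are made up, and `inverse_min_max` and `rmse` are hypothetical helpers, not the scikit-learn functions used above.

```python
import math

def inverse_min_max(scaled, v_min, v_max):
    """Undo a 0-to-1 min-max scaling, given the original minimum and maximum."""
    return [v_min + s * (v_max - v_min) for s in scaled]

def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

v_min, v_max = 100.0, 300.0           # hypothetical range of the original data
actual_scaled = [0.0, 0.5, 1.0]
predicted_scaled = [0.1, 0.5, 0.9]

scaled_error = rmse(actual_scaled, predicted_scaled)
original_error = rmse(
    inverse_min_max(actual_scaled, v_min, v_max),
    inverse_min_max(predicted_scaled, v_min, v_max),
)

print('Scaled RMSE:   %.4f' % scaled_error)
print('Original RMSE: %.2f' % original_error)
```

The error in original units is the scaled error multiplied by the data range (here 200), which is why only the inverted score can be read directly in units of the original data.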
We are now ready to design and fit our LSTM network for this problem.