3.13. LSTM - Long Short Term Memory

http://data.is/1bKs2mG

International airline passengers: monthly totals in thousands. Jan 49 – Dec 60

In [1]:

from conx import Network, Layer, LSTMLayer, plot, frange

Using Theano backend.
conx, version 3.5.4


For this experiment, we will use the monthly totals of international airline passengers, in thousands, from January 1949 through December 1960:

In [2]:

data = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118, 115,
126, 141, 135, 125, 149, 170, 170, 158, 133, 114, 140, 145, 150,
178, 163, 172, 178, 199, 199, 184, 162, 146, 166, 171, 180, 193,
181, 183, 218, 230, 242, 209, 191, 172, 194, 196, 196, 236, 235,
229, 243, 264, 272, 237, 211, 180, 201, 204, 188, 235, 227, 234,
264, 302, 293, 259, 229, 203, 229, 242, 233, 267, 269, 270, 315,
364, 347, 312, 274, 237, 278, 284, 277, 317, 313, 318, 374, 413,
405, 355, 306, 271, 306, 315, 301, 356, 348, 355, 422, 465, 467,
404, 347, 305, 336, 340, 318, 362, 348, 363, 435, 491, 505, 404,
359, 310, 337, 360, 342, 406, 396, 420, 472, 548, 559, 463, 407,
362, 405, 417, 391, 419, 461, 472, 535, 622, 606, 508, 461, 390,
432]


Plotting the data shows a regular, but varying, cyclic pattern:

In [3]:

plot(["Passengers", data],
     title="International airline passengers: monthly totals in thousands. Jan 49 – Dec 60",
     xlabel="year",
     ylabel="passengers (thousands)",
     xs=[x for x in frange(1949, 1961, 1/12)])


Let’s scale the monthly totals into the range 0 to 1:

In [4]:

def scale(data):
    """
    Scale data to between 0 and 1
    """
    minv = min(data)
    maxv = max(data)
    span = maxv - minv
    return [(v - minv)/span for v in data]

In [5]:

scaled_data = scale(data)

In [6]:

plot(["Scaled Data", scaled_data])
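Because the network will work on scaled values, its outputs have to be mapped back to the original units before they can be read as monthly totals. A minimal sketch of that inverse step follows; unscale is a hypothetical helper, not part of conx, and the sample list is just a handful of values taken from the data above:

```python
def scale(data):
    """Scale data to between 0 and 1 (same arithmetic as the cell above)."""
    minv = min(data)
    maxv = max(data)
    span = maxv - minv
    return [(v - minv) / span for v in data]


def unscale(scaled, minv, maxv):
    """Hypothetical helper: map values in [0, 1] back to the original range."""
    span = maxv - minv
    return [v * span + minv for v in scaled]


sample = [112, 118, 132, 104, 622]  # a few values taken from the data above
restored = unscale(scale(sample), min(sample), max(sample))
print([round(v) for v in restored])  # [112, 118, 132, 104, 622]
```

The minimum and maximum must be the ones used for scaling, so in practice they would be kept alongside the scaled data.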


For our dataset, we will construct a history sequence. First, we need to put each scaled value into its own list. This is the list of features. In our case, we have just the one feature:

In [7]:

sequence = [[datum] for datum in scaled_data]


We want the inputs -> targets pairs to be constructed as follows:

1. [S0] -> S1
2. [S1] -> S2

where Sn is a list of features in the sequence.
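The pairing described above can be sketched on a toy sequence. The names toy_sequence and pairs are illustrative only, and the values are made up:

```python
# Toy stand-in for the real sequence: each element is a one-feature list.
toy_sequence = [[0.0], [0.1], [0.2], [0.3]]

# Pair every element with its successor: [S0] -> S1, [S1] -> S2, ...
pairs = [[[current], following]
         for current, following in zip(toy_sequence, toy_sequence[1:])]

for inputs, target in pairs:
    print(inputs, "->", target)
# [[0.0]] -> [0.1]
# [[0.1]] -> [0.2]
# [[0.2]] -> [0.3]
```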

We need to inform the network of the shape of the sequence. We need:

• time_steps - the length of the history
• batch_size - how many samples are loaded at once
• features - the length of each input vector

In [8]:

time_steps = 1  # history
batch_size = 1  # how many to load at once
features = 1    # features (length of input vector)

In [9]:

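def create_dataset(sequence, time_steps):
    """
    Pair each window of time_steps items with the item that follows it.
    """
    dataset = []
    # Note: the trailing -1 leaves out the last possible window.
    for i in range(len(sequence)-time_steps-1):
        dataset.append([sequence[i:(i+time_steps)],
                        sequence[i + time_steps]])
    return dataset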

In [10]:

dataset = create_dataset(sequence, time_steps)

In [11]:

print(dataset[0])
print(dataset[1])

[[[0.015444015444015444]], [0.02702702702702703]]
[[[0.02702702702702703]], [0.05405405405405406]]
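The loop bound in create_dataset determines how many samples come out. The sketch below compares it with a variant that also keeps the final window; create_dataset_full is a hypothetical alternative shown only for comparison, and the toy sequence stands in for the real one:

```python
def create_dataset(sequence, time_steps):
    """Same logic as the cell above."""
    dataset = []
    for i in range(len(sequence) - time_steps - 1):
        dataset.append([sequence[i:(i + time_steps)],
                        sequence[i + time_steps]])
    return dataset


def create_dataset_full(sequence, time_steps):
    """Hypothetical variant that also keeps the final window."""
    return [[sequence[i:i + time_steps], sequence[i + time_steps]]
            for i in range(len(sequence) - time_steps)]


toy = [[v] for v in range(6)]            # stand-in sequence S0..S5
print(len(create_dataset(toy, 1)))       # 4
print(len(create_dataset_full(toy, 1)))  # 5
print(create_dataset_full(toy, 1)[-1])   # [[[4]], [5]]
```

By the same arithmetic, the 144-value series with time_steps = 1 yields 144 - 1 - 1 = 142 samples from create_dataset.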


Now we construct the network, giving the batch_shape in terms of (look_back, banks, width):

In [12]:

net = Network("LSTM")
net.connect()

In [13]:

net.dataset.load(dataset)

In [14]:

net.dashboard()

In [15]:

net.dataset.split(.33)
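The split call above holds out a fraction of the samples for validation. For a time series it is common to hold out the most recent samples rather than a random subset. A minimal sketch of that idea follows, assuming the held-out portion is taken from the end of the list; holdout_split is a hypothetical helper, not part of conx, and conx's own split may differ in detail:

```python
def holdout_split(samples, fraction):
    """Hypothetical helper: keep the last `fraction` of samples for validation."""
    n_validate = round(len(samples) * fraction)
    cut = len(samples) - n_validate
    return samples[:cut], samples[cut:]


toy_samples = list(range(8))  # stand-in for the (inputs, target) pairs
train, validate = holdout_split(toy_samples, 0.25)
print(train)     # [0, 1, 2, 3, 4, 5]
print(validate)  # [6, 7]
```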

In [16]:

net.propagate([[.02]])

Out[16]:

[-0.0006034541293047369]

In [17]:

outputs = [net.propagate(i, visualize=False) for i in net.dataset.inputs]
plot([["Network", outputs], ["Training data", net.dataset.targets]])

In [19]:

if net.saved():
    net.plot_loss_acc()
else:
    net.train(100, batch_size=batch_size, shuffle=False, plot=True, save=True)

========================================================================
       |  Training |  Training |  Validate |  Validate
Epochs |     Error |  Accuracy |     Error |  Accuracy
------ | --------- | --------- | --------- | ---------
#  100 |   0.00203 |   0.97895 |   0.00815 |   0.72340
Saving network... Saved!

In [20]:

outputs = [net.propagate(i) for i in net.dataset.inputs]
plot([["Network", outputs], ["Training data", net.dataset.targets]])
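With a history of one step, a useful point of comparison is the naive "persistence" forecast, which predicts that the next value equals the current one. The sketch below computes its mean squared error on a toy series. It assumes that a mean squared error is a fair comparison to the error column reported above, which the output does not state; mean_squared_error and toy_series are illustrative names and values:

```python
def mean_squared_error(predictions, targets):
    """Average of squared differences between paired values."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Toy scaled series standing in for scaled_data.
toy_series = [0.0, 0.1, 0.3, 0.2, 0.5, 1.0]

# Persistence forecast: the prediction for step t+1 is the value at step t.
inputs = toy_series[:-1]
targets = toy_series[1:]
baseline = mean_squared_error(inputs, targets)
print(round(baseline, 4))  # 0.08
```

Running the same calculation on scaled_data would give a figure to set beside the trained network's error.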


3.14. LSTM with Window

In [21]:

time_steps = 3

In [22]:

dataset = create_dataset(sequence, time_steps)

In [23]:

print(dataset[0])
print(dataset[1])

[[[0.015444015444015444], [0.02702702702702703], [0.05405405405405406]], [0.04826254826254826]]
[[[0.02702702702702703], [0.05405405405405406], [0.04826254826254826]], [0.032818532818532815]]
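The two samples printed above show the window sliding forward by one position: each input holds time_steps consecutive feature lists, and the target is the item that follows. A small sketch on a made-up sequence makes the overlap and the resulting shape easier to see (toy and windows are illustrative names):

```python
# Toy one-feature sequence: [[0.0], [0.1], ..., [0.5]]
toy = [[round(0.1 * n, 1)] for n in range(6)]
time_steps = 3

# Each sample pairs a window of time_steps items with the next item.
windows = [[toy[i:i + time_steps], toy[i + time_steps]]
           for i in range(len(toy) - time_steps)]

for inputs, target in windows:
    print(inputs, "->", target)
# [[0.0], [0.1], [0.2]] -> [0.3]
# [[0.1], [0.2], [0.3]] -> [0.4]
# [[0.2], [0.3], [0.4]] -> [0.5]

# Every input has shape (time_steps, features).
print(len(windows[0][0]), len(windows[0][0][0]))  # 3 1
```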

In [37]:

net2 = Network("LSTM with Window")
net2.connect()

In [38]:

net2.dataset.load(dataset)
net2.dataset.split(.33)

In [39]:

net2

Out[39]:

In [40]:

net2.propagate([[0.1], [0.2], [0.3]])

Out[40]:

[0.020247535780072212]

In [41]:

outputs = [net2.propagate(i, visualize=False) for i in net2.dataset.inputs]
plot([["Network", outputs], ["Training data", net2.dataset.targets]])

In [44]:

if net2.saved():
    net2.plot_loss_acc()
else:
    net2.train(100, batch_size=batch_size, shuffle=False, plot=True, save=True)

========================================================================
       |  Training |  Training |  Validate |  Validate
Epochs |     Error |  Accuracy |     Error |  Accuracy
------ | --------- | --------- | --------- | ---------
#  100 |   0.00264 |   0.93548 |   0.01089 |   0.63830
Saving network... Saved!

In [31]:

outputs = [net2.propagate(i, visualize=False) for i in net2.dataset.inputs]
plot([["Network", outputs], ["Training data", net2.dataset.targets]])


3.14.1. LSTM with State

In [32]:

net3 = Network("LSTM with Window and State")
net3.connect()
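The section title suggests an LSTM that keeps its internal state from one sample to the next instead of starting from zero each time. As a rough illustration of what that difference means, here is a hand-written single-unit LSTM cell in plain Python. The weights are made up, and the code is a sketch of the standard LSTM equations, not conx's or its backend's implementation:

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h, c, w):
    """One step of a single-unit LSTM cell.

    x: input value, h: previous hidden state, c: previous cell state,
    w: dict mapping gate name to (input weight, recurrent weight, bias).
    """
    def gate(name, activation):
        wx, wh, b = w[name]
        return activation(wx * x + wh * h + b)

    i = gate("input", sigmoid)        # how much new information to admit
    f = gate("forget", sigmoid)       # how much old cell state to keep
    o = gate("output", sigmoid)       # how much cell state to expose
    g = gate("candidate", math.tanh)  # proposed new cell content
    c_new = f * c + i * g
    h_new = o * math.tanh(c_new)
    return h_new, c_new


# Made-up weights, for illustration only.
weights = {name: (0.5, 0.5, 0.0)
           for name in ("input", "forget", "output", "candidate")}


def run_window(window, h=0.0, c=0.0):
    """Feed a window through the cell, starting from the given state."""
    for x in window:
        h, c = lstm_step(x, h, c, weights)
    return h, c


first, second = [0.1, 0.2, 0.3], [0.2, 0.3, 0.4]

# Without carried state: the second window starts from zero state.
h_reset, _ = run_window(second)

# With carried state: the second window starts where the first left off.
h1, c1 = run_window(first)
h_carried, _ = run_window(second, h1, c1)

print(h_reset != h_carried)  # True
```

When state is carried across samples, the order of the samples matters, which is consistent with the training calls in this notebook passing shuffle=False.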

In [62]:

net3.dataset.load(dataset)
net3.dataset.split(.33)

In [63]:

net3

Out[63]:

In [64]:

if net3.saved():
    net3.plot_loss_acc()
else:
    net3.train(100, batch_size=batch_size, shuffle=False, plot=True, save=True)

========================================================================
       |  Training |  Training
Epochs |     Error |  Accuracy
------ | --------- | ---------
#  100 |   0.00376 |   0.90714
Saving network... Saved!

In [84]:

outputs = [net3.propagate(i, visualize=False) for i in net3.dataset.inputs]
plot([["Network", outputs], ["Training data", net3.dataset.targets]])


3.14.2. LSTM - Stacked

In [75]:

net4 = Network("LSTM with Window and State and Stacked")
net4.connect()

In [76]:

net4.dataset.load(dataset)
net4.dataset.split(.33)

In [77]:

net4

Out[77]:

In [78]:

net4.propagate([[0.1], [-0.2], [0.8]])

Out[78]:

[0.010831544175744057]

In [79]:

if net4.saved():
    net4.plot_loss_acc()
else:
    net4.train(100, batch_size=batch_size, shuffle=False, plot=True, save=True)

========================================================================
       |  Training |  Training
Epochs |     Error |  Accuracy
------ | --------- | ---------
#  100 |   0.00320 |   0.92143
Saving network... Saved!

In [82]:

outputs = [net4.propagate(i, visualize=False) for i in net4.dataset.inputs]
plot([["Network", outputs], ["Training data", net4.dataset.targets]])