3.11. Alice in Wonderland
This notebook demonstrates generating sequences using a Simple Recurrent Network (SimpleRNN).
For this example, we will use the unprocessed text from Lewis Carroll’s “Alice in Wonderland”. However, the sequence can really be anything, including code, music, or knitting instructions.
In [1]:
import conx as cx
Using TensorFlow backend.
Conx, version 3.6.1
First, we find a copy of Alice in Wonderland, download it, and read it in:
In [2]:
INPUT_FILE = "alice_in_wonderland.txt"
In [3]:
cx.download("http://www.gutenberg.org/files/11/11-0.txt", filename=INPUT_FILE)
Using cached http://www.gutenberg.org/files/11/11-0.txt as './alice_in_wonderland.txt'.
In [4]:
# extract the input as a stream of characters
lines = []
with open(INPUT_FILE, 'rb') as fp:
    for line in fp:
        line = line.strip().lower()
        line = line.decode("ascii", "ignore")
        if len(line) == 0:
            continue
        lines.append(line)
text = " ".join(lines)
lines = None # clean up memory
Next, we create some utility dictionaries for mapping the characters to indices and back:
In [5]:
chars = set([c for c in text])
nb_chars = len(chars)
char2index = dict((c, i) for i, c in enumerate(chars))
index2char = dict((i, c) for i, c in enumerate(chars))
In [6]:
nb_chars
Out[6]:
55
In this text, there are 55 different characters.
Each character has a unique mapping to an integer:
In [7]:
char2index["a"]
Out[7]:
24
In [8]:
index2char[5]
Out[8]:
']'
3.11.1. Build the Dataset
For example, given the text “the sky was falling” and a sequence length of 10, we would get the following inputs and targets (note that spaces count as characters, so some targets are spaces):
Inputs       -> Target
------------    ------
"the sky wa" -> "s"
"he sky was" -> " "
"e sky was " -> "f"
" sky was f" -> "a"
"sky was fa" -> "l"
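The sliding-window construction above can be sketched in plain Python (this is just an illustration of the idea; the actual dataset is built with conx below):

```python
# Build (input, target) pairs by sliding a window of length SEQLEN
# over the text, one character at a time.
text = "the sky was falling"
SEQLEN = 10

pairs = []
for i in range(len(text) - SEQLEN):
    pairs.append((text[i:i + SEQLEN], text[i + SEQLEN]))

for inputs, target in pairs[:5]:
    print(repr(inputs), "->", repr(target))
```

Each window of 10 characters predicts the single character that follows it, so a text of length N yields N - SEQLEN training pairs.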
How can we represent the characters? There are many ways, including using an EmbeddingLayer. In this example, we simply use a one-hot encoding of each character's index. Note that the total length of the one-hot encoding is one more than the number of distinct characters, because we reserve a position for the zero index as well.
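A one-hot vector is simply a vector of zeros with a single 1 at the given index. A minimal pure-Python sketch of the idea (cx.onehot in the cells below plays the same role):

```python
def onehot(index, width):
    # Return a list of `width` zeros with a 1 at position `index`.
    vector = [0] * width
    vector[index] = 1
    return vector

print(onehot(2, 5))  # [0, 0, 1, 0, 0]
```

With 55 distinct characters and one extra slot for index zero, each character becomes a vector of length 56.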
In [9]:
SEQLEN = 10
data = []
for i in range(0, len(text) - SEQLEN):
    inputs = [cx.onehot(char2index[char], nb_chars + 1) for char in text[i:i + SEQLEN]]
    targets = cx.onehot(char2index[text[i + SEQLEN]], nb_chars + 1)
    data.append([inputs, targets])
text = None # clean up memory
In [10]:
dataset = cx.Dataset()
dataset.load(data)
data = None # clean up memory; not needed
In [11]:
len(dataset)
Out[11]:
158773
In [12]:
cx.shape(dataset.inputs[0])
Out[12]:
(10, 56)
The shape of each input is 10 × 56: a sequence of 10 characters, each encoded as a one-hot vector of length 56.
Let’s check the inputs and targets to make sure everything is encoded properly:
In [13]:
def onehot_to_char(vector):
    index = cx.argmax(vector)
    return index2char[index]
In [14]:
for i in range(10):
    print("".join([onehot_to_char(v) for v in dataset.inputs[i]]),
          "->",
          onehot_to_char(dataset.targets[i]))
project gu -> t
roject gut -> e
oject gute -> n
ject guten -> b
ect gutenb -> e
ct gutenbe -> r
t gutenber -> g
gutenberg -> s
gutenbergs ->
utenbergs -> a
Looks good!
3.11.2. Build the Network
We will use a single SimpleRNNLayer with a fully-connected output bank to compute the most likely predicted output character.
Note that we can use the categorical cross-entropy error function since we are using the “softmax” activation function on the output layer.
In this example, we unroll the inputs to provide explicit weights between each character in the sequence and the output.
In [15]:
network = cx.Network("Alice in Wonderland")
network.add(
    cx.Layer("input", (SEQLEN, nb_chars + 1)),
    cx.SimpleRNNLayer("rnn", 128,
                      return_sequences=False,
                      unroll=True),
    cx.Layer("output", nb_chars + 1, activation="softmax"),
)
network.connect()
network.compile(error="categorical_crossentropy", optimizer="rmsprop")
In [16]:
network.set_dataset(dataset)
In [17]:
network.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input (InputLayer) (None, 10, 56) 0
_________________________________________________________________
rnn (SimpleRNN) (None, 128) 23680
_________________________________________________________________
output (Dense) (None, 56) 7224
=================================================================
Total params: 30,904
Trainable params: 30,904
Non-trainable params: 0
_________________________________________________________________
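The parameter counts in the summary can be checked by hand: a SimpleRNN with h hidden units and input width n has n × h input weights, h × h recurrent weights, and h biases, while the dense output layer has (h + 1) × outputs parameters. A quick sanity check in plain Python:

```python
n, h, outputs = 56, 128, 56

# SimpleRNN: input weights + recurrent weights + biases
rnn_params = n * h + h * h + h

# Dense output layer: one weight per hidden unit per output, plus biases
dense_params = (h + 1) * outputs

print(rnn_params)                  # 23680
print(dense_params)                # 7224
print(rnn_params + dense_params)   # 30904
```

These match the 23,680, 7,224, and 30,904 values reported by network.summary() above.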
In [18]:
network.dashboard()
3.11.3. Train the Network
After each training epoch we will test the generated output. We could use cx.choice(p=output) or cx.argmax(output) for picking the next character. Which works best for you?
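The difference between the two strategies can be sketched in plain Python with a toy distribution: argmax always picks the single most probable character, while weighted sampling (the idea behind cx.choice(p=output)) picks each index with probability proportional to the network's output, which tends to produce more varied text:

```python
import random

output = [0.1, 0.6, 0.3]   # toy softmax output over three characters

# Greedy: always take the most probable index.
greedy = max(range(len(output)), key=lambda i: output[i])

# Stochastic: sample an index with probability proportional to output.
sampled = random.choices(range(len(output)), weights=output, k=1)[0]

print(greedy)    # always 1
print(sampled)   # usually 1, sometimes 0 or 2
```

Greedy decoding can get stuck repeating the same few characters (as in the early iterations below), whereas sampling introduces variety at the cost of occasional unlikely choices.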
In [19]:
def generate_text(sequence, count):
    for i in range(count):
        output = network.propagate(sequence)
        char = index2char[cx.argmax(output)]
        print(char, end="")
        # slide the window: drop the oldest vector, append the new output
        sequence = sequence[1:] + [output]
    print()
In [20]:
for iteration in range(25):
    print("=" * 50)
    print("Iteration #: %d" % (network.epoch_count))
    results = network.train(1, batch_size=128, plot=False, verbose=0)
    sequence = network.dataset.inputs[cx.choice(len(network.dataset))]
    print("Generating from seed: %s" % ("".join([onehot_to_char(v) for v in sequence])))
    generate_text(sequence, 100)
network.plot_results()
==================================================
Iteration #: 0
Generating from seed: in in the
kitt t
==================================================
Iteration #: 1
Generating from seed: w computer
e i t tt t tt t tt t tt t ttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttt
==================================================
Iteration #: 2
Generating from seed: ns! she at
ee i e e te
==================================================
Iteration #: 3
Generating from seed: es, but i
sont to t t tt t to t t t t tt t t t t tt t tt t t t t tt t tt t t t t tt t t t t tt t tt t t
==================================================
Iteration #: 4
Generating from seed: it lasted.
io e es i t t e e t e e e e e e e ee ee ee
==================================================
Iteration #: 5
Generating from seed: wo-- why,
sa e aiee to ire ot tore oe ore ie tor t ie toe tire ioe tire iot tore ie or t ie t
==================================================
Iteration #: 6
Generating from seed: time it a
nd eostes o se sese s st site si t si tes t te t se to t si test tort sitt si t st s st s st tite
==================================================
Iteration #: 7
Generating from seed: works bas
one tore tor tore ior tore oor tore tor iore tor tore tor tore tor tore tor tore tor tore tor tore
==================================================
Iteration #: 8
Generating from seed: he helped
anries tit sirestt s totst ti tor str s tor sttre tir stirestar tires to tires to stites tor sttr
==================================================
Iteration #: 9
Generating from seed: very tired
alicerel s til e o se oit tore oor tiaes torse oate oo etine io ses ee ar tie se to slare to l
==================================================
Iteration #: 10
Generating from seed: project gu
tenberg-ta tarites tt a ailis to slaitis or t airit to slares atts arirs ar iles tt a tiris tore tt
==================================================
Iteration #: 11
Generating from seed: neck from
tiries airele aitiie tirisa e iines i s ares sire ain line tine tonele eseris irm tone on lorsle
==================================================
Iteration #: 12
Generating from seed: eople in a
longe soit toree at t e eses rotel se or itee io teine to teine a ter s aorses oo toriee ti ele
==================================================
Iteration #: 13
Generating from seed: inkling be
ini r n ot an late a s liti a t ne ala ont rotils tot i lali aont antire a a s attin attiree to
==================================================
Iteration #: 14
Generating from seed: whether th
e e tors on hore on trene eat lare aone oon lone oon lire an eroee or eerel oir line oorelare an
==================================================
Iteration #: 15
Generating from seed: said alic
e, alict iaial fte y ni son o teres ar an ate l an o aleitid an l tinie atnerar ai ror tins a
==================================================
Iteration #: 16
Generating from seed: elf safe i
n thaiteds on hitiled ootiniri ass ione io toi ey armere on betele ont rone ain sares on thete in
==================================================
Iteration #: 17
Generating from seed: ad kept a
murses oi iny aone aotired on rictire oonelion ior iite aitere on aotesiog oo ely aibe aotelinu ao
==================================================
Iteration #: 18
Generating from seed: the thistl
enes arocesiry or trtiry tone on ireiler or eraily tone sontered ors tone rone oon ests ers cous
==================================================
Iteration #: 19
Generating from seed: d another
wontetin toree ont rethoree lonsini or hitil oor eone on tiois ootely tone ros oolily oinil ite
==================================================
Iteration #: 20
Generating from seed: nds and fe
et ani eriey tine ionteron lore ainee outine ainiee inner aniine ione ioner on lootioneretne eriaes
==================================================
Iteration #: 21
Generating from seed: ative work
onenenane ontl oute itnelyin lire aone inite astins ottestant ot elas tome sone ooterserision i
==================================================
Iteration #: 22
Generating from seed: nge tale,
autier ait litely sine ton h gesery orerree oon int rnttle on tranese oo mirteron totily ait lins
==================================================
Iteration #: 23
Generating from seed: round, if
i mone tone sont ta elporsetan lols aimile on eset tors thre sone sone to toine aiser aonel ou elin
==================================================
Iteration #: 24
Generating from seed: f, i wonde
r shane aape son lo erate san looklen on erters wrnc sothin liousesotel sine oor hires at tame rine
What can you say about the text generated in later epochs compared to the earlier generated text?
This was the simplest and most straightforward of network architectures and parameter settings. Can you do better? Can you generate text that is better English, or even text that captures the style of Lewis Carroll?
Next, you might like to try this kind of experiment on your own sequential data.