3.1. Learning
The shallowest possible network has no hidden layers at all. Such a network can solve only one kind of problem: one that is linearly separable. This notebook explores learning both linearly and non-linearly separable datasets.
3.1.1. Linearly Separable
In [1]:
import conx as cx
import random
Using Theano backend.
Conx, version 3.6.0
First, let’s construct a fake linearly-separable dataset.
In [2]:
count = 500
positives = [(i/count, i/(count * 2) + random.random()/6) for i in range(count)]
negatives = [(i/count, 0.3 + i/(count * 2) + random.random()/6) for i in range(count)]
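One way to see why this dataset is linearly separable: both classes follow the line y = x/2, with positives offset by less than 1/6 and negatives offset by at least 0.3. Any line of slope 1/2 whose intercept lies between those two values should therefore separate them. The sketch below checks this for an intercept of 0.23, a value picked here purely for illustration; the helper names `make_points` and `above_line` are hypothetical and not part of the notebook or of conx.

```python
import random

count = 500

def make_points(offset):
    """Rebuild the notebook's points: y = offset + x/2 + noise in [0, 1/6)."""
    return [(i / count, offset + i / (count * 2) + random.random() / 6)
            for i in range(count)]

positives = make_points(0.0)
negatives = make_points(0.3)

def above_line(point, slope=0.5, intercept=0.23):
    """True if the point lies above the line y = slope * x + intercept.

    The intercept 0.23 is an illustrative choice between 1/6 and 0.3.
    """
    x, y = point
    return y > slope * x + intercept

# Every positive should fall below the line, every negative above it.
print(not any(above_line(p) for p in positives))
print(all(above_line(n) for n in negatives))
```

Because the noise term never reaches 1/6, the check holds for every random draw, not just for one lucky sample.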
In [3]:
cx.scatter([
        ["Positive", positives],
        ["Negative", negatives],
    ],
    height=8.0,
    width=8.0,
    symbols={"Positive": "bo", "Negative": "ro"})
Out[3]:
In [4]:
ds = cx.Dataset()
In [5]:
ds.load([(p, [1.0], "Positive") for p in positives] +
        [(n, [0.0], "Negative") for n in negatives])
In [6]:
ds.shuffle()
In [7]:
ds.split(.1)
In [8]:
ds.summary()
_________________________________________________________________
Unnamed Dataset:
Patterns Shape Range
=================================================================
inputs (2,) (0.0, 0.998)
targets (1,) (0.0, 1.0)
=================================================================
Total patterns: 1000
Training patterns: 900
Testing patterns: 100
_________________________________________________________________
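The summary shows 1000 patterns divided into 900 for training and 100 for testing, which matches holding out a fraction of 0.1. As a rough illustration of that shuffle-then-split step in plain Python, here is a sketch; `shuffle_and_split` is a hypothetical helper, and which end of the shuffled list conx actually holds out is an assumption made here.

```python
import random

def shuffle_and_split(patterns, test_fraction, seed=None):
    """Return (train, test) lists after shuffling a copy of patterns.

    Assumption: the held-out patterns are taken from the end of the
    shuffled list. conx may do this differently internally.
    """
    rng = random.Random(seed)
    shuffled = list(patterns)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]

# Stand-in patterns: 1000 (input, target) pairs.
patterns = [((i / 1000, 0.0), [1.0]) for i in range(1000)]
train, test = shuffle_and_split(patterns, 0.1, seed=0)
print(len(train), len(test))
```

Shuffling before splitting matters here because the patterns were loaded with all positives first, so an unshuffled split would hold out only negatives.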
In [9]:
net = cx.Network("Linearly Separable", 2, 1, activation="sigmoid")
net.compile(error="mae", optimizer="adam")
In [10]:
net.set_dataset(ds)
In [11]:
net.dashboard()
In [12]:
net.test(tolerance=0.4)
========================================================
Testing validation dataset with tolerance 0.4...
Total count: 900
correct: 246
incorrect: 654
Total percentage correct: 0.2733333333333333
In [13]:
symbols = {
    "Positive (correct)": "w+",
    "Positive (wrong)": "k+",
    "Negative (correct)": "w_",
    "Negative (wrong)": "k_",
}
net.plot_activation_map(scatter=net.test(tolerance=0.4, interactive=False),
                        symbols=symbols, title="Before Training")
In [14]:
if net.saved():
    net.load()
    net.plot_results()
else:
    net.train(epochs=10000, accuracy=1.0, report_rate=50,
              tolerance=0.4, batch_size=len(net.dataset.train_inputs),
              plot=True, record=100, save=True)
========================================================
| Training | Training | Validate | Validate
Epochs | Error | Accuracy | Error | Accuracy
------ | --------- | --------- | --------- | ---------
# 6775 | 0.26114 | 1.00000 | 0.26946 | 1.00000
Saving network... Saved!
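The final report shows an accuracy of 1.0 alongside a training error of about 0.26, which can look contradictory. The two numbers measure different things: the error is a mean absolute error ("mae"), while accuracy only asks whether each output is close enough to its target. The sketch below illustrates this under the assumption that an output counts as correct when it lies within `tolerance` of the target; the helper names and the sample outputs are made up for illustration and are not taken from conx.

```python
def mean_absolute_error(outputs, targets):
    """Average of |output - target| over all patterns."""
    return sum(abs(o - t) for o, t in zip(outputs, targets)) / len(targets)

def accuracy(outputs, targets, tolerance):
    """Fraction of outputs within `tolerance` of their target.

    Assumption: this is what the tolerance argument means in the notebook.
    """
    hits = sum(1 for o, t in zip(outputs, targets) if abs(o - t) <= tolerance)
    return hits / len(targets)

targets = [1.0, 0.0, 1.0, 0.0]
outputs = [0.70, 0.30, 0.75, 0.20]  # made-up network outputs

print(mean_absolute_error(outputs, targets))
print(accuracy(outputs, targets, tolerance=0.4))
```

With these values every output is within 0.4 of its target, so accuracy is perfect even though the mean absolute error is still above 0.25. Training stops as soon as the accuracy goal is met, not when the error reaches zero.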
In [15]:
net.plot_activation_map(scatter=net.test(tolerance=0.4, interactive=False),
                        symbols=symbols, title="After Training")
In [16]:
net.get_weights("output")
Out[16]:
[[[3.3727526664733887], [-7.073390007019043]], [1.7067580223083496]]
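With no hidden layer, the network's output is just a sigmoid applied to a weighted sum of the two inputs plus a bias, so the points where the output equals 0.5 form a straight line. The sketch below derives that line from the weights printed above. It assumes the first weight belongs to the first input (x) and the second to the second input (y); the names `w_x`, `w_y`, and `predict` are illustrative only.

```python
import math

# Values copied from Out[16]; the x/y assignment is an assumption.
w_x, w_y = 3.3727526664733887, -7.073390007019043
bias = 1.7067580223083496

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, y):
    """Output of a single sigmoid unit with the weights above."""
    return sigmoid(w_x * x + w_y * y + bias)

# The output is 0.5 exactly where w_x*x + w_y*y + bias == 0.
# Solving for y gives the decision boundary as a line.
slope = -w_x / w_y
intercept = -bias / w_y
print("boundary: y = %.3f * x + %.3f" % (slope, intercept))

print(predict(0.5, 0.1) > 0.5)  # a point well below the boundary
print(predict(0.5, 0.7) < 0.5)  # a point well above the boundary
```

The resulting boundary is roughly y = 0.48x + 0.24, which sits in the gap between the two bands of points that were generated around y = x/2 and y = x/2 + 0.3.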
In [17]:
from conx.activations import sigmoid
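def output(x, y):
    # wts[0] holds the input-to-output weights (one row per input); wts[1] is the bias.
    wts = net.get_weights("output")
    # Note: x is multiplied by the second row of wts[0] and y by the first.
    return sigmoid(x * wts[0][1][0] + y * wts[0][0][0] + wts[1][0])

def ascii(f):
    # Format a float for the text grid below (this shadows Python's built-in ascii()).
    return "%4.1f" % f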
In [18]:
for y in cx.frange(0, 1.1, .1):
    for x in cx.frange(1.0, 0.1, -0.1):
        print(ascii(output(x, y)), end=" ")
    print()
0.0 0.0 0.0 0.0 0.1 0.1 0.2 0.4 0.6
0.0 0.0 0.0 0.1 0.1 0.2 0.3 0.5 0.7
0.0 0.0 0.0 0.1 0.1 0.2 0.4 0.6 0.7
0.0 0.0 0.1 0.1 0.2 0.3 0.5 0.6 0.8
0.0 0.0 0.1 0.1 0.2 0.4 0.6 0.7 0.8
0.0 0.0 0.1 0.2 0.3 0.5 0.6 0.8 0.9
0.0 0.1 0.1 0.2 0.4 0.5 0.7 0.8 0.9
0.0 0.1 0.2 0.3 0.5 0.6 0.8 0.9 0.9
0.1 0.1 0.2 0.4 0.5 0.7 0.8 0.9 1.0
0.1 0.2 0.3 0.4 0.6 0.8 0.9 0.9 1.0
0.1 0.2 0.4 0.5 0.7 0.8 0.9 1.0 1.0
In [23]:
net.playback(lambda net, epoch: net.plot_activation_map(
    title="Epoch %s" % epoch,
    scatter=net.test(tolerance=0.4, interactive=False),
    symbols=symbols,
    format="svg"))
In [24]:
net.set_weights_from_history(-1)
In [27]:
net.movie(lambda net, epoch: net.plot_activation_map(
    title="Epoch %s" % epoch,
    scatter=net.test(tolerance=0.4, interactive=False),
    symbols=symbols,
    format="image"))
Out[27]:
3.1.2. Non-Linearly Separable
In [28]:
import math
In [29]:
def distance(x1, y1, x2, y2):
    return math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)
In [30]:
negatives = []
while len(negatives) < 500:
    x = random.random()
    y = random.random()
    d = distance(x, y, 0.5, 0.5)
    if d > 0.375 and d < 0.5:
        negatives.append([x, y])
positives = []
while len(positives) < 500:
    x = random.random()
    y = random.random()
    d = distance(x, y, 0.5, 0.5)
    if d < 0.25:
        positives.append([x, y])
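In this dataset the class depends only on how far a point is from the center (0.5, 0.5): positives form a disk and negatives form a ring around it. No single straight line in the (x, y) plane can put the disk on one side and the whole surrounding ring on the other, but the two classes are trivially separated by a threshold on the radius. The sketch below illustrates that idea; the `sample` helper and the threshold of 0.3 are hypothetical choices made here, not part of the notebook.

```python
import math
import random

def distance(x1, y1, x2, y2):
    return math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)

def sample(accept, n=500):
    """Rejection-sample n points in the unit square whose distance
    from (0.5, 0.5) satisfies accept(d)."""
    points = []
    while len(points) < n:
        x, y = random.random(), random.random()
        if accept(distance(x, y, 0.5, 0.5)):
            points.append([x, y])
    return points

positives = sample(lambda d: d < 0.25)          # inner disk
negatives = sample(lambda d: 0.375 < d < 0.5)   # outer ring

def radius(point):
    return distance(point[0], point[1], 0.5, 0.5)

threshold = 0.3  # illustrative value between 0.25 and 0.375

# A single threshold on the derived feature separates the classes.
print(all(radius(p) < threshold for p in positives))
print(all(radius(n) > threshold for n in negatives))
```

A hidden layer gives the network room to build intermediate features of this kind for itself, which is one way to think about why the next network adds one.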
In [31]:
cx.scatter([
        ["Positive", positives],
        ["Negative", negatives],
    ],
    height=8.0,
    width=8.0,
    symbols={"Positive": "bo", "Negative": "ro"})
Out[31]:
In [32]:
net = cx.Network("Non-Linearly Separable", 2, 5, 1, activation="sigmoid")
net.compile(error="mae", optimizer="adam")
In [33]:
net.picture()
Out[33]:
In [34]:
ds = cx.Dataset()
In [35]:
ds.load([(p, [1.0], "Positive") for p in positives] +
        [(n, [0.0], "Negative") for n in negatives])
In [36]:
ds.shuffle()
In [37]:
ds.split(.1)
In [38]:
net.set_dataset(ds)
In [39]:
net.test(tolerance=0.4)
========================================================
Testing validation dataset with tolerance 0.4...
Total count: 900
correct: 449
incorrect: 451
Total percentage correct: 0.4988888888888889
In [40]:
net.dashboard()
In [41]:
net.plot_activation_map(scatter=net.test(interactive=False), symbols=symbols, title="Before Training")
You may want to call either net.reset() or net.retrain() if the following cell doesn’t complete with 100% accuracy. Calling net.reset() may be needed if the network has landed in a local minimum; net.retrain() may be necessary if the network just needs additional training.
In [44]:
if net.saved():
    net.load()
    net.plot_results()
else:
    net.train(epochs=10000, accuracy=1.0, report_rate=50,
              tolerance=0.4, batch_size=256,
              plot=True, record=100, save=True)
========================================================
| Training | Training | Validate | Validate
Epochs | Error | Accuracy | Error | Accuracy
------ | --------- | --------- | --------- | ---------
#17746 | 0.02727 | 1.00000 | 0.02608 | 1.00000
Saving network... Saved!
In [45]:
net.plot_activation_map(scatter=net.test(interactive=False), symbols=symbols, title="After Training")
In [46]:
net.get_weights("hidden")
Out[46]:
[[[-6.404449939727783,
7.37056827545166,
-12.947518348693848,
7.471460819244385,
-7.8771443367004395],
[-11.006869316101074,
11.604531288146973,
-1.4833985567092896,
-14.498926162719727,
-12.94011116027832]],
[5.999775409698486,
9.866920471191406,
10.296037673950195,
7.081101894378662,
7.790738582611084]]
In [47]:
net.get_weights_as_image("hidden").resize((400, 200))
Out[47]:
In [48]:
net.get_weights("output")
Out[48]:
[[[-26.79941177368164],
[-7.317086696624756],
[14.472185134887695],
[12.987898826599121],
[9.788432121276855]],
[-10.76565933227539]]
In [49]:
net.get_weights_as_image("output").resize((500, 100))
Out[49]:
In [50]:
for y in cx.frange(0, 1.1, .1):
    for x in cx.frange(1.0, 0.1, -0.1):
        print(ascii(net.propagate([x, y])[0]), end=" ")
    print()
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.1 0.6 0.7 0.3 0.1 0.0 0.0
0.0 0.0 0.3 0.9 1.0 1.0 0.8 0.3 0.0
0.0 0.0 0.3 1.0 1.0 1.0 1.0 1.0 0.6
0.0 0.0 0.3 1.0 1.0 1.0 1.0 1.0 0.9
0.0 0.0 0.2 0.9 1.0 1.0 1.0 1.0 0.7
0.0 0.0 0.1 0.8 1.0 1.0 0.9 0.5 0.1
0.0 0.0 0.0 0.1 0.4 0.4 0.2 0.1 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
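The grid above comes from net.propagate, but the same kind of values can be approximated by hand from the printed weights: each hidden unit applies a sigmoid to a weighted sum of the inputs, and the output unit applies a sigmoid to a weighted sum of the hidden activations. The sketch below does this in plain Python with the weights from Out[46] and Out[48], rounded to four decimals. It assumes the first weight row belongs to x and the second to y; the names `W_hidden`, `W_output`, and `forward` are illustrative, not conx API.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hidden layer weights from Out[46], rounded: one row per input, one column per unit.
W_hidden = [
    [-6.4044, 7.3706, -12.9475, 7.4715, -7.8771],     # assumed: weights from x
    [-11.0069, 11.6045, -1.4834, -14.4989, -12.9401],  # assumed: weights from y
]
b_hidden = [5.9998, 9.8669, 10.2960, 7.0811, 7.7907]

# Output layer weights from Out[48], rounded: one weight per hidden unit.
W_output = [-26.7994, -7.3171, 14.4722, 12.9879, 9.7884]
b_output = -10.7657

def forward(x, y):
    """Manual forward pass through the 2-5-1 sigmoid network."""
    hidden = [
        sigmoid(x * W_hidden[0][j] + y * W_hidden[1][j] + b_hidden[j])
        for j in range(len(b_hidden))
    ]
    return sigmoid(sum(h * w for h, w in zip(hidden, W_output)) + b_output)

print("%4.1f" % forward(0.5, 0.5))  # center of the positive disk
print("%4.1f" % forward(0.0, 0.0))  # corner, outside both regions
```

The center of the square comes out close to 1 and the corner close to 0, in line with the high values in the middle of the grid and the zeros around its edge.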
In [51]:
net.playback(lambda net, epoch: net.plot_activation_map(
    title="Epoch: %s" % epoch,
    scatter=net.test(interactive=False),
    symbols=symbols,
    format="svg"))
In [52]:
net.movie(lambda net, epoch: net.plot_activation_map(
    title="Epoch %s" % epoch,
    scatter=net.test(tolerance=0.4, interactive=False),
    symbols=symbols,
    format="image"))
Out[52]: