3.25. Two Spirals
This notebook explores the two-spirals category task. This is a good example of how to make a problem difficult for humans and neural networks.
[1]:
import conx as cx
import math
Using TensorFlow backend.
ConX, version 3.7.4
This task involves separating two categories, A and B, where the two sets spiral around each other.
First, let’s make the dataset:
[2]:
def spiral_xy(i, spiral_num):
    """
    Create the data for a spiral.
    Arguments:
        i runs from 0 to 96
        spiral_num is 1 or -1
    """
    φ = i/16 * math.pi
    r = 6.5 * ((104 - i)/104)
    x = (r * math.cos(φ) * spiral_num)/13 + 0.5
    y = (r * math.sin(φ) * spiral_num)/13 + 0.5
    return (x, y)

def spiral(spiral_num):
    return [spiral_xy(i, spiral_num) for i in range(97)]
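As a quick sanity check on the dataset, the sketch below repeats the two functions (with `phi` in place of `φ`) so that it runs on its own. It then confirms that each spiral has 97 points, that every coordinate stays inside the unit square, and that each B point is the matching A point reflected through the center (0.5, 0.5). The names `a_points`, `b_points`, `in_unit_square`, and `is_reflection` are introduced here for illustration only.

```python
import math

def spiral_xy(i, spiral_num):
    """Same formula as above: point i (0..96) on spiral 1 or -1."""
    phi = i / 16 * math.pi
    r = 6.5 * ((104 - i) / 104)
    x = (r * math.cos(phi) * spiral_num) / 13 + 0.5
    y = (r * math.sin(phi) * spiral_num) / 13 + 0.5
    return (x, y)

def spiral(spiral_num):
    return [spiral_xy(i, spiral_num) for i in range(97)]

a_points = spiral(1)
b_points = spiral(-1)

# Every coordinate stays inside the unit square.
in_unit_square = all(0.0 <= v <= 1.0 for p in a_points + b_points for v in p)

# Each B point mirrors the A point with the same index through (0.5, 0.5),
# so the two coordinates of a pair each sum to 1.
is_reflection = all(
    math.isclose(ax + bx, 1.0) and math.isclose(ay + by, 1.0)
    for (ax, ay), (bx, by) in zip(a_points, b_points)
)

print(len(a_points), len(b_points), in_unit_square, is_reflection)
```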
[3]:
a = ["A", spiral(1)]
b = ["B", spiral(-1)]
cx.scatter([a,b])
[3]:
So, there it is: given the (x,y) coordinates of a point, can you determine whether it belongs to category A or B? This is fairly easy to do given the picture, but very difficult given only the coordinates.
Nonetheless, this was an early challenge problem for neural networks, and much research was done in order to learn the task.
Many things were tried, with various levels of success!
For an overview of the task and its solutions, see, for example:
https://www.researchgate.net/publication/220233514_Variations_of_the_two-spiral_task
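One way to see why the picture makes the task look easy is that the structure lives in polar coordinates around the center (0.5, 0.5): the radius determines the angle φ, and the two spirals differ only by a half turn. The sketch below is a hand-written rule for illustration, not something the notebook or the networks below use. The function name `classify` is hypothetical, and the rule relies on knowing the generating formula.

```python
import math

def spiral_xy(i, spiral_num):
    """Same formula as the notebook's spiral_xy."""
    phi = i / 16 * math.pi
    r = 6.5 * ((104 - i) / 104)
    x = (r * math.cos(phi) * spiral_num) / 13 + 0.5
    y = (r * math.sin(phi) * spiral_num) / 13 + 0.5
    return (x, y)

def classify(x, y):
    """Label a point that lies on one of the two spirals as "A" or "B"."""
    # Undo the shift and scaling applied in spiral_xy.
    dx, dy = (x - 0.5) * 13, (y - 0.5) * 13
    r = math.hypot(dx, dy)
    theta = math.atan2(dy, dx)
    # Invert r = 6.5 * (104 - i) / 104 to recover i, then the angle phi.
    i = 104 * (1 - r / 6.5)
    phi = i / 16 * math.pi
    # On spiral A the direction from the center is phi; on B it is phi + pi.
    return "A" if math.cos(phi - theta) > 0 else "B"

labels_a = [classify(*spiral_xy(i, 1)) for i in range(97)]
labels_b = [classify(*spiral_xy(i, -1)) for i in range(97)]
print(set(labels_a), set(labels_b))
```

A network given raw (x, y) values has to discover this radius-to-angle relationship on its own, which is part of what makes the task hard.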
Here is an attempt using so-called “shortcut” connections:
[4]:
net = cx.Network("Two-Spirals")
net.add(
cx.Layer("input", 2),
cx.Layer("hidden1", 5, activation="sigmoid"),
cx.Layer("hidden2", 5, activation="sigmoid"),
cx.Layer("hidden3", 5, activation="sigmoid"),
cx.Layer("output", 2, activation="softmax")
)
net.connect("input", "hidden1")
net.connect("input", "hidden2")
net.connect("input", "hidden3")
net.connect("input", "output")
net.connect("hidden1", "hidden2")
net.connect("hidden1", "hidden3")
net.connect("hidden1", "output")
net.connect("hidden2", "hidden3")
net.connect("hidden2", "output")
net.connect("hidden3", "output")
net.build_model()
[5]:
net.summary()
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input (InputLayer) (None, 2) 0
__________________________________________________________________________________________________
hidden1 (Dense) (None, 5) 15 input[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate) (None, 7) 0 input[0][0]
hidden1[0][0]
__________________________________________________________________________________________________
hidden2 (Dense) (None, 5) 40 concatenate_1[0][0]
__________________________________________________________________________________________________
concatenate_2 (Concatenate) (None, 12) 0 input[0][0]
hidden1[0][0]
hidden2[0][0]
__________________________________________________________________________________________________
hidden3 (Dense) (None, 5) 65 concatenate_2[0][0]
__________________________________________________________________________________________________
concatenate_3 (Concatenate) (None, 17) 0 input[0][0]
hidden1[0][0]
hidden2[0][0]
hidden3[0][0]
__________________________________________________________________________________________________
output (Dense) (None, 2) 36 concatenate_3[0][0]
==================================================================================================
Total params: 156
Trainable params: 156
Non-trainable params: 0
__________________________________________________________________________________________________
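The parameter counts in the summary follow directly from the shortcut connections: each layer receives the input plus the outputs of every earlier hidden layer. A small sketch of the arithmetic, using a helper name (`dense_params`) that is introduced here only for illustration:

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return (n_inputs + 1) * n_units

input_size = 2
hidden_size = 5
output_size = 2

# Number of values feeding the next layer: the input, plus every hidden
# layer built so far (this is what the Concatenate layers provide).
sizes_seen = input_size
counts = {}
for name in ["hidden1", "hidden2", "hidden3"]:
    counts[name] = dense_params(sizes_seen, hidden_size)
    sizes_seen += hidden_size
counts["output"] = dense_params(sizes_seen, output_size)

total = sum(counts.values())
print(counts, total)
```

The per-layer values match the `Param #` column above, and the sum matches the reported total of 156.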
[6]:
net.dashboard()
[7]:
net.dataset.load([(xy, [1, 0]) for xy in spiral(1)] +
[(xy, [0, 1]) for xy in spiral(-1)])
[9]:
def schedule(start, end, num_steps):
    step = (end - start) / (num_steps - 1)
    current = start
    values = []
    for i in range(num_steps):
        values.append(current)
        current += step
    return values
[10]:
schedule(0.001, 0.002, 10)
[10]:
[0.001,
0.0011111111111111111,
0.0012222222222222222,
0.0013333333333333333,
0.0014444444444444444,
0.0015555555555555555,
0.0016666666666666666,
0.0017777777777777776,
0.0018888888888888887,
0.002]
[11]:
schedule(0.5, 0.95, 10)
[11]:
[0.5,
0.55,
0.6000000000000001,
0.6500000000000001,
0.7000000000000002,
0.7500000000000002,
0.8000000000000003,
0.8500000000000003,
0.9000000000000004,
0.9500000000000004]
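The final value in the second output, 0.9500000000000004, overshoots the requested end point because `schedule` adds `step` repeatedly and the rounding errors accumulate. If exact end points matter, one option is to interpolate each value directly. The variant below is a sketch under that assumption; the name `schedule_interp` is hypothetical and is not part of the notebook or of ConX.

```python
def schedule_interp(start, end, num_steps):
    """Linearly spaced values from start to end, without accumulation."""
    if num_steps == 1:
        # Avoid dividing by zero when only one value is requested.
        return [start]
    values = []
    for i in range(num_steps):
        t = i / (num_steps - 1)  # runs from 0.0 to exactly 1.0
        values.append(start * (1 - t) + end * t)
    return values

momentums = schedule_interp(0.5, 0.95, 10)
print(momentums[0], momentums[-1])
```

The first and last values equal `start` and `end` exactly; intermediate values can still carry ordinary floating-point rounding.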
[84]:
net.dataset.split(0)
[87]:
net.reset()
[88]:
for lr, m in zip(schedule(0.001, 0.002, 10),
                 schedule(0.5, 0.95, 10)):
    net.compile(error="categorical_crossentropy", optimizer='sgd', lr=lr, momentum=m)
    net.train(100, report_rate=10, batch_size=16, accuracy=1.0, tolerance=0.4, verbose=0)
[81]:
net.train(20000, report_rate=10, batch_size=16, accuracy=1.0, tolerance=0.4)
Interrupted! Cleaning up...
========================================================
| Training | Training | Validate | Validate
Epochs | Error | Accuracy | Error | Accuracy
------ | --------- | --------- | --------- | ---------
# 1510 | 0.69089 | 0.50515 | 0.68765 | 0.56186
---------------------------------------------------------------------------
KeyboardInterrupt Traceback (most recent call last)
<ipython-input-81-c04797f683f1> in <module>()
----> 1 net.train(20000, report_rate=10, batch_size=16, accuracy=1.0, tolerance=0.4)
~/.local/lib/python3.6/site-packages/conx/network.py in train(self, epochs, accuracy, error, batch_size, report_rate, verbose, kverbose, shuffle, tolerance, class_weight, sample_weight, use_validation_to_stop, plot, record, callbacks, save)
1468 print("Saved!")
1469 if interrupted:
-> 1470 raise KeyboardInterrupt
1471 if verbose == 0:
1472 return (self.epoch_count, self.history[-1])
KeyboardInterrupt:
[18]:
net.plot_activation_map()
However, I could never get the network to learn the task. Perhaps you can find some parameters that will work.
Or, perhaps we can just make this much easier for the neural network.
3.25.1. Picture-based Approach
In this formulation, we create “images” for each input and use a convolutional layer.
[10]:
import conx as cx
import copy
We need to pick a resolution for the images. We chop up the input space into a 50 x 50 image.
[11]:
RESOLUTION = 50
[12]:
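def make_picture(res):
    matrix = [[0.0 for i in range(res)]
              for j in range(res)]
    # Mark both spirals as "other data" (0.5). The row index 1 - y is
    # zero or negative for y >= 1, so Python counts those rows from the
    # bottom of the matrix, which flips the picture vertically.
    for x,y in spiral(1):
        x = min(int(round(x * res)), res - 1)
        y = min(int(round(y * res)), res - 1)
        matrix[1 - y][x] = 0.5
    for x,y in spiral(-1):
        x = min(int(round(x * res)), res - 1)
        y = min(int(round(y * res)), res - 1)
        matrix[1 - y][x] = 0.5
    return matrix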
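To make the coordinate-to-pixel mapping in `make_picture` concrete, the sketch below isolates it in a hypothetical helper, `to_pixel`, which is not part of the notebook. It returns the (row, column) that a point lands on, writing the row as a non-negative index so the effect of the `1 - y` expression is visible.

```python
def to_pixel(x, y, res):
    """Return (row, col) of the pixel that make_picture sets for (x, y)."""
    # Scale to the grid and clamp, so x == 1.0 or y == 1.0 stays in range.
    col = min(int(round(x * res)), res - 1)
    y_index = min(int(round(y * res)), res - 1)
    # matrix[1 - y_index] uses Python's negative indexing; taking the
    # value modulo res gives the equivalent non-negative row number.
    row = (1 - y_index) % res
    return (row, col)

print(to_pixel(0.5, 0.5, 50))
print(to_pixel(1.0, 0.0, 50))
print(to_pixel(0.0, 1.0, 50))
```

Note that `(1 - y_index) % res` is two rows further down than the plain vertical flip `res - 1 - y_index`, with wraparound at the edge. Because the same expression is used everywhere in this notebook, the pictures stay consistent with one another.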
[13]:
matrix = make_picture(RESOLUTION)
[14]:
cx.array_to_image(matrix, shape=(RESOLUTION,RESOLUTION,1)).resize((400,400))
[14]:
In this example, we have three values:
- background - 0.0
- other data - 0.5
- the target - 1.0
We might be able to leave out the other data, but that seemed to make the task more difficult. We want to let the network “see” the pattern.
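The three-value encoding can be checked without ConX by building one input as plain nested lists. This is a sketch only: `to_index`, `make_input`, and `value_counts` are hypothetical helpers introduced here, and `make_picture` is rewritten compactly with the same indexing as the notebook's version.

```python
import copy
import math

def spiral_xy(i, spiral_num):
    phi = i / 16 * math.pi
    r = 6.5 * ((104 - i) / 104)
    x = (r * math.cos(phi) * spiral_num) / 13 + 0.5
    y = (r * math.sin(phi) * spiral_num) / 13 + 0.5
    return (x, y)

def spiral(spiral_num):
    return [spiral_xy(i, spiral_num) for i in range(97)]

def to_index(v, res):
    """Scale a coordinate in [0, 1] to a grid index, clamped to res - 1."""
    return min(int(round(v * res)), res - 1)

def make_picture(res):
    """Background 0.0, both spirals 0.5 (same indexing as the notebook)."""
    picture = [[0.0] * res for _ in range(res)]
    for spiral_num in (1, -1):
        for x, y in spiral(spiral_num):
            picture[1 - to_index(y, res)][to_index(x, res)] = 0.5
    return picture

def make_input(picture, x, y, res):
    """Copy the picture and mark the single target pixel with 1.0."""
    inputs = copy.deepcopy(picture)
    inputs[1 - to_index(y, res)][to_index(x, res)] = 1.0
    return inputs

def value_counts(image):
    """Count how many pixels hold each value."""
    counts = {}
    for row in image:
        for value in row:
            counts[value] = counts.get(value, 0) + 1
    return counts

res = 50
picture = make_picture(res)
x, y = spiral(1)[0]
sample = make_input(picture, x, y, res)

print(value_counts(picture))
print(value_counts(sample))
```

Each input differs from the shared picture in exactly one pixel, so the network has to use the position of that pixel relative to the surrounding spiral pattern.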
[15]:
def make_data(res):
    data = []
    for x,y in spiral(1):
        x = min(int(round(x * res)), res - 1)
        y = min(int(round(y * res)), res - 1)
        inputs = copy.deepcopy(matrix)
        inputs[1 - y][x] = 1.0
        inputs = cx.reshape(inputs,(res,res,1))
        data.append([inputs, [0, 1]])
    for x,y in spiral(-1):
        x = min(int(round(x * res)), res - 1)
        y = min(int(round(y * res)), res - 1)
        inputs = copy.deepcopy(matrix)
        inputs[1 - y][x] = 1.0
        inputs = cx.reshape(inputs,(res,res,1))
        data.append([inputs, [1, 0]])
    return data
[16]:
data = make_data(RESOLUTION)
We create the simplest form of a Conv2DLayer network:
[17]:
net = cx.Network("Two-Spirals using Pictures")
net.add(
cx.ImageLayer("input", (RESOLUTION, RESOLUTION), 1),
cx.Conv2DLayer("conv2d", 2, 4),
cx.FlattenLayer("flatten"),
cx.Layer("output", 2, activation="softmax")
)
net.connect()
net.compile(error="categorical_crossentropy", optimizer="rmsprop")
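For a rough sense of this network's size, the sketch below works through the layer shapes by hand. It assumes that the two positional arguments of `cx.Conv2DLayer("conv2d", 2, 4)` are the number of filters and the kernel size, and that the layer uses stride 1 with no padding, as Keras `Conv2D` does by default. Those are assumptions about ConX, not facts confirmed by this notebook. The helper name `conv2d_valid_output` is hypothetical.

```python
def conv2d_valid_output(size, kernel, stride=1):
    """Output width/height of a convolution with no padding."""
    return (size - kernel) // stride + 1

resolution = 50   # input images are 50 x 50
channels = 1      # one value per pixel
filters = 2       # assumed meaning of the first Conv2DLayer argument
kernel = 4        # assumed meaning of the second Conv2DLayer argument
classes = 2       # softmax output units

side = conv2d_valid_output(resolution, kernel)
flat_size = side * side * filters

# Each filter has kernel * kernel * channels weights plus one bias.
conv_params = (kernel * kernel * channels + 1) * filters
# The output layer is fully connected to the flattened feature maps.
output_params = (flat_size + 1) * classes

print(side, flat_size, conv_params, output_params)
```

Under these assumptions nearly all of the parameters sit in the output layer, and calling `net.summary()` would confirm or correct the numbers.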
[18]:
net.dataset.load(data)
[19]:
net.dashboard()
And try training it:
[20]:
net.reset()
[22]:
net.train(1000, accuracy=1.0, report_rate=10)
No training required: accuracy already to desired value
Training dataset status:
| Training | Training
Epochs | Error | Accuracy
------ | --------- | ---------
# 386 | 0.24732 | 1.00000
It worked! This makes the task easy.
Let’s take a look at the generalization capability of the network by creating images that it wasn’t trained on over the 50 x 50 space:
[23]:
def test0(x, y, res=RESOLUTION):
    x = min(int(round(x * res)), res - 1)
    y = min(int(round(y * res)), res - 1)
    inputs = copy.deepcopy(matrix)
    inputs[1 - y][x] = 1.0
    inputs = cx.reshape(inputs,(res,res,1))
    return net.propagate(inputs)[0]

def test1(x, y, res=RESOLUTION):
    x = min(int(round(x * res)), res - 1)
    y = min(int(round(y * res)), res - 1)
    inputs = copy.deepcopy(matrix)
    inputs[1 - y][x] = 1.0
    inputs = cx.reshape(inputs,(res,res,1))
    return net.propagate(inputs)[1]

cx.view([cx.heatmap(test0, format="image"), cx.heatmap(test1, format="image")],
        labels=["output[0]","output[1]"], scale=7.0)
Sometimes it creates fairly smooth spirals. However, other times, it may just “memorize” the problem, with no particular pattern. Which do you get?