3.25. Two Spirals

This notebook explores the two-spirals classification task. It is a good example of how the way a problem is presented can make it difficult for both humans and neural networks.

[1]:

import conx as cx
import math

Using TensorFlow backend.
ConX, version 3.7.4

This task involves separating two categories, A and B, where the two sets spiral around each other.

First, let’s make the dataset:

[2]:

def spiral_xy(i, spiral_num):
    """
    Create the data for a spiral.

    Arguments:
        i runs from 0 to 96
        spiral_num is 1 or -1
    """
    φ = i/16 * math.pi
    r = 6.5 * ((104 - i)/104)
    x = (r * math.cos(φ) * spiral_num)/13 + 0.5
    y = (r * math.sin(φ) * spiral_num)/13 + 0.5
    return (x, y)

def spiral(spiral_num):
    return [spiral_xy(i, spiral_num) for i in range(97)]
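As a quick sanity check on the formula, the sketch below (standard library only, with `spiral_xy` repeated so it runs on its own) confirms two properties that follow from the definition: the angle makes three full turns while the radius shrinks linearly, and spiral -1 is spiral 1 reflected through the center (0.5, 0.5). The names `a_pts`, `b_pts`, and `max_mirror_error` are introduced here for illustration only.

```python
import math

def spiral_xy(i, spiral_num):
    # Same formula as the notebook cell, with phi spelled out in ASCII.
    phi = i / 16 * math.pi
    r = 6.5 * ((104 - i) / 104)
    x = (r * math.cos(phi) * spiral_num) / 13 + 0.5
    y = (r * math.sin(phi) * spiral_num) / 13 + 0.5
    return (x, y)

a_pts = [spiral_xy(i, 1) for i in range(97)]
b_pts = [spiral_xy(i, -1) for i in range(97)]

# Angle: i = 96 gives phi = 6*pi, i.e. three full turns.
turns = (96 / 16 * math.pi) / (2 * math.pi)

# Radius (before the /13 scaling) at the first and last point.
r_first = 6.5 * (104 - 0) / 104
r_last = 6.5 * (104 - 96) / 104

# Each B point should be the matching A point reflected through (0.5, 0.5),
# so the coordinates of a matching pair should each sum to 1.
max_mirror_error = max(
    max(abs(xa + xb - 1.0), abs(ya + yb - 1.0))
    for (xa, ya), (xb, yb) in zip(a_pts, b_pts)
)

print(turns, r_first, r_last, max_mirror_error)
```

Because of the division by 13 and the 0.5 offset, every point lands inside the unit square, which is what the later picture-based cells rely on.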

[3]:

a = ["A", spiral(1)]
b = ["B", spiral(-1)]

cx.scatter([a,b])

[3]:

So, there it is: given the (x, y) coordinates of a point, can you determine whether it belongs to category A or B? This is fairly easy given the picture, but very difficult given only the coordinates.
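One way to see why the raw coordinates are awkward: along a ray from the center, the category flips every time you cross one arm of a spiral, so no single threshold on x or y separates the classes. The following sketch (standard library only; `spiral_xy` is repeated so it runs standalone, and `ray` and `labels` are names introduced here) collects the points that lie on the horizontal ray to the right of the center and lists their categories in order of increasing x.

```python
import math

def spiral_xy(i, spiral_num):
    # Same formula as the notebook cell.
    phi = i / 16 * math.pi
    r = 6.5 * ((104 - i) / 104)
    x = (r * math.cos(phi) * spiral_num) / 13 + 0.5
    y = (r * math.sin(phi) * spiral_num) / 13 + 0.5
    return (x, y)

ray = []
for label, spiral_num in (("A", 1), ("B", -1)):
    for i in range(97):
        x, y = spiral_xy(i, spiral_num)
        # Keep points on the line y = 0.5, to the right of the center.
        if abs(y - 0.5) < 1e-9 and x > 0.5 + 1e-9:
            ray.append((round(x, 4), label))

ray.sort()
labels = "".join(label for _, label in ray)
print(ray)
print(labels)  # prints ABABABA
```

The seven points alternate strictly between the two categories, which is the pattern a network fed only (x, y) has to carve out.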

Nonetheless, this was an early challenge problem for neural networks, and much research was done in order to learn the task.

Many things were tried, with various levels of success!

For an overview of the task and proposed solutions, see, for example:

https://www.researchgate.net/publication/220233514_Variations_of_the_two-spiral_task

Here is an attempt using so-called “shortcut” connections:

[4]:

net = cx.Network("Two-Spirals")
net.add(
    cx.Layer("input", 2),
    cx.Layer("hidden1", 5, activation="sigmoid"),
    cx.Layer("hidden2", 5, activation="sigmoid"),
    cx.Layer("hidden3", 5, activation="sigmoid"),
    cx.Layer("output", 2, activation="softmax")
)
net.connect("input", "hidden1")
net.connect("input", "hidden2")
net.connect("input", "hidden3")
net.connect("input", "output")
net.connect("hidden1", "hidden2")
net.connect("hidden1", "hidden3")
net.connect("hidden1", "output")
net.connect("hidden2", "hidden3")
net.connect("hidden2", "output")
net.connect("hidden3", "output")
net.build_model()

[5]:

net.summary()

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input (InputLayer)              (None, 2)            0
__________________________________________________________________________________________________
hidden1 (Dense)                 (None, 5)            15          input[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 7)            0           input[0][0]
                                                                 hidden1[0][0]
__________________________________________________________________________________________________
hidden2 (Dense)                 (None, 5)            40          concatenate_1[0][0]
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, 12)           0           input[0][0]
                                                                 hidden1[0][0]
                                                                 hidden2[0][0]
__________________________________________________________________________________________________
hidden3 (Dense)                 (None, 5)            65          concatenate_2[0][0]
__________________________________________________________________________________________________
concatenate_3 (Concatenate)     (None, 17)           0           input[0][0]
                                                                 hidden1[0][0]
                                                                 hidden2[0][0]
                                                                 hidden3[0][0]
__________________________________________________________________________________________________
output (Dense)                  (None, 2)            36          concatenate_3[0][0]
==================================================================================================
Total params: 156
Trainable params: 156
Non-trainable params: 0
__________________________________________________________________________________________________
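The parameter counts in the summary follow directly from the shortcut wiring: each dense layer sees the concatenation of every earlier layer it is connected to. As a sketch (plain Python arithmetic, with `layer_sizes` and `incoming` written out by hand from the `net.connect` calls above), the counts can be reproduced like this:

```python
# Layer widths and incoming connections, transcribed from the network definition.
layer_sizes = {"input": 2, "hidden1": 5, "hidden2": 5, "hidden3": 5, "output": 2}
incoming = {
    "hidden1": ["input"],
    "hidden2": ["input", "hidden1"],
    "hidden3": ["input", "hidden1", "hidden2"],
    "output": ["input", "hidden1", "hidden2", "hidden3"],
}

def dense_params(fan_in, units):
    # One weight per (input, unit) pair plus one bias per unit.
    return fan_in * units + units

params = {}
for name, sources in incoming.items():
    fan_in = sum(layer_sizes[s] for s in sources)  # width of the concatenation
    params[name] = dense_params(fan_in, layer_sizes[name])

total = sum(params.values())
print(params)  # {'hidden1': 15, 'hidden2': 40, 'hidden3': 65, 'output': 36}
print(total)   # 156
```

The fan-in values 7, 12, and 17 match the output shapes of the three Concatenate layers in the summary.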

[6]:

net.dashboard()

[7]:

net.dataset.load([(xy, [1, 0]) for xy in spiral(1)] +
                 [(xy, [0, 1]) for xy in spiral(-1)])

[9]:

def schedule(start, end, num_steps):
    step = (end - start) / (num_steps - 1)
    current = start
    values = []
    for i in range(num_steps):
        values.append(current)
        current += step
    return values

[10]:

schedule(0.001, 0.002, 10)

[10]:

[0.001,
 0.0011111111111111111,
 0.0012222222222222222,
 0.0013333333333333333,
 0.0014444444444444444,
 0.0015555555555555555,
 0.0016666666666666666,
 0.0017777777777777776,
 0.0018888888888888887,
 0.002]

[11]:

schedule(0.5, 0.95, 10)

[11]:

[0.5,
 0.55,
 0.6000000000000001,
 0.6500000000000001,
 0.7000000000000002,
 0.7500000000000002,
 0.8000000000000003,
 0.8500000000000003,
 0.9000000000000004,
 0.9500000000000004]
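Because `schedule` adds `step` to a running total, floating-point rounding error accumulates, which is why the last momentum value comes out slightly above 0.95. That is harmless here, but if exact endpoints matter, one possible variant computes each value from its index and pins the final value. This is only a sketch: `schedule_direct` is a hypothetical name, not part of the notebook or of ConX.

```python
def schedule_direct(start, end, num_steps):
    """Linearly spaced values from start to end, without a running sum.

    Hypothetical alternative to schedule(); also handles num_steps == 1,
    which would divide by zero in the original.
    """
    if num_steps == 1:
        return [start]
    step = (end - start) / (num_steps - 1)
    values = [start + i * step for i in range(num_steps - 1)]
    values.append(end)  # pin the last value to exactly `end`
    return values

lrs = schedule_direct(0.001, 0.002, 10)
momenta = schedule_direct(0.5, 0.95, 10)
for lr, m in zip(lrs, momenta):
    print(lr, m)
```

The result can be dropped into the same `zip(...)` loop that drives `net.compile` and `net.train`.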

[84]:

net.dataset.split(0)

[87]:

net.reset()

[88]:

for lr, m in zip(schedule(0.001, 0.002, 10),
                 schedule(0.5, 0.95, 10)):
    net.compile(error="categorical_crossentropy", optimizer='sgd', lr=lr, momentum=m)
    net.train(100, report_rate=10, batch_size=16, accuracy=1.0, tolerance=0.4, verbose=0)

[81]:

net.train(20000, report_rate=10, batch_size=16, accuracy=1.0, tolerance=0.4)

Interrupted! Cleaning up...
========================================================
       |  Training |  Training |  Validate |  Validate
Epochs |     Error |  Accuracy |     Error |  Accuracy
------ | --------- | --------- | --------- | ---------
# 1510 |   0.69089 |   0.50515 |   0.68765 |   0.56186

---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
<ipython-input-81-c04797f683f1> in <module>()
----> 1 net.train(20000, report_rate=10, batch_size=16, accuracy=1.0, tolerance=0.4)

~/.local/lib/python3.6/site-packages/conx/network.py in train(self, epochs, accuracy, error, batch_size, report_rate, verbose, kverbose, shuffle, tolerance, class_weight, sample_weight, use_validation_to_stop, plot, record, callbacks, save)
   1468                 print("Saved!")
   1469         if interrupted:
-> 1470             raise KeyboardInterrupt
   1471         if verbose == 0:
   1472             return (self.epoch_count, self.history[-1])

KeyboardInterrupt:

[18]:

net.plot_activation_map()

However, I could never get the network to learn the task this way. Perhaps you can find some parameters that will work.

Or, perhaps we can just make this much easier for the neural network.

3.25.1. Picture-based Approach

In this formulation, we create “images” for each input and use a Convolutional layer.

[10]:

import conx as cx
import copy

We need to pick a resolution for the images. We chop up the input space into a 50 x 50 grid of pixels.

[11]:

RESOLUTION = 50

[12]:

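def make_picture(res):
    # Start from an all-background (0.0) res x res grid.
    matrix = [[0.0 for i in range(res)]
              for j in range(res)]
    # Mark every point of both spirals with the "other data" value, 0.5.
    for spiral_num in (1, -1):
        for x, y in spiral(spiral_num):
            # Scale to pixel indices, clamping 1.0 to the last pixel.
            x = min(int(round(x * res)), res - 1)
            y = min(int(round(y * res)), res - 1)
            # 1 - y is negative for y >= 2, so Python counts rows from the
            # end of the list, which flips the picture vertically.
            matrix[1 - y][x] = 0.5
    return matrix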
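The coordinate-to-pixel mapping used in `make_picture` can be isolated into a small helper to make the clamping and the vertical flip easier to inspect. The sketch below is illustrative only: `to_pixel` is a hypothetical name, and it reports the row as a non-negative index equivalent to the `matrix[1 - y]` lookup.

```python
def to_pixel(x, y, res):
    """Map a point in the unit square to (row, col), mirroring make_picture."""
    col = min(int(round(x * res)), res - 1)    # clamp x == 1.0 to the last column
    y_idx = min(int(round(y * res)), res - 1)
    row = (1 - y_idx) % res                    # same row that matrix[1 - y_idx] selects
    return row, col

print(to_pixel(1.0, 0.5, 50))  # (26, 49): right edge, clamped from column 50
print(to_pixel(0.5, 0.9, 50))  # (6, 25): large y lands near the top
print(to_pixel(0.5, 0.1, 50))  # (46, 25): small y lands near the bottom
```

Note that for a y index of 2 or more the selected row is `res + 1 - y`, which is two rows lower than the conventional flip `res - 1 - y`. Since every cell in the notebook uses the same `1 - y` expression, the pictures stay consistent with one another.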

[13]:

matrix = make_picture(RESOLUTION)

[14]:

cx.array_to_image(matrix, shape=(RESOLUTION,RESOLUTION,1)).resize((400,400))

[14]:

In this example, we have three values:

background - 0.0
other data - 0.5
the target - 1.0

We might be able to leave out the other data, but that seemed to make the task more difficult. We want to let the network “see” the pattern.

[15]:

def make_data(res):
    # Relies on the global `matrix` built by make_picture(res).
    data = []
    # One image per point on spiral 1: the point itself is set to 1.0.
    for x,y in spiral(1):
        x = min(int(round(x * res)), res - 1)
        y = min(int(round(y * res)), res - 1)
        inputs = copy.deepcopy(matrix)
        inputs[1 - y][x] = 1.0
        inputs = cx.reshape(inputs,(res,res,1))
        data.append([inputs, [0, 1]])
    # Same for spiral -1, with the other one-hot target.
    for x,y in spiral(-1):
        x = min(int(round(x * res)), res - 1)
        y = min(int(round(y * res)), res - 1)
        inputs = copy.deepcopy(matrix)
        inputs[1 - y][x] = 1.0
        inputs = cx.reshape(inputs,(res,res,1))
        data.append([inputs, [1, 0]])
    return data
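To make the encoding concrete, the sketch below rebuilds the same kind of images in plain Python (no ConX, no reshaping) and checks that every example differs from the shared background in exactly one pixel: the target, set to 1.0. The names `to_index`, `background`, `examples`, and the string labels are introduced here for illustration; the notebook itself uses one-hot targets.

```python
import copy
import math

def spiral_xy(i, spiral_num):
    # Same formula as the notebook cell.
    phi = i / 16 * math.pi
    r = 6.5 * ((104 - i) / 104)
    x = (r * math.cos(phi) * spiral_num) / 13 + 0.5
    y = (r * math.sin(phi) * spiral_num) / 13 + 0.5
    return (x, y)

def to_index(v, res):
    # Scale a unit-interval coordinate to a pixel index, clamped to res - 1.
    return min(int(round(v * res)), res - 1)

RES = 50
points = [(spiral_xy(i, num), label)
          for num, label in ((1, "A"), (-1, "B"))
          for i in range(97)]

# Shared background: both spirals drawn at 0.5 on a 0.0 grid.
background = [[0.0] * RES for _ in range(RES)]
for (x, y), _ in points:
    background[1 - to_index(y, RES)][to_index(x, RES)] = 0.5

# One example per point: a copy of the background with that point raised to 1.0.
examples = []
for (x, y), label in points:
    image = copy.deepcopy(background)
    image[1 - to_index(y, RES)][to_index(x, RES)] = 1.0
    examples.append((image, label))

def count(image, value):
    return sum(row.count(value) for row in image)

targets_per_image = {count(image, 1.0) for image, _ in examples}
print(len(examples), targets_per_image)  # 194 {1}
```

So the network's job reduces to deciding which of the two drawn arms the single bright pixel sits on.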

[16]:

data = make_data(RESOLUTION)

We create the simplest form of a Conv2DLayer network:

[17]:

net = cx.Network("Two-Spirals using Pictures")
net.add(
    cx.ImageLayer("input", (RESOLUTION, RESOLUTION), 1),
    cx.Conv2DLayer("conv2d", 2, 4),
    cx.FlattenLayer("flatten"),
    cx.Layer("output", 2, activation="softmax")
)
net.connect()
net.compile(error="categorical_crossentropy", optimizer="rmsprop")
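For a sense of scale, the layer sizes of this network can be worked out by hand. The sketch below assumes that the two positional arguments of `cx.Conv2DLayer` are the number of filters (2) and the kernel size (4 x 4), as in Keras `Conv2D`, and that the Keras defaults of stride 1 and "valid" padding apply. The notebook does not show a summary for this network, so treat the numbers as an estimate under those assumptions.

```python
def conv_output_size(size, kernel, stride=1):
    # "valid" padding: the kernel must fit entirely inside the image.
    return (size - kernel) // stride + 1

resolution, channels = 50, 1   # input image: 50 x 50 x 1
filters, kernel = 2, 4         # assumed meaning of Conv2DLayer("conv2d", 2, 4)
classes = 2                    # softmax output units

side = conv_output_size(resolution, kernel)                    # 47
conv_params = kernel * kernel * channels * filters + filters   # 34
flat = side * side * filters                                   # 4418
output_params = flat * classes + classes                       # 8838

print(side, conv_params, flat, output_params, conv_params + output_params)
```

Under these assumptions, nearly all of the weights sit in the final dense layer rather than in the convolution itself.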

[18]:

net.dataset.load(data)

[19]:

net.dashboard()

And try training it:

[20]:

net.reset()

[22]:

net.train(1000, accuracy=1.0, report_rate=10)

No training required: accuracy already to desired value
Training dataset status:
       |  Training |  Training
Epochs |     Error |  Accuracy
------ | --------- | ---------
#  386 |   0.24732 |   1.00000

It worked! This makes the task easy.

Let’s take a look at the generalization capability of the network by placing the target pixel at positions it wasn’t trained on, across the whole 50 x 50 space:

[23]:

def propagate_at(x, y, res=RESOLUTION):
    # Build the image with the target pixel at (x, y) and run the network.
    x = min(int(round(x * res)), res - 1)
    y = min(int(round(y * res)), res - 1)
    inputs = copy.deepcopy(matrix)
    inputs[1 - y][x] = 1.0
    inputs = cx.reshape(inputs,(res,res,1))
    return net.propagate(inputs)

def test0(x, y, res=RESOLUTION):
    # Activation of output unit 0.
    return propagate_at(x, y, res)[0]

def test1(x, y, res=RESOLUTION):
    # Activation of output unit 1.
    return propagate_at(x, y, res)[1]

cx.view([cx.heatmap(test0, format="image"), cx.heatmap(test1, format="image")],
        labels=["output[0]","output[1]"], scale=7.0)

Sometimes it creates fairly smooth spirals. However, other times, it may just “memorize” the problem, with no particular pattern. Which do you get?