Getting Started with conx

What is conx?

conx is an accessible and powerful way to build and understand deep learning neural networks. Specifically, it sits on top of Keras, which sits on top of Theano, TensorFlow, or CNTK.

conx:

  • has an easy to use interface for creating connections between layers of a neural network
  • adds additional functionality for manipulating neural networks
  • supports visualizations and analysis for training and using neural networks
  • has everything you need; doesn’t require knowledge of complicated numerical or plotting libraries
  • integrates with the lower-level Keras API if you wish

But rather than attempting to explain each of these points, let’s demonstrate them.

This demonstration is being run in a Jupyter Notebook. conx doesn’t require running in the notebook, but if you do, you will be able to use the visualizations and dashboard.

A Simple Network

As a demonstration, let’s build a simple network for learning the XOR (exclusive or) truth table. XOR is defined as:

Input    Output
0, 0     0
0, 1     1
1, 0     1
1, 1     0
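
If it helps to see the table as code, here is a minimal plain-Python sketch (not conx code) that reproduces it:

def xor(a, b):
    # XOR is true exactly when the two inputs differ
    return int(a != b)

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", xor(a, b))   # prints the truth table above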

Step 1: import conx

We will need the Network and Layer classes from the conx module:

In [1]:
from conx import Network, Layer
conx, version 3.3.3
Using TensorFlow backend.

Step 2: create the network

Every network needs a name:

In [2]:
net = Network("XOR Network")

Step 3: add the needed layers

Every layer needs a name and a size. We add each of the layers of our network. The first layer will be an “input” layer (named “input1”). We only need to specify the size. For our XOR problem, there are two inputs:

In [3]:
net.add(Layer("input1", 2))

For the next layers, we will also use the default layer type (a fully-connected Dense layer) for both the hidden and output layers. However, we now need to specify the activation function, the function applied to a layer’s “net inputs” after the matrix multiplication. We have a few choices of activation function:

  • ‘relu’
  • ‘sigmoid’
  • ‘linear’
  • ‘softmax’
  • ‘tanh’
  • ‘elu’
  • ‘selu’
  • ‘softplus’
  • ‘softsign’
  • ‘hard_sigmoid’

You can try any of these. “relu” is short for Rectified Linear Unit and is a generally useful choice for hidden layer activations. Likewise, the sigmoid function is generally useful as an output layer activation. We’ll use those, respectively, but you can experiment.
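
For reference, here is a minimal sketch (plain NumPy, not conx code) of what these two functions compute:

import numpy as np

def relu(x):
    return np.maximum(0, x)        # passes positive values through, clips negatives to 0

def sigmoid(x):
    return 1 / (1 + np.exp(-x))    # squashes any value into the range (0, 1)

print(relu(np.array([-2.0, 0.0, 3.0])))   # [0. 0. 3.]
print(sigmoid(np.array([0.0])))           # [0.5]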

In [4]:
net.add(Layer("hidden1", 5, activation="relu"))
net.add(Layer("output1", 1, activation="sigmoid"))

Step 4: connect the layers

We connect up the layers as needed. This is a simple 3-layer network:

In [5]:
net.connect("input1", "hidden1")
net.connect("hidden1", "output1")

Note:

We use the term layer here because each of these items makes up an entire layer by itself. In general, though, a layer can be composed of many such items; in that case, we call such a layer a bank.

Step 5: compile the network

Before we can do this step, we need to do two things:

  1. tell the network how to compute the error between the targets and the actual outputs
  2. tell the network how to adjust the weights when learning

Error (or loss)

The first option is called the error (or loss). There are many choices for the error function, and we’ll dive into each later. For now, we’ll just briefly mention them:

  • “mse” - mean squared error
  • “mae” - mean absolute error
  • “mape” - mean absolute percentage error
  • “msle” - mean squared logarithmic error
  • “kld” - Kullback-Leibler divergence
  • “cosine” - cosine proximity
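
As an example, “mse” (the error we pick below) averages the squared difference between targets and outputs. A minimal sketch (plain NumPy, not conx code):

import numpy as np

def mse(targets, outputs):
    targets, outputs = np.asarray(targets), np.asarray(outputs)
    return np.mean((targets - outputs) ** 2)   # average squared difference

print(mse([0, 1, 1, 0], [0.1, 0.9, 0.9, 0.1]))  # 0.01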

Optimizer

The second option is called “optimizer”. Again, there are many choices, but we just briefly name them here:

  • “sgd” - Stochastic gradient descent optimizer
  • “rmsprop” - RMSProp optimizer
  • “adagrad” - Adagrad optimizer
  • “adadelta” - Adadelta optimizer
  • “adam” - Adam optimizer
  • “adamax” - Adamax optimizer, a variant of Adam
  • “nadam” - Nesterov Adam optimizer
  • “tfoptimizer” - a native TensorFlow optimizer
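
All of these optimizers adjust the weights in whatever direction reduces the error; they differ in how the step size is scaled and adapted. For intuition, here is a minimal sketch of a single plain gradient-descent step (not conx code; conx and Keras handle this for you):

learning_rate = 0.1
weight = 0.5
gradient = 0.2                       # d(error)/d(weight), computed by backpropagation

weight -= learning_rate * gradient   # step downhill on the error surface
print(weight)                        # prints the updated weight (approximately 0.48)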

For now, we’ll just pick “mse” for the error function, and “adam” for the optimizer.

And we compile the network:

In [6]:
net.compile(error="mse", optimizer="adam")

Option: visualize the network

At this point in the steps, you can see a visual representation of the network simply by evaluating it:

In [7]:
net
Out[7]:
[Network diagram for “XOR Network”] Layer: output1 (output), shape (1,), Keras class Dense, activation sigmoid; weights from hidden1 to output1: kernel (5, 1), bias (1,). Layer: hidden1 (hidden), shape (5,), Keras class Dense, activation relu; weights from input1 to hidden1: kernel (2, 5), bias (5,). Layer: input1 (input), shape (2,), Keras class Input.

This is useful to see the layers and connections.

In [8]:
net.propagate([1, 0])
Out[8]:
[0.42384958267211914]

Propagating an input through the network will show colored squares on the layers in the network image above. We can try any input vector:

In [9]:
net.propagate([0, 0])
Out[9]:
[0.5]

In these visualizations, the more red a unit is, the more negative its value, and the more black, the more positive. Values close to zero will appear white.

Interestingly, if you propagate this network with all zeros, it will show only white activations, meaning that there is no activation at any node in the network. This is because, with zero inputs, every weighted sum is zero, and the biases are initialized to zero as well.
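
A hedged sketch of that reasoning (plain NumPy, not conx code), showing why an all-zero input gives zero hidden activations when the biases start at zero:

import numpy as np

W = np.random.uniform(-1, 1, size=(2, 5))   # any weight values will do
b = np.zeros(5)                             # biases start at zero
x = np.array([0.0, 0.0])                    # the all-zero input

net_input = x @ W + b                       # all zeros, regardless of W
hidden = np.maximum(0, net_input)           # relu(0) = 0 for every unit
print(hidden)                               # [0. 0. 0. 0. 0.]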

Below, we propagate small, positive values, which appear as light gray. In general, activations may appear reddish (negative) or grayish (positive).

In [10]:
net
Out[10]:
[Network diagram, same structure as above: input1 (2, Input) → hidden1 (5, Dense, relu) → output1 (1, Dense, sigmoid)]

The dashboard

The dashboard allows you to interact, test, and generally work with your network via a GUI.

In [11]:
net.dashboard()

Step 6: setup the training data

For this little experiment, we want to train the network on our table from above. To do that, we add the inputs and the targets to the dataset, one at a time:

In [12]:
net.dataset.add([0, 0], [0])
net.dataset.add([0, 1], [1])
net.dataset.add([1, 0], [1])
net.dataset.add([1, 1], [0])
In [13]:
net.dataset.summary()
Input Summary:
   count  : 4 (4 for training, 0 for testing)
   shape  : (2,)
   range  : (0.0, 1.0)
Target Summary:
   count  : 4 (4 for training, 0 for testing)
   shape  : (1,)
   range  : (0.0, 1.0)

Step 7: train the network

In [14]:
net.train(epochs=2000, accuracy=1.0, report_rate=100)
Training...
Epoch #    0 | train error 0.23861 | train accuracy 0.00000
Epoch #  100 | train error 0.22150 | train accuracy 0.00000
Epoch #  200 | train error 0.20427 | train accuracy 0.00000
Epoch #  300 | train error 0.18981 | train accuracy 0.00000
Epoch #  400 | train error 0.17665 | train accuracy 0.00000
Epoch #  500 | train error 0.16121 | train accuracy 0.00000
Epoch #  600 | train error 0.14416 | train accuracy 0.00000
Epoch #  700 | train error 0.12708 | train accuracy 0.00000
Epoch #  800 | train error 0.10931 | train accuracy 0.00000
Epoch #  900 | train error 0.08762 | train accuracy 0.00000
Epoch # 1000 | train error 0.07012 | train accuracy 0.00000
Epoch # 1100 | train error 0.05609 | train accuracy 0.00000
Epoch # 1200 | train error 0.04508 | train accuracy 0.00000
Epoch # 1300 | train error 0.03653 | train accuracy 0.00000
Epoch # 1400 | train error 0.02990 | train accuracy 0.00000
Epoch # 1500 | train error 0.02474 | train accuracy 0.00000
Epoch # 1600 | train error 0.02068 | train accuracy 0.00000
Epoch # 1700 | train error 0.01747 | train accuracy 0.00000
Epoch # 1800 | train error 0.01489 | train accuracy 0.25000
Epoch # 1900 | train error 0.01287 | train accuracy 0.75000
========================================================================
Epoch # 2000 | train error 0.01126 | train accuracy 0.75000

Perhaps the network learned none, some, or all of the patterns. You can reset the network and try again (retrain), or continue with the following steps.

In [19]:
net.reset()
net.retrain(epochs=10000)
Training...
Epoch #    0 | train error 0.22751 | train accuracy 0.00000
Epoch #  100 | train error 0.19333 | train accuracy 0.00000
Epoch #  200 | train error 0.16052 | train accuracy 0.00000
Epoch #  300 | train error 0.13502 | train accuracy 0.00000
Epoch #  400 | train error 0.11283 | train accuracy 0.00000
Epoch #  500 | train error 0.09434 | train accuracy 0.00000
Epoch #  600 | train error 0.07934 | train accuracy 0.00000
Epoch #  700 | train error 0.06732 | train accuracy 0.00000
Epoch #  800 | train error 0.05761 | train accuracy 0.00000
Epoch #  900 | train error 0.04965 | train accuracy 0.00000
Epoch # 1000 | train error 0.04311 | train accuracy 0.00000
Epoch # 1100 | train error 0.03767 | train accuracy 0.25000
Epoch # 1200 | train error 0.03314 | train accuracy 0.25000
Epoch # 1300 | train error 0.02932 | train accuracy 0.25000
Epoch # 1400 | train error 0.02606 | train accuracy 0.50000
Epoch # 1500 | train error 0.02326 | train accuracy 0.50000
Epoch # 1600 | train error 0.02085 | train accuracy 0.50000
Epoch # 1700 | train error 0.01877 | train accuracy 0.50000
Epoch # 1800 | train error 0.01695 | train accuracy 0.50000
Epoch # 1900 | train error 0.01535 | train accuracy 0.50000
Epoch # 2000 | train error 0.01395 | train accuracy 0.50000
Epoch # 2100 | train error 0.01271 | train accuracy 0.50000
Epoch # 2200 | train error 0.01162 | train accuracy 0.50000
Epoch # 2300 | train error 0.01064 | train accuracy 0.50000
Epoch # 2400 | train error 0.00977 | train accuracy 0.50000
Epoch # 2500 | train error 0.00899 | train accuracy 0.50000
Epoch # 2600 | train error 0.00828 | train accuracy 0.50000
Epoch # 2700 | train error 0.00765 | train accuracy 0.50000
Epoch # 2800 | train error 0.00707 | train accuracy 0.50000
Epoch # 2900 | train error 0.00655 | train accuracy 0.50000
Epoch # 3000 | train error 0.00608 | train accuracy 0.50000
Epoch # 3100 | train error 0.00565 | train accuracy 0.50000
========================================================================
Epoch # 3132 | train error 0.00552 | train accuracy 1.00000

Step 8: test the network

In [20]:
net.test()
Testing entire dataset with tolerance=0.1...
# | inputs | targets | outputs | result
---------------------------------------
0 | [0.00,0.00] | [0.00] | [0.10] | correct
1 | [0.00,1.00] | [1.00] | [0.96] | correct
2 | [1.00,0.00] | [1.00] | [0.97] | correct
3 | [1.00,1.00] | [0.00] | [0.10] | correct
Total count: 4
      correct: 4
      incorrect: 0
Total percentage correct: 1.0
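
The “correct” column is presumably decided by the tolerance shown above: a pattern counts as correct when every output is within the tolerance of its target. A hedged sketch of that check (not conx code; the actual rule conx uses may differ slightly):

def is_correct(targets, outputs, tolerance=0.1):
    return all(abs(t - o) <= tolerance for t, o in zip(targets, outputs))

print(is_correct([0.0], [0.098]))   # True  - within the tolerance
print(is_correct([1.0], [0.75]))    # False - off by more than the tolerance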

To see all of these activations flow through the network diagram above, you can run the following:

In [21]:
for pattern in net.dataset.inputs:
    net.propagate(pattern)

conx options

Propagation enhancements

There are five ways to propagate activations through the network:

  • Network.propagate(inputs) - propagate these inputs through the network
  • Network.propagate_to(bank-name, inputs) - propagate these inputs to the named bank (gets that bank’s encoding)
  • Network.propagate_from(bank-name, activations) - propagate the activations from bank-name to outputs
  • Network.propagate_to_image(bank-name, inputs, scale=SCALE) - like propagate_to, but returns the bank’s activations as an image
  • Network.propagate_to_features(bank-name, activations, scale=SCALE)

Note:

All of the propagate methods will visualize their activations in any non-snapshot network image in the notebook.

In [22]:
net.propagate_from("hidden1", [0, 1, 0, 0, 1])
Out[22]:
[0.31161341]
In [23]:
net.propagate_to("hidden1", [0.5, 0.5])
Out[23]:
[0.0, 0.0, 0.0, 0.0, 0.0]

There is also propagate_to_image(), which takes a bank name and inputs, and returns an image (which you can resize):

In [25]:
net.propagate_to_image("hidden1", [0.5, 0.5]).resize((500, 100))
Out[25]:
_images/Getting_Started_with_conx_41_0.png

Plotting options

You can plot the following values from the training history:

  • “loss” - error measure (e.g., “mse”, mean squared error)
  • “acc” - the accuracy of the training set

You can plot any subset of the above on the same plot:

In [28]:
net.plot(["acc", "loss"])
_images/Getting_Started_with_conx_43_0.png

You can also see the activations at a particular unit, given a range of input values for two input units. Since this network has only two inputs and one output, we can see the entire input and output ranges:

In [29]:
net["input1"].minmax = (0, 1)
net.propagate_to_plot(input_layer="input1", input_index1=0, input_index2=1,
                      output_layer="output1", output_index=0, resolution=0.05);
_images/Getting_Started_with_conx_45_0.png