Using the Conx library to create simple neural networks
--------------------------------------------------------------------------

Let's train a simple neural network to solve the AND task:

  Input    Output (AND)
   0 0         0
   0 1         0
   1 0         0
   1 1         1

First, we'll write a program to set up an appropriate network...

#--------------------------------------------------------------------------
# File: net.py

# load in Conx support for neural networks
from newConx import *

# create a basic feedforward backpropagation network
n = BackpropNetwork()

# add layers in the order they will be connected
n.addLayer('input', 2)        # input layer has two units
n.addLayer('output', 1)       # output layer has one unit
n.connect('input', 'output')  # connect the layers together

# learning rate
n.setEpsilon(0.5)

# how often the network reports its total error during training
n.setReportRate(1)

# how close an output value has to be to the target to count as correct
n.setTolerance(0.1)

# specify the dataset to use for learning AND
n.setInputs([[0, 0], [0, 1], [1, 0], [1, 1]])
n.setTargets([[0], [0], [0], [1]])

print "Network is set up"
#--------------------------------------------------------------------------

Now we'll train the network on the dataset by just calling train...

% idle net.py
Network is set up
>>> n.train()
Epoch #    1 | TSS Error: 1.0132 | Correct = 0.0000
Epoch #    2 | TSS Error: 0.8453 | Correct = 0.0000
Epoch #    3 | TSS Error: 0.7362 | Correct = 0.0000
. . .
----------------------------------------------------
Final #   45 | TSS Error: 0.0245 | Correct = 1.0000
----------------------------------------------------

After training, we can use propagate to test out individual inputs:

>>> n.propagate(input=[0, 0])
[0.00092551928227623556]
>>> n.propagate(input=[0, 1])
[0.086001671392928164]
>>> n.propagate(input=[1, 0])
[0.084881557981294223]
>>> n.propagate(input=[1, 1])
[0.90404249930777647]
>>>

The trained network produces outputs that correspond closely to the
training targets.  We can run through the entire dataset interactively
like this:

>>> n.showPerformance()

To see the numerical weight values from the input layer to the output
layer:

>>> n.printWeights('input', 'output')

To make it easier to train the network on different datasets, we put the
training data in the files inputs.dat and and-targets.dat, and use the
methods loadInputsFromFile and loadTargetsFromFile instead of setInputs
and setTargets...

#--------------------------------------------------------------------------
# File: net.py

from newConx import *

n = BackpropNetwork()
n.addLayer('input', 2)
n.addLayer('output', 1)
n.connect('input', 'output')

n.setEpsilon(0.5)
n.setReportRate(1)
n.setTolerance(0.1)

# changed these lines
n.loadInputsFromFile("inputs.dat")
n.loadTargetsFromFile("and-targets.dat")

print "Network is set up"
#--------------------------------------------------------------------------

The OR task is similar to AND, and our network can learn it just as
easily:

  Input    Output (OR)
   0 0         0
   0 1         1
   1 0         1
   1 1         1

To learn OR, we only need to change the targets file:

n.loadTargetsFromFile("or-targets.dat")

>>> n.train()
Epoch #    1 | TSS Error: 1.0218 | Correct = 0.0000
Epoch #    2 | TSS Error: 0.6437 | Correct = 0.0000
Epoch #    3 | TSS Error: 0.6014 | Correct = 0.5000
. . .
----------------------------------------------------
Final #   35 | TSS Error: 0.0151 | Correct = 1.0000
----------------------------------------------------
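Under the hood, the single output unit in these networks computes a
weighted sum of its inputs plus a bias, and squashes the result with the
logistic (sigmoid) function.  The sketch below is plain Python, separate
from Conx; the weight and bias values are made up for illustration (the
values your network actually learns can be inspected with printWeights),
but they behave like a trained AND network:

# Minimal sketch of what a single sigmoid output unit computes.
# The weights and bias are hypothetical illustrative values, not the ones
# learned by the network above.

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

w1, w2, bias = 5.0, 5.0, -7.5    # made-up weights that implement AND

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    output = sigmoid(w1*x1 + w2*x2 + bias)
    print "%d %d -> %.3f" % (x1, x2, output)

A single unit like this can only separate its inputs with a straight line
through the input space, which is why it can handle AND and OR but not
XOR.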
The XOR task is harder.  Our network cannot learn this task using only
two layers of units.

  Input    Output (XOR)
   0 0         0
   0 1         1
   1 0         1
   1 1         0

Again, we change the targets file:

n.loadTargetsFromFile("xor-targets.dat")

>>> n.train()
Epoch #    1 | TSS Error: 1.1188 | Correct = 0.0000
Epoch #    2 | TSS Error: 1.1276 | Correct = 0.0000
Epoch #    3 | TSS Error: 1.1240 | Correct = 0.0000
. . .
Epoch # 4818 | TSS Error: 1.0918 | Correct = 0.0000
Epoch # 4819 | TSS Error: 1.2915 | Correct = 0.0000
Epoch # 4820 | TSS Error: 1.2618 | Correct = 0.0000
. . .
(interrupt by typing Control-C)

In order to learn XOR, we need to add an extra layer of units to our
network, called the hidden layer:

n.addLayer('input', 2)
n.addLayer('hidden', 2)
n.addLayer('output', 1)
n.connect('input', 'hidden')
n.connect('hidden', 'output')

>>> n.train()
Epoch #    1 | TSS Error: 1.1846 | Correct = 0.0000
Epoch #    2 | TSS Error: 1.1185 | Correct = 0.0000
Epoch #    3 | TSS Error: 1.0997 | Correct = 0.0000
. . .
----------------------------------------------------
Final #  175 | TSS Error: 0.0308 | Correct = 1.0000
----------------------------------------------------

We can use showPerformance as before to examine the behavior of the
network on the input patterns.  The hidden layer activation patterns
corresponding to each input pattern are shown below:

  Input    Hidden
   0 0     0.74 0.00
   0 1     0.04 0.08
   1 0     0.04 0.08
   1 1     0.00 0.86

This shows that the network has learned a new, internal hidden
representation of the input patterns in order to solve the task.  Notice,
for example, that the inputs 0 1 and 1 0, which must both produce an
output of 1, get mapped to the same hidden pattern.

#--------------------------------------------------------------------------

Now let's try an auto-association task, where the network simply learns
to reproduce the input patterns on the output layer, using a smaller
hidden layer in the middle...

Here is the file auto-inputs.dat:

1 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0
0 0 1 0 0 0 0 0
0 0 0 1 0 0 0 0
0 0 0 0 1 0 0 0
0 0 0 0 0 1 0 0
0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 1

The dataset consists of eight unique patterns.  We will use a hidden
layer with three units, which will force the network to learn to encode
the eight input patterns using a three-dimensional space of hidden layer
patterns...

n.addLayer('input', 8)
n.addLayer('hidden', 3)
n.addLayer('output', 8)
n.connect('input', 'hidden')
n.connect('hidden', 'output')

n.loadInputsFromFile("auto-inputs.dat")
n.loadTargetsFromFile("auto-inputs.dat")

We can also view the activations and weight values graphically:

n.showActivations('input', scale=40)
n.showActivations('hidden', scale=40)
n.showActivations('output', scale=40)
n.showWeights('hidden', scale=30)

>>> n.train()
Epoch #    1 | TSS Error: 11.4415 | Correct = 0.0156
Epoch #    2 | TSS Error:  7.4859 | Correct = 0.7969
Epoch #    3 | TSS Error:  7.5817 | Correct = 0.8750
. . .
----------------------------------------------------
Final #  177 | TSS Error:  0.0785 | Correct = 1.0000
----------------------------------------------------

To verify that the network has learned the task:

>>> n.showPerformance()

We can save the hidden layer patterns generated by the inputs in a file:

>>> n.saveHiddenReps('auto')

This creates a log file called 'auto.hiddens' and performs one sweep
through the dataset, recording each hidden layer activation pattern in
the log file.  The resulting file looks like this:

0.035725 0.000000 0.309874
0.001861 0.332286 0.999953
0.965465 0.966408 0.000007
0.999960 0.002240 0.000715
0.995763 0.000063 0.995603
0.000233 0.522564 0.000000
0.995154 0.999652 0.998858
0.000038 0.999914 0.555374

These hidden representations correspond to points in a three-dimensional
space, since each pattern is a tuple of three values.
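Since the log file is just whitespace-separated numbers, one hidden
pattern per line, it is also easy to read the points back into Python if
you want to examine them directly.  A minimal sketch (the filename
'auto.hiddens' comes from the saveHiddenReps call above; the rest is
ordinary Python):

# Read the hidden-layer patterns back in as a list of (x, y, z) points.
points = []
for line in open("auto.hiddens"):
    points.append(tuple([float(v) for v in line.split()]))

for i, point in enumerate(points):
    print "input pattern %d -> hidden point %s" % (i + 1, point)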
We can easily graph these eight points using the Linux program gnuplot:

% gnuplot
gnuplot> splot "auto.hiddens" notitle

Dragging the plot with the left mouse button rotates it; dragging
left/right with the middle mouse button zooms the plot in or out; and
dragging up/down with the middle mouse button stretches or squeezes the
vertical axis.

This way, we can visualize the structure of the hidden representations
learned by the network, at least in the case of three-dimensional
patterns.  The concept extends to higher-dimensional patterns in a
natural way, although visualizing them is more difficult.
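As an alternative to gnuplot, the same eight points can also be plotted
from within Python.  The sketch below is only a suggestion and assumes
the matplotlib library is installed; it is not part of Conx:

# Plot the hidden-layer patterns as points in 3D space with matplotlib.
from mpl_toolkits.mplot3d import Axes3D   # enables the '3d' projection
import matplotlib.pyplot as plt

xs, ys, zs = [], [], []
for line in open("auto.hiddens"):
    x, y, z = [float(v) for v in line.split()]
    xs.append(x)
    ys.append(y)
    zs.append(z)

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(xs, ys, zs)
ax.set_xlabel('hidden unit 1')
ax.set_ylabel('hidden unit 2')
ax.set_zlabel('hidden unit 3')
plt.show()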