Using the Conx library to create simple neural networks
--------------------------------------------------------------------------

Let's train a simple neural network to solve the AND task:

  Input    Output (AND)
   0 0         0
   0 1         0
   1 0         0
   1 1         1

First, we'll write a program to set up an appropriate network...

#--------------------------------------------------------------------------
# File: net.py

# load in Conx support for neural networks
from newConx import *

# create a basic feedforward backpropagation network
n = BackpropNetwork()

# add layers in the order they will be connected
n.addLayer('input', 2)        # input layer has two units
n.addLayer('output', 1)       # output layer has one unit
n.connect('input', 'output')  # connect the layers together

# learning rate
n.setEpsilon(0.5)

# how often the network reports its total error during training
n.setReportRate(1)

# how close an output value has to be to the target to count as correct
n.setTolerance(0.1)

# specify the dataset to use for learning AND
n.setInputs([[0, 0], [0, 1], [1, 0], [1, 1]])
n.setTargets([[0], [0], [0], [1]])

print "Network is set up"
#--------------------------------------------------------------------------

Now we'll train the network on the dataset by just calling train...

% idle net.py
Network is set up
>>> n.train()
Epoch #    1 | TSS Error: 1.0132 | Correct = 0.0000
Epoch #    2 | TSS Error: 0.8453 | Correct = 0.0000
Epoch #    3 | TSS Error: 0.7362 | Correct = 0.0000
. . .
----------------------------------------------------
Final #   45 | TSS Error: 0.0245 | Correct = 1.0000
----------------------------------------------------

After training, we can use propagate to test out individual inputs:

>>> n.propagate(input=[0, 0])
[0.00092551928227623556]
>>> n.propagate(input=[0, 1])
[0.086001671392928164]
>>> n.propagate(input=[1, 0])
[0.084881557981294223]
>>> n.propagate(input=[1, 1])
[0.90404249930777647]
>>>

The trained network produces outputs that correspond closely to the
training targets.  We can run through the entire dataset interactively
like this:

>>> n.showPerformance()

To see the numerical weight values from the input layer to the output
layer:

>>> n.printWeights('input', 'output')

To make it easier to train the network on different datasets, we put the
training data in the files inputs.dat and and-targets.dat, and use the
methods loadInputsFromFile and loadTargetsFromFile instead of setInputs
and setTargets...

#--------------------------------------------------------------------------
# File: net.py

from newConx import *

n = BackpropNetwork()
n.addLayer('input', 2)
n.addLayer('output', 1)
n.connect('input', 'output')

n.setEpsilon(0.5)
n.setReportRate(1)
n.setTolerance(0.1)

# changed these lines
n.loadInputsFromFile("inputs.dat")
n.loadTargetsFromFile("and-targets.dat")

print "Network is set up"
#--------------------------------------------------------------------------

The OR task is similar to AND, and our network can learn it just as
easily:

  Input    Output (OR)
   0 0         0
   0 1         1
   1 0         1
   1 1         1

To learn OR, we only need to change the targets file:

n.loadTargetsFromFile("or-targets.dat")

>>> n.train()
Epoch #    1 | TSS Error: 1.0218 | Correct = 0.0000
Epoch #    2 | TSS Error: 0.6437 | Correct = 0.0000
Epoch #    3 | TSS Error: 0.6014 | Correct = 0.5000
. . .
----------------------------------------------------
Final #   35 | TSS Error: 0.0151 | Correct = 1.0000
----------------------------------------------------
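Under the hood, the single output unit in these networks computes a
weighted sum of its inputs plus a bias, and squashes the result with the
logistic (sigmoid) function.  The sketch below is plain Python, separate
from Conx; the weight and bias values are made up for illustration (the
values your network actually learns can be inspected with printWeights),
but they behave like a trained AND network:

# Minimal sketch of what a single sigmoid output unit computes.
# The weights and bias are hypothetical illustrative values, not the ones
# learned by the network above.

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

w1, w2, bias = 5.0, 5.0, -7.5    # made-up weights that implement AND

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    output = sigmoid(w1*x1 + w2*x2 + bias)
    print "%d %d -> %.3f" % (x1, x2, output)

A single unit like this can only separate its inputs with a straight line
through the input space, which is why it can handle AND and OR but not
XOR.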
The XOR task is harder.  Our network cannot learn this task using only
two layers of units.

  Input    Output (XOR)
   0 0         0
   0 1         1
   1 0         1
   1 1         0

Again, we change the targets file:

n.loadTargetsFromFile("xor-targets.dat")

>>> n.train()
Epoch #    1 | TSS Error: 1.1188 | Correct = 0.0000
Epoch #    2 | TSS Error: 1.1276 | Correct = 0.0000
Epoch #    3 | TSS Error: 1.1240 | Correct = 0.0000
. . .
Epoch # 4818 | TSS Error: 1.0918 | Correct = 0.0000
Epoch # 4819 | TSS Error: 1.2915 | Correct = 0.0000
Epoch # 4820 | TSS Error: 1.2618 | Correct = 0.0000
. . .
(interrupt by typing Control-C)

In order to learn XOR, we need to add an extra layer of units to our
network, called the hidden layer:

n.addLayer('input', 2)
n.addLayer('hidden', 2)
n.addLayer('output', 1)
n.connect('input', 'hidden')
n.connect('hidden', 'output')

>>> n.train()
Epoch #    1 | TSS Error: 1.1846 | Correct = 0.0000
Epoch #    2 | TSS Error: 1.1185 | Correct = 0.0000
Epoch #    3 | TSS Error: 1.0997 | Correct = 0.0000
. . .
----------------------------------------------------
Final #  175 | TSS Error: 0.0308 | Correct = 1.0000
----------------------------------------------------

We can use showPerformance as before to examine the behavior of the
network on the input patterns.  The hidden layer activation patterns
corresponding to each input pattern are shown below:

  Input    Hidden
   0 0     0.74 0.00
   0 1     0.04 0.08
   1 0     0.04 0.08
   1 1     0.00 0.86

This shows that the network has learned a new, internal hidden
representation of the input patterns in order to solve the task.  Notice,
for example, that the inputs 0 1 and 1 0, which must both produce an
output of 1, get mapped to the same hidden pattern.

#--------------------------------------------------------------------------

Now let's try an auto-association task, where the network simply learns
to reproduce the input patterns on the output layer, using a smaller
hidden layer in the middle...

Here is the file auto-inputs.dat:

1 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0
0 0 1 0 0 0 0 0
0 0 0 1 0 0 0 0
0 0 0 0 1 0 0 0
0 0 0 0 0 1 0 0
0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 1

The dataset consists of eight unique patterns.  We will use a hidden
layer with three units, which will force the network to learn to encode
the eight input patterns using a three-dimensional space of hidden layer
patterns...

n.addLayer('input', 8)
n.addLayer('hidden', 3)
n.addLayer('output', 8)
n.connect('input', 'hidden')
n.connect('hidden', 'output')

n.loadInputsFromFile("auto-inputs.dat")
n.loadTargetsFromFile("auto-inputs.dat")

We can also view the activations and weight values graphically:

n.showActivations('input', scale=40)
n.showActivations('hidden', scale=40)
n.showActivations('output', scale=40)
n.showWeights('hidden', scale=30)

>>> n.train()
Epoch #    1 | TSS Error: 11.4415 | Correct = 0.0156
Epoch #    2 | TSS Error:  7.4859 | Correct = 0.7969
Epoch #    3 | TSS Error:  7.5817 | Correct = 0.8750
. . .
----------------------------------------------------
Final #  177 | TSS Error:  0.0785 | Correct = 1.0000
----------------------------------------------------

To verify that the network has learned the task:

>>> n.showPerformance()

We can save the hidden layer patterns generated by the inputs in a file:

>>> n.saveHiddenReps('auto')

This creates a log file called 'auto.hiddens' and performs one sweep
through the dataset, recording each hidden layer activation pattern in
the log file.  The resulting file looks like this:

0.035725 0.000000 0.309874
0.001861 0.332286 0.999953
0.965465 0.966408 0.000007
0.999960 0.002240 0.000715
0.995763 0.000063 0.995603
0.000233 0.522564 0.000000
0.995154 0.999652 0.998858
0.000038 0.999914 0.555374

These hidden representations correspond to points in a three-dimensional
space, since each pattern is a tuple of three values.
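Since the log file is just whitespace-separated numbers, one hidden
pattern per line, it is also easy to read the points back into Python if
you want to examine them directly.  A minimal sketch (the filename
'auto.hiddens' comes from the saveHiddenReps call above; the rest is
ordinary Python):

# Read the hidden-layer patterns back in as a list of (x, y, z) points.
points = []
for line in open("auto.hiddens"):
    points.append(tuple([float(v) for v in line.split()]))

for i, point in enumerate(points):
    print "input pattern %d -> hidden point %s" % (i + 1, point)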
We can easily graph these eight points using the Linux program gnuplot:

% gnuplot
gnuplot> splot "auto.hiddens" notitle

Dragging the plot with the left mouse button rotates it; dragging
left/right with the middle mouse button zooms the plot in or out; and
dragging up/down with the middle mouse button stretches or squeezes the
vertical axis.

This way, we can visualize the structure of the hidden representations
learned by the network, at least in the case of three-dimensional
patterns.  The concept extends to higher-dimensional patterns in a
natural way, although visualizing them is more difficult.
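As an alternative to gnuplot, the same eight points can also be plotted
from within Python.  The sketch below is only a suggestion and assumes
the matplotlib library is installed; it is not part of Conx:

# Plot the hidden-layer patterns as points in 3D space with matplotlib.
from mpl_toolkits.mplot3d import Axes3D   # enables the '3d' projection
import matplotlib.pyplot as plt

xs, ys, zs = [], [], []
for line in open("auto.hiddens"):
    x, y, z = [float(v) for v in line.split()]
    xs.append(x)
    ys.append(y)
    zs.append(z)

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(xs, ys, zs)
ax.set_xlabel('hidden unit 1')
ax.set_ylabel('hidden unit 2')
ax.set_zlabel('hidden unit 3')
plt.show()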