Lab notes for Wednesday, April 5, 2006

Using the Conx library to create simple neural networks
--------------------------------------------------------------------------

Let's train a simple neural network to solve the AND task:

   Input     Output (AND)
    0 0          0
    0 1          0
    1 0          0
    1 1          1

First, we'll write a program to set up an appropriate network...

#--------------------------------------------------------------------------
# File: net.py

# load in Conx support for neural networks
from ConxExtensions import *

# create a basic feedforward backpropagation network
n = BackpropNetwork()

# add layers in the order they will be connected
n.addLayer('input', 2)        # input layer has two units
n.addLayer('output', 1)       # output layer has one unit
n.connect('input', 'output')  # connect the layers together

# learning rate
n.setEpsilon(0.5)

# how often the network reports its total error during training
n.setReportRate(1)

# how close an output value has to be to the target to count as correct
n.setTolerance(0.1)

# specify the dataset to use for learning AND
n.setInputs([[0, 0], [0, 1], [1, 0], [1, 1]])
n.setTargets([[0], [0], [0], [1]])

print "Network is set up"
#--------------------------------------------------------------------------

Now we'll train the network on the dataset by just calling train...

% idle net.py

Network is set up
>>> n.train()
Epoch #     1 | TSS Error: 1.0132 | Correct = 0.0000 | RMS Error: 0.5033
Epoch #     2 | TSS Error: 0.8453 | Correct = 0.0000 | RMS Error: 0.4597
Epoch #     3 | TSS Error: 0.7362 | Correct = 0.0000 | RMS Error: 0.4290
. . .
----------------------------------------------------
Final #    45 | TSS Error: 0.0245 | Correct = 1.0000 | RMS Error: 0.0782
----------------------------------------------------

After training, we can use propagate to test out individual inputs:

>>> n.propagate(input=[0, 0])
[0.00092551928227623556]
>>> n.propagate(input=[0, 1])
[0.086001671392928164]
>>> n.propagate(input=[1, 0])
[0.084881557981294223]
>>> n.propagate(input=[1, 1])
[0.90404249930777647]
>>>

The trained network produces outputs that correspond closely to the
training targets.  We can also run through the entire dataset
interactively by first turning learning off and interactive mode on, and
then sweeping through all of the patterns:

>>> n.setLearning(0)
>>> n.setInteractive(1)
>>> n.sweep()

To make it easier to train the network on different datasets, we put the
training data in the files inputs.dat and and-targets.dat, and use the
methods loadInputsFromFile and loadTargetsFromFile instead of setInputs
and setTargets...

#--------------------------------------------------------------------------
# File: net.py

from ConxExtensions import *

n = BackpropNetwork()
n.addLayer('input', 2)
n.addLayer('output', 1)
n.connect('input', 'output')
n.setEpsilon(0.5)
n.setReportRate(1)
n.setTolerance(0.1)

# changed these lines
n.loadInputsFromFile("inputs.dat")
n.loadTargetsFromFile("and-targets.dat")

print "Network is set up"
#--------------------------------------------------------------------------
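These data files are plain text, presumably in the same format as the
auto-inputs.dat file shown later (one pattern per line, values separated
by spaces); for the AND task they would look something like this:

inputs.dat:

0 0
0 1
1 0
1 1

and-targets.dat:

0
0
0
1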
The OR task is similar to AND.  Our network can learn it easily.

   Input     Output (OR)
    0 0          0
    0 1          1
    1 0          1
    1 1          1

n.loadTargetsFromFile("or-targets.dat")

>>> n.train()
Epoch #     1 | TSS Error: 1.0218 | Correct = 0.0000 | RMS Error: 0.5054
Epoch #     2 | TSS Error: 0.6437 | Correct = 0.0000 | RMS Error: 0.4011
Epoch #     3 | TSS Error: 0.6014 | Correct = 0.5000 | RMS Error: 0.3877
. . .
----------------------------------------------------
Final #    35 | TSS Error: 0.0151 | Correct = 1.0000 | RMS Error: 0.0614
----------------------------------------------------

The XOR task is harder.  Our network cannot learn this task using only
two layers of units.

   Input     Output (XOR)
    0 0          0
    0 1          1
    1 0          1
    1 1          0

n.loadTargetsFromFile("xor-targets.dat")

>>> n.train()
Epoch #     1 | TSS Error: 1.1188 | Correct = 0.0000 | RMS Error: 0.5289
Epoch #     2 | TSS Error: 1.1276 | Correct = 0.0000 | RMS Error: 0.5309
Epoch #     3 | TSS Error: 1.1240 | Correct = 0.0000 | RMS Error: 0.5301
. . .
Epoch #  4818 | TSS Error: 1.0918 | Correct = 0.0000 | RMS Error: 0.5224
Epoch #  4819 | TSS Error: 1.2915 | Correct = 0.0000 | RMS Error: 0.5682
Epoch #  4820 | TSS Error: 1.2618 | Correct = 0.0000 | RMS Error: 0.5616
. . .
(interrupt by typing Control-C)

In order to learn XOR, we need to add an extra layer of units to our
network, called the hidden layer (a complete version of the modified
script is sketched after this section):

n.addLayer('input', 2)
n.addLayer('hidden', 2)
n.addLayer('output', 1)
n.connect('input', 'hidden')
n.connect('hidden', 'output')

>>> n.train()
Epoch #     1 | TSS Error: 1.1846 | Correct = 0.0000 | RMS Error: 0.5442
Epoch #     2 | TSS Error: 1.1185 | Correct = 0.0000 | RMS Error: 0.5288
Epoch #     3 | TSS Error: 1.0997 | Correct = 0.0000 | RMS Error: 0.5243
. . .
----------------------------------------------------
Final #   175 | TSS Error: 0.0308 | Correct = 1.0000 | RMS Error: 0.0877
----------------------------------------------------

We can use setInteractive and sweep as before to examine the behavior of
the network on the input patterns.  The hidden layer activation patterns
corresponding to each input pattern are shown below:

   Input     Hidden
    0 0      0.74  0.00
    0 1      0.04  0.08
    1 0      0.04  0.08
    1 1      0.00  0.86

This shows that the network has learned a new, internal hidden
representation of the input patterns in order to solve the task.
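Putting the pieces together, the complete script for the XOR task might
look something like this (a sketch assembled from the pieces above; the
file name xor-net.py is just for illustration, and it reuses inputs.dat
together with xor-targets.dat):

#--------------------------------------------------------------------------
# File: xor-net.py  (a sketch; file name chosen here for illustration)

from ConxExtensions import *

n = BackpropNetwork()

# three layers this time: the hidden layer sits between input and output
n.addLayer('input', 2)
n.addLayer('hidden', 2)
n.addLayer('output', 1)
n.connect('input', 'hidden')
n.connect('hidden', 'output')

# same training parameters as before
n.setEpsilon(0.5)
n.setReportRate(1)
n.setTolerance(0.1)

# same input patterns as AND and OR, but with the XOR targets
n.loadInputsFromFile("inputs.dat")
n.loadTargetsFromFile("xor-targets.dat")

print "Network is set up"
#--------------------------------------------------------------------------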
#--------------------------------------------------------------------------

Now let's try an auto-association task, where the network simply learns
to reproduce the input patterns on the output layer, using a smaller
hidden layer in the middle...

Here is the file auto-inputs.dat:

1 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0
0 0 1 0 0 0 0 0
0 0 0 1 0 0 0 0
0 0 0 0 1 0 0 0
0 0 0 0 0 1 0 0
0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 1

The dataset consists of eight unique patterns.  We will use a hidden
layer with three units, which will force the network to learn to encode
the eight input patterns using a three-dimensional space of hidden layer
patterns...

n.addLayer('input', 8)
n.addLayer('hidden', 3)
n.addLayer('output', 8)
n.connect('input', 'hidden')
n.connect('hidden', 'output')

n.loadInputsFromFile("auto-inputs.dat")
n.loadTargetsFromFile("auto-inputs.dat")

>>> n.train()
Epoch #     1 | TSS Error: 11.4415 | Correct = 0.0156 | RMS Error: 0.4228
Epoch #     2 | TSS Error:  7.4859 | Correct = 0.7969 | RMS Error: 0.3420
Epoch #     3 | TSS Error:  7.5817 | Correct = 0.8750 | RMS Error: 0.3442
. . .
----------------------------------------------------
Final #   177 | TSS Error: 0.0785 | Correct = 1.0000 | RMS Error: 0.0350
----------------------------------------------------

We can save the hidden layer patterns generated by the inputs in a file:

n.saveHiddenReps('auto')

This creates a log file called 'auto.hiddens' and performs one sweep
through the dataset, recording each hidden layer activation pattern in
the log file.  The resulting file looks like this:

0.035725 0.000000 0.309874
0.001861 0.332286 0.999953
0.965465 0.966408 0.000007
0.999960 0.002240 0.000715
0.995763 0.000063 0.995603
0.000233 0.522564 0.000000
0.995154 0.999652 0.998858
0.000038 0.999914 0.555374

These hidden representations correspond to points in a three-dimensional
space (since each pattern is a tuple of three values).  We can easily
graph these eight points using the Linux program gnuplot:

% gnuplot
gnuplot> splot "auto.hiddens" notitle

Dragging the plot with the left mouse button rotates it; dragging
left/right with the middle mouse button zooms the plot in or out;
dragging up/down with the middle mouse button stretches or squeezes the
vertical axis.  This way, we can visualize the structure of the hidden
representations learned by the network, at least in the case of
three-dimensional patterns.  The concept extends to higher-dimensional
patterns in a natural way, although visualizing them directly is more
difficult.
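For higher-dimensional hidden layers, where a direct plot like the one
above isn't possible, one simple alternative is to compare the hidden
representations numerically.  Here is a minimal sketch (not part of Conx;
it assumes only the whitespace-separated .hiddens format shown above)
that reads auto.hiddens and prints the Euclidean distance between every
pair of hidden patterns.  Inputs whose hidden representations lie close
together are being treated as similar by the network.

#--------------------------------------------------------------------------
# File: hidden-dist.py  (a sketch, not part of Conx; the file name is
# chosen here for illustration)

import math

# read one hidden activation pattern per line from the log file
patterns = []
for line in open("auto.hiddens"):
    fields = line.split()
    if fields:
        patterns.append([float(x) for x in fields])

# print the Euclidean distance between every pair of hidden patterns;
# this works for hidden layers of any size, not just three units
for i in range(len(patterns)):
    for j in range(i + 1, len(patterns)):
        d = math.sqrt(sum([(a - b) ** 2
                           for a, b in zip(patterns[i], patterns[j])]))
        print "patterns %d and %d: distance %.3f" % (i, j, d)
#--------------------------------------------------------------------------

Because the script only splits each line on whitespace, the same analysis
applies unchanged to networks with larger hidden layers.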