There are several ways to input and output patterns, numbers and other values, and a single format command, `f', sets all of these options. A number of options can be given on a single line, as for example in:
f b+ ir oc wB
A special input character for compressed and real patterns is the letter `x', the unknown. Its default value is 0.5, but it can be changed in the "Value of x in Patterns" entry box or it can be typed in as in:
f x -1 * sets x to -1 when the pattern is read
The programs are able to read pattern values in two different formats. In the real format, numbers follow the C language notation and must be separated by a space. The letters `H' and `h', used in recurrent networks, are also allowed, as is the letter `x' (the unknown), with a default value of 0.5. Real input format is now the default.
The other format is the compressed format, a format consisting of 1s, 0s and the letters `x', 'h' and `H'. In compressed format each value is one character and it is not necessary to have blanks between the characters. For example, in compressed format the patterns for xor could be written out as:
101
000
011
110
Clicking the "Input Format" button will change the format to its other value. The typed commands are:
f ir   * input real
f ic   * input compressed
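The two input formats can be sketched in a few lines of Python. This is only an illustration of the rules described above, not the program's own reader: the function names are made up, and since this section does not say what `h' and `H' stand for, the sketch simply passes them through as markers.

```python
def parse_compressed(text, x_value=0.5):
    """Turn a compressed-format string such as "101" into a list of values."""
    values = []
    for ch in text:
        if ch.isspace():
            continue                      # blanks between characters are optional
        if ch == "1":
            values.append(1.0)
        elif ch == "0":
            values.append(0.0)
        elif ch == "x":
            values.append(x_value)        # the unknown, 0.5 unless changed
        elif ch in ("h", "H"):
            values.append(ch)             # assumption: kept as a marker only
        else:
            raise ValueError("bad compressed character: %r" % ch)
    return values


def parse_real(text, x_value=0.5):
    """Turn a real-format line such as "1.33 -2.30 x" into a list of values."""
    values = []
    for token in text.split():            # real values are separated by spaces
        if token == "x":
            values.append(x_value)
        elif token in ("h", "H"):
            values.append(token)          # assumption: kept as a marker only
        else:
            values.append(float(token))   # Python's float() stands in for C notation
    return values
```

For instance, parse_compressed("1x0", x_value=-1) mimics reading a pattern after the value of x has been set to -1.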
In most applications you present the network with a single input pattern and you get a single output pattern. However, if you have a series of inputs over time, you use a recurrent network and you want the network to classify the type of the input, then you are more interested in getting answers based on the series of patterns than on each individual pattern. To get this type of result you must tell the program how many "minor patterns" there are for each "major pattern". Enter this value in the entry box.
The typed command to set this value to 128 for example is "f r 128". To undo this set the number of minor patterns to 1 with "f r 1".
To get the network to print out results based on a series of patterns you must also set the summary flag to r by: "f sr". To undo this use "f s+" to get the regular statistics based on minor patterns or use "f s-" to suppress the summary altogether.
There are two different types of problems that back-propagation can handle: the general type of problem, where every output unit can take on an arbitrary value, and the classification type of problem, where the goal is to turn on output unit i and turn off all the other output units when the pattern is of class i. The xor problem is an example of the general type of problem. For an example of a classification problem, suppose you have a number of data points scattered about through three-dimensional space and you have to classify the points as either class 1, class 2 or class 3. For a pattern of class 1 you can always set up the output "1 0 0", for class 2 "0 1 0" and for class 3 "0 0 1". However, doing the translation to bit patterns can be annoying, so another notation can be used. Instead of specifying the bit patterns, set the "Problem Type" button to classification and then the program will read data in the form:
1.33 3.61 1    * shorthand for 1 0 0
0.42 -2.30 2   * shorthand for 0 1 0
-0.31 4.30 3   * shorthand for 0 0 1
and translate it to the bit string form. Another benefit of the classification format is that when the program outputs a status line it will also include the percentage of correct patterns based on the maximum value rather than just on tolerance.
The typed commands are:
f pg   * the general type of problem
f pc   * the classification type of problem
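The translation from a class number to a bit pattern, and the "maximum value" test for a correct answer, might look like the following Python sketch. The function names and the on/off parameters are hypothetical, and treating everything after `*' on a data line as a comment is an assumption taken from the example lines above.

```python
def expand_class_line(line, n_classes, on=1.0, off=0.0):
    """Split a classification line into (inputs, targets)."""
    fields = line.split("*", 1)[0].split()   # assumption: "*" starts a comment
    inputs = [float(v) for v in fields[:-1]]
    label = int(fields[-1])                  # class numbers start at 1
    if not 1 <= label <= n_classes:
        raise ValueError("class %d is out of range" % label)
    targets = [off] * n_classes
    targets[label - 1] = on                  # turn on only the unit for this class
    return inputs, targets


def predicted_class(outputs):
    """Class chosen by the largest output value, whatever the tolerance."""
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return best + 1
```

Passing on=1.0 and off=-1.0 gives the kind of targets you might prefer with tanh output units.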
When you do a classification problem the program makes the target output unit = 1 and all the other targets = 0. This is fine for the standard sigmoid, but if you use tanh or some other function you might want to try targets of 1 and -1. To change these values enter BOTH of them in the entry box, or use the typed command:
f t -1 1
THE VALUES HAVE TO BE SET BEFORE THE DATA FILES ARE READ IN.
When you are reading commands from a file it is sometimes worthwhile to see those commands echoed on the screen, especially if there is some kind of error in the text. There is a button that will toggle this on and off, or the typed commands are:
f e+   * echo on
f e-   * echo off
Output format is controlled with the `f' command as in:
f or   * output node values using real (the C %f) format
f oc   * output node values using compressed format
f oa   * output node values using analog compressed format
f oe   * output values with e notation
f os   * output values scaled up to their natural values
The first sets the output to real numbers. The second sets the output to be compressed mode where the value printed will be a `1' when the unit value is greater than 1.0 - tolerance, a `^' when the value is above 0.5 but less than 1.0 - tolerance, a `v' when the value is less than 0.5 but greater than the tolerance. Below the tolerance value a `0' is printed. The tolerance can be changed using the `t' or `tpu' command (not a part of the format command). For example, to make all values greater than 0.8 print as `1' and all values less than 0.2 print as `0' use:
t 0.2
Of course this same tolerance value is also used to check to see if all the patterns have converged.
The third output format is meant to give "analog compressed" output. In this format a `c' (c for close) is printed when a value is close enough to its target value. Otherwise, if the answer is close to 1, a `1' is printed, if the answer is close to 0, a `0' is printed, if the answer is above the target but not close to 1, a `^' is printed and if the answer is below the target but not close to 0, a `v' is printed. This output format is designed for problems where the output is a real number, as for instance, when the problem is to make a network learn sin(x). The format "e" writes out node values using exponential notation with four places to the right of the decimal point.
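The character chosen in the two compressed output modes can be sketched in Python as follows. The function names are made up, the text does not say what happens exactly at a boundary (0.5 or the tolerance itself), and reading "close" as "within the tolerance" is an assumption, so treat the comparisons as one plausible interpretation.

```python
def compressed_symbol(value, tolerance):
    """Character printed for one unit in compressed output mode."""
    if value > 1.0 - tolerance:
        return "1"
    if value > 0.5:
        return "^"          # above 0.5 but not within tolerance of 1
    if value > tolerance:
        return "v"          # below 0.5 but not within tolerance of 0
    return "0"


def analog_symbol(value, target, tolerance):
    """Character printed for one unit in analog compressed output mode."""
    if abs(value - target) <= tolerance:
        return "c"          # close enough to the target
    if value >= 1.0 - tolerance:
        return "1"          # close to 1
    if value <= tolerance:
        return "0"          # close to 0
    return "^" if value > target else "v"
```

With a tolerance of 0.2, a unit value of 0.9 prints as `1', 0.6 as `^', 0.3 as `v' and 0.1 as `0'.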
The "fos" command scales output values up to their natural values, the values they had before the scaling program re-scaled them. At the moment this causes a problem with the way the program checks for the end of training (sigh).
When pattern values are printed out you can insert a blank between the compressed output format values or a carriage return between standard real output values. This makes the output more readable. For instance, you may have 24 output units where it makes sense to insert blanks after the 4th, 7th and 19th positions. To do this you can type in the main window entry box:
f B 4 7 19
Then the output of pattern values will look like:
1 10^0 10^ ^000v00000v0 01000 e 0.17577
2 1010 01v 0^0000v00000 ^1000 e 0.16341
3 0101 10^ 00^00v00000v 00001 e 0.16887
4 0100 0^0 000^00000v00 00^00 e 0.19880
The break option allows up to 20 break positions to be specified. Besides typing in the command you can edit the string of values given in the entry box.
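The effect of the break positions on a row of compressed output can be reproduced with a short Python sketch. The function name is hypothetical; only the behavior (a blank after each listed position, at most 20 positions) comes from the description above.

```python
def insert_breaks(symbols, positions):
    """Insert a blank after each 1-based position listed in positions."""
    if len(positions) > 20:
        raise ValueError("at most 20 break positions are allowed")
    cuts = set(positions)
    out = []
    for i, ch in enumerate(symbols, start=1):
        out.append(ch)
        if i in cuts and i < len(symbols):   # no trailing blank at the very end
            out.append(" ")
    return "".join(out)
```

Applied to the 24 output values of the first pattern above, insert_breaks("10^010^^000v00000v001000", [4, 7, 19]) gives the grouping shown in the listing.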
When an error is encountered the program produces an error message. The default is to write to the canvas area of the main window and pop up a message box window. Use the "Form of Error Messages" button to get a menu with the choices "Window", "Teletype" or "Both". The typed commands are:
f Ew   * to the message box window only
f Et   * to the canvas area only (as in a teletype)
f Eb   * both
f P 30 * set the page size to 30
Normally whenever you run more training iterations the message "running . . ." prints out to reassure you that something is in fact being done; however, this can also be annoying at times. To get rid of this message click the "Print Running ..." button to give "No" in the entry box or click it again to get the running message. The typed commands are:
f R+   * turn on the printing of "running ..."
f R-   * turn off the printing of "running ..."
When the program is learning patterns you normally want to have it print out the status of the learning process at regular intervals. If you want to skip these status reports use the "Summarize Status when Training" button to turn it off. The typed commands are:
f s+   * print the summary
f s-   * skip the summary
During the ith pass through the network the program will collect statistics on how many patterns are correct and how much error there is. It does this so that it will know when to stop the training. But it gets these numbers BEFORE the weights are changed in the ith pass. In the case of periodic update methods (the periodic, delta-bar-delta and quickprop methods) this is not much of a problem: when the program does the ith pass, if the patterns meet the desired tolerance it simply skips the weight update portion and reports that the problem was finished on pass i-1. Also, when the program goes to print the status of the training it makes an extra forward pass to get the status of the data. The numbers reported here are always up-to-date.
On the other hand, with the "right" and "wrong" continuous update methods you change weights immediately, so if pattern i is correct now, a change to the weights for pattern i+1 may make pattern i wrong. The only way to get the correct up-to-date statistics is to make an extra forward pass after every pass through all the training patterns. Using up-to-date statistics will do this. There is a button to toggle this on and off or use the typed commands:
f u+   * make the extra forward pass to get up-to-date
f u-   * skip the extra forward pass
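Why the counts can be stale with continuous updates is perhaps easiest to see in a toy version of the training loop. The Python sketch below is not the program's algorithm: it uses a single sigmoid unit and a plain per-pattern delta rule as a stand-in for the continuous update methods, and the names forward, one_pass and up_to_date are invented for the illustration.

```python
import math


def forward(weights, inputs):
    """Output of one sigmoid unit."""
    s = sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-s))


def one_pass(weights, patterns, rate=0.5, tolerance=0.1, up_to_date=False):
    """One continuous-update pass; returns the number of correct patterns."""
    correct = 0
    for inputs, target in patterns:
        out = forward(weights, inputs)
        if abs(target - out) < tolerance:
            correct += 1                    # judged BEFORE the later updates
        delta = (target - out) * out * (1.0 - out)
        for i, x in enumerate(inputs):
            weights[i] += rate * delta * x  # weights change immediately
    if up_to_date:
        # extra forward pass: re-check every pattern with the final weights
        correct = sum(abs(t - forward(weights, x)) < tolerance
                      for x, t in patterns)
    return correct
```

The weights end up the same either way; only the reported count differs, at the cost of one more forward pass per training pass.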
Things are even worse: there is another option in the program related to statistics and continuous updates, called the "off by 1 option", which you might as well leave alone, so it isn't even listed under the format options.
Weights can be saved in several formats. First the r format saves the weights in an ASCII file, second the R format saves the weights and other weight parameters in an ASCII file, third, b saves the weights in a binary format and finally B saves the weights and other weight parameters in a binary format. The commands are:
f w r
f w R
f w b
f w B
The great virtue of the binary formats is that you can reload EXACTLY the same values; writing values out as ASCII is liable to change them slightly.
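The difference between the two kinds of file is easy to demonstrate in Python. This sketch says nothing about the program's actual file layout: the six-decimal "%f" format is only an assumed stand-in for an ASCII weight file, and the function names are made up.

```python
import struct


def ascii_round_trip(value, fmt="%f"):
    """Write a value as text and read it back (assumed text format)."""
    return float(fmt % value)


def binary_round_trip(value):
    """Write a value as 8 raw bytes (a C double) and read it back."""
    return struct.unpack("d", struct.pack("d", value))[0]


def survives(value, round_trip):
    """True when the reloaded value is bit-for-bit the same number."""
    return round_trip(value) == value
```

A weight such as 0.1234567891234 survives the binary round trip unchanged but comes back from the text round trip as 0.123457.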
IF YOU'RE GOING TO RESTART A PROBLEM FROM EXACTLY ITS CURRENT STATE AT A LATER DATE YOU MUST USE THE R OR B FORMAT NOT THE r AND b FORMAT because the extra weight parameters are necessary for the smooth functioning of the training algorithms.
The program can count the number of weight files written and assign each one a unique name. If the name of the weights files is "weights", the weights will be saved in files named "weights.1", "weights.2", "weights.3" and so on. This numbering capability is turned on with:
f W+
and turned off with:
f W-
To start the renumbering from 0 you can turn "Number Weights Files" on again.
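The numbering scheme amounts to a counter attached to the base file name, as in the Python sketch below. The class and method names are hypothetical, and the sketch assumes that with numbering off the plain base name is used, which the text does not state.

```python
class WeightFileNamer:
    """Hands out weights.1, weights.2, ... while numbering is on."""

    def __init__(self, base="weights"):
        self.base = base
        self.numbering = False
        self.count = 0

    def set_numbering(self, on):
        self.numbering = on
        if on:
            self.count = 0      # turning numbering on (again) restarts at 0

    def next_name(self):
        if not self.numbering:
            return self.base    # assumption: unnumbered files use the base name
        self.count += 1
        return "%s.%d" % (self.base, self.count)
```

Calling set_numbering(True) a second time corresponds to turning "Number Weights Files" on again, so the next file written is weights.1 once more.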
Within a benchmarking run only one weights file is produced for each network; that is, whenever a new best set is found it overwrites the current best set instead of creating a new file.