The normal way to operate this program is to keep the training data in one file and, if there is test set data, to keep it in another file. For a function approximation problem with, say, five inputs and one output the data should look like:
-1.588 -1.650  0.365 0.188 0.962 -1.543
-1.182 -0.926  0.992 0.188 1.140 -1.372
-2.650 -1.650  3.501 0.188 0.566 -1.201

where the first five numbers are the inputs and the last number is the answer. If there are a large number of inputs you can use more than one line for each pattern but you MUST start every new pattern on a new line.
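If you generate training files programmatically, a sketch like this produces the inputs-then-answer layout described above (the file name, pattern count, and stand-in target function are all made up for illustration):

```python
import random

def write_patterns(path, n_patterns, n_inputs=5):
    """Write patterns in the plain format: the inputs followed by the
    answer, whitespace-separated, one pattern per line."""
    with open(path, "w") as f:
        for _ in range(n_patterns):
            inputs = [random.uniform(-2.0, 2.0) for _ in range(n_inputs)]
            answer = sum(inputs) / len(inputs)  # stand-in target function
            f.write(" ".join("%.3f" % v for v in inputs + [answer]) + "\n")

write_patterns("train.dat", 100)
```

Each line then carries six whitespace-separated numbers, which is exactly what the pattern reader expects for a five-input, one-output network.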
NOTE: when doing this type of problem where the outputs can be outside the range 0 to 1 you need to make the output layer activation function the linear one. Do this in the Algorithm (A) menu window. The hidden layer activation function should be the standard sigmoid.
For a plain classification problem, with say four inputs and 3 possible classes, the data file will look like:
0.40 0.30 -0.33 0.21 1  * this is a class 1 pattern
0.55 0.32 -0.09 0.20 2  * this is a class 2 pattern
0.11 0.23 -0.97 0.45 3  * this is a class 3 pattern

where the first four numbers are the inputs and the last number is the class number. If there are a large number of inputs you can use more than one line for each pattern but you MUST start every new pattern on a new line. The asterisk at the end of the line begins a comment.
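A minimal reader for this layout might look like the following sketch (it assumes each pattern fits on one line, although the program itself also accepts patterns that span lines):

```python
def read_class_patterns(path, n_inputs):
    """Read the classification format: n_inputs floats, a class number,
    then an optional comment introduced by '*'."""
    patterns = []
    with open(path) as f:
        for line in f:
            line = line.split("*", 1)[0]  # strip the comment, if any
            fields = line.split()
            if not fields:
                continue
            inputs = [float(x) for x in fields[:n_inputs]]
            cls = int(fields[n_inputs])
            patterns.append((inputs, cls))
    return patterns
```

Applied to the three-line example above, this returns three (inputs, class) pairs with classes 1, 2 and 3.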
NOTE: reading in patterns in this format requires setting the classification format flag. This can be set with the Input (I), Format (F) menu or the Pattern (P) menu window.
For a recurrent network problem, with say four inputs and 4 possible outputs the data file will look like:
1 0 0 0 H 0 0 1 0
0 0 1 0 H 0 1 0 0
0 1 0 0 H 0 0 1 0
0 0 1 0 H 0 0 0 1

where the first four numbers are the input, the H stands for the values of the hidden layer units that are copied down to the input and the last four numbers are the output. If there are a large number of inputs you can use more than one line for each pattern but you MUST start every new pattern on a new line.
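If you need to inspect files in this layout outside the program, a reader can be sketched like so (assuming one pattern per line and a single H standing for all the hidden layer units):

```python
def read_recurrent_patterns(path, n_inputs, n_outputs):
    """Read patterns of the form: n_inputs values, the literal token H,
    then n_outputs values (one pattern per line assumed)."""
    patterns = []
    with open(path) as f:
        for line in f:
            tokens = line.split()
            if not tokens:
                continue
            if tokens[n_inputs] != "H":
                raise ValueError("expected H after the inputs: " + line)
            inputs = [float(x) for x in tokens[:n_inputs]]
            outputs = [float(x) for x in tokens[n_inputs + 1:]]
            patterns.append((inputs, outputs))
    return patterns
```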
The use of H to stand for all the hidden layer units is convenient because you can then change the number of hidden layer units without changing the pattern files. On the other hand, if you want to take only a limited number of hidden layer values and copy them down to the input there is another notation that can be used. For instance, take this data:
0.00000 h 0.15636
0.15636 h 0.30887
0.30887 h 0.45378
0.45378 h 0.58753
0.58753 h 0.70683
0.70683 h 0.80874
0.80874 h 0.89075
0.89075 h 0.95086

a series used to predict the next value of sin(x): given 0.0 you want the network to output the next value, 0.15636; then given 0.15636 you want it to output 0.30887. The single h in each pattern stands for "take one (the first) hidden layer value and use it as the second input value".
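The listed values appear to be sin(k * 0.157) rounded to five decimals, so a file like the one above can be regenerated with a short script (the step size and file name are assumptions):

```python
import math

STEP = 0.157  # assumed spacing; it reproduces the values listed above

values = [math.sin(k * STEP) for k in range(9)]
with open("sin.dat", "w") as f:
    for cur, nxt in zip(values, values[1:]):
        # current value, the h placeholder, then the next value to predict
        f.write("%.5f h %.5f\n" % (cur, nxt))
```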
There is an additional page with more background on recurrent networks.
To make a network select the Network (N) menu window. The options there are to make a two, three or four layer network. (In fact you can make a network with any number of layers by typing in the right command, but four layers is rarely useful and more than four is very rarely done, so there are no menu entries for making more than a four layer network.)
Whichever size network you choose, fill in the entry boxes with the number of units you want in each layer. Then, if you want direct input to output connections in a three or four layer network, click the button that changes the setting. Likewise, if you want a recurrent network, click that button.
IF THE NETWORK IS A RECURRENT NETWORK AND USES "H" DO NOT include the number of short term memory units when you input the number of input units. Thus for the poetry problem tell the program you want 25 input units, not 45 (the 45 comes from 25 normal input units plus the 20 more short term memory units whose values come from the hidden layer). The Tcl/Tk program will ultimately output a make command that looks like "m 25+20 20 25", so you will end up with 45 input units for the network.
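The "25+20" part of the make command is just the normal input units plus the short term memory units; a tiny helper makes the arithmetic explicit (the function name is made up for illustration):

```python
def total_input_units(spec):
    """Total network input units from a spec like "25+20": normal input
    units plus short term memory units fed from the hidden layer."""
    return sum(int(part) for part in spec.split("+"))
```

For example, total_input_units("25+20") gives the 45 input units the network ends up with after "m 25+20 20 25".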
IF THE NETWORK IS A RECURRENT NETWORK AND USES "h" you do count however many h units you have as input units and you DO NOT click on the recurrent network button. So if you use this data:
0.00000 hh 0.15636
0.15636 hh 0.30887
0.30887 hh 0.45378
0.45378 hh 0.58753
0.58753 hh 0.70683
0.70683 hh 0.80874
0.80874 hh 0.89075
0.89075 hh 0.95086

the number of input units should be 3 and you DO NOT click on the recurrent network button.
(OK, I should change the label on the button to something more clear.)
To finally make the network, click the "Make" button at the bottom of the window or click "Cancel" to exit the window without making a network.
AFTER making the network you can go to the Patterns (P) or Input (I) menu window to read in the patterns. When you select a button there, a list box comes up with the files in the current directory; double click the file you want.
Note that if you make a network again with a different number of hidden layer units the patterns will be lost (they are attached to the network structure and not saved) so you must read them in again (or buy the pro version where they are saved).
There are many variations on the backprop algorithm that are normally faster than the original plain backprop algorithm. The best of these in this program is usually the Quickprop algorithm; however, there is no guarantee that it will be the best algorithm, and sometimes the other algorithms will be better. When you are using Quickprop (see the Q menu window), Delta-Bar-Delta (D) or the periodic update algorithms (the G menu window, G for Gradient descent) you need to use an eta that is about 1/n, where n is the number of patterns; for example, with 100 training patterns an eta of about 0.01 is a reasonable starting point. You must set these parameters yourself; it is not done automatically, and the default settings are just there because something has to be there.
In the D, Q and G menu windows you can turn on the algorithm but to get some other algorithm you must go to the A menu window.
Having initialized the network, you can run the training algorithm by typing "r" and a carriage return in the main window entry box, by clicking the "r" button on the menu bar, or by clicking "Run" in the T menu window. The initial default is to run 100 passes through the training set data and print the status of the patterns every 10 iterations. For the sonar data included in the sample data the listing will look like:
10 49.04 % 49.04 % 0.47063   62.50 % 62.50 % 0.38221
20 70.19 % 73.08 % 0.38548   77.88 % 77.88 % 0.38063
30 76.92 % 76.92 % 0.34943   77.88 % 80.77 % 0.33282

The first column is the iteration number. The second column gives the percentage of training set patterns right based on the tolerance, the third column gives the percentage right based on the maximum value, and the fourth column gives the abs (not RMS!) error. The fifth column is the percentage of test set patterns right based on the tolerance, the sixth column gives the percentage right based on the maximum value, and the last column gives the abs error.
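The two percentage columns use different notions of a pattern being "right". The program's exact rules are not spelled out here, but the usual definitions can be sketched as:

```python
def right_by_tolerance(output, target, tol=0.1):
    # right if every output unit is within tol of its target
    return all(abs(o - t) <= tol for o, t in zip(output, target))

def right_by_max(output, target):
    # right if the unit with the largest output is the one whose
    # target is largest (i.e. the target class unit)
    return output.index(max(output)) == target.index(max(target))

def abs_error(all_outputs, all_targets):
    # average absolute error over all units and patterns (not RMS)
    total = sum(abs(o - t)
                for out, tgt in zip(all_outputs, all_targets)
                for o, t in zip(out, tgt))
    count = sum(len(out) for out in all_outputs)
    return total / count
```

So a pattern can be right by the maximum-value criterion long before it is right by the tolerance criterion, which is why the third column usually runs ahead of the second.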
Once you have all the parameters set you can save the make-a-network command and all the parameters in a command file and the weights in another file (default name: weights). To save everything, select the "Save As ..." button or the "Save and Exit" button in the File menu, or the "Save Everything" button in the O (Output) menu window. In all cases you will be asked for a file name to save the commands to; the weights will be saved to the current weights file. Of course, "Save and Exit" also ends the program.
To quit the program you can type "q" in the main window entry box or quit from the File menu.