The super self-adjusting back-propagation algorithm (SuperSAB) was developed by Tollenaere. As in delta-bar-delta there is a separate eta for every weight; each eta is increased by an acceleration factor until the slope of the error for that weight changes sign, and then a decay factor decreases the learning rate. My experience so far is that it is not as fast as quickprop on standard problems, although at this stage of neural network research that is not the final verdict. It may be better than quickprop with recurrent networks when the momentum is small (around 0.1) and the eta values are limited to around 1-5. I did see a posting on the net reporting that SuperSAB was "among the best" algorithms in another research report.
The parameters are:
ss a 1.05     * the acceleration parameter
ss d 0.5      * the decay parameter
ss e 0.5      * the eta value
ss M 30       * the maximum value of the eta for a weight
ss m 0.9      * the momentum parameter
This momentum parameter is not the same variable as the momentum parameter for plain gradient descent (alpha). Eta should probably be around 1/n to 10/n, where n is the number of patterns. The above a and d values are said to be good in general; however, on small problems like xor a value of 1.5 for a gives faster convergence. The maximum eta for any weight, M, is best set lower than this default, because allowing a value as large as 30 will often wreck the calculations (you get NaNs).
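
To make the role of these parameters concrete, here is a minimal sketch in Python of one SuperSAB step for a layer of weights. The function name, the array bookkeeping, and the exact handling of a sign change are my assumptions; published descriptions of SuperSAB differ on details such as whether the previous step is undone when the slope changes sign.

    import numpy as np

    def supersab_step(w, grad, prev_grad, eta, prev_dw,
                      a=1.05, d=0.5, M=30.0, m=0.9):
        # Sketch only: per-weight eta adjustment plus a momentum term.
        same_sign = grad * prev_grad > 0            # slope kept its direction
        eta = np.where(same_sign,
                       np.minimum(eta * a, M),      # accelerate, capped at M
                       eta * d)                     # slope changed: decay eta
        dw = -eta * grad + m * prev_dw              # per-weight step with momentum
        return w + dw, eta, dw

Each weight's eta would start out at the ss e value (0.5 by default) and then drift up or down independently as training proceeds.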
The parameters can all go on one line as in:
ss a 1.05 d 0.5 e 0.5 M 30 m 0.9 * SuperSAB parameters