Removing from a neural network those weights that carry only little information about the training data to be approximated considerably improves the generalization characteristic of the resulting neural network of reduced dimensionality. Furthermore, fewer training data items are required to train the reduced neural network. The rate of learning and the rate of classification in the test phase are also increased.
The removal of weights from a neural network is called pruning.
Various pruning methods are known. For example, a first prior art document, A. Zell, Simulation Neuronaler Netze (Simulation of Neural Networks), Addison-Wesley, 1st Edition, 1994, ISBN 3-89319-554-8, pp. 319-328, discloses a so-called optimal brain damage (OBD) method. In this method, the second derivatives of the error function with respect to the individual weights in the neural network are used to select those weights which should be removed. This method has the disadvantage that it operates only subject to the precondition that the training phase has converged, that is to say, that the error function, which is minimized during the training phase, has reached a local or global minimum. The main disadvantage of this known method is therefore that, in general, only considerably overtrained neural networks can be investigated for weights to be removed.
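By way of illustration only, the weight selection used in the OBD method may be sketched as follows. The sketch assumes a linear model with a mean squared error function E = (1/2N) * sum((w.x - y)^2), for which the diagonal second derivative with respect to weight w_i is simply the mean of x_i^2 over the training data, and it uses the commonly stated OBD saliency s_i = 0.5 * h_ii * w_i^2. The function name and the example data are hypothetical and are not taken from the cited document.

```python
def obd_saliencies(weights, inputs):
    """Return the OBD saliency 0.5 * h_ii * w_i**2 for every weight.

    Assumption: linear model with mean squared error, so the diagonal
    second derivative h_ii is the mean of x_i**2 over the training inputs.
    The saliency is only meaningful once training has converged, because
    the underlying expansion neglects the first-order (gradient) term.
    """
    n = len(inputs)
    saliencies = []
    for i, w in enumerate(weights):
        h_ii = sum(x[i] ** 2 for x in inputs) / n  # diagonal second derivative
        saliencies.append(0.5 * h_ii * w ** 2)
    return saliencies


# Hypothetical example: two weights, two training inputs.
example_weights = [2.0, 1.0]
example_inputs = [[1.0, 2.0], [3.0, 0.0]]
example_saliencies = obd_saliencies(example_weights, example_inputs)

# The weight with the smallest saliency is the first candidate for removal.
candidate = min(range(len(example_saliencies)), key=example_saliencies.__getitem__)
```

In this example the second weight has the smaller saliency and would therefore be selected for removal first.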
A further method, called the optimal brain surgeon (OBS), is likewise described in the first prior art document. It is subject to the same precondition of convergence in the training phase, and thus to the same disadvantages as well.
Furthermore, a method is known in which the training phase is stopped before the error function reaches a minimum. This procedure is called early stopping and is described, for example, in a second prior art document, W. Finnoff et al., Improving Model Selection by Nonconvergent Methods, Neural Networks, Vol. 6, 1993, pp. 771-783. Although the OBD method is also proposed there for assessing weights that are suitable for removal, this is only for the situation where the error function is at a minimum (page 775, penultimate paragraph).
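By way of illustration only, one common form of early stopping may be sketched as follows. The stopping criterion shown here, namely halting once the error on a separate validation set has not improved for a given number of epochs, is an assumption made for this sketch and is not necessarily the criterion used in the second prior art document. The function name and the parameter `patience` are hypothetical.

```python
def early_stopping_epoch(validation_errors, patience=2):
    """Return (stop_epoch, best_epoch) for a sequence of validation errors.

    Assumption: training is halted as soon as the validation error has
    failed to improve for `patience` consecutive epochs, i.e. generally
    before the training error function has reached a minimum.
    """
    best_error = float("inf")
    best_epoch = -1
    for epoch, error in enumerate(validation_errors):
        if error < best_error:
            best_error, best_epoch = error, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch  # stop here, keep weights of best_epoch
    # No stop was triggered: training ran through all recorded epochs.
    return len(validation_errors) - 1, best_epoch


# Hypothetical validation error curve: improves, then starts to rise.
stop_epoch, best_epoch = early_stopping_epoch(
    [0.9, 0.6, 0.5, 0.55, 0.6, 0.4], patience=2
)
```

With these example values, training is stopped at epoch 4 and the weights from epoch 2 are retained, so the later improvement at epoch 5 is never reached.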
Pruning methods which use an assessment variable describing the extent to which the value of the error function varies when a weight (w_i) is removed from the neural network are disclosed in third and fourth prior art documents: R. Reed, Pruning Algorithms - A Survey, IEEE Transactions on Neural Networks, Vol. 4, No. 5, September 1993, pp. 740-747; and E. D. Karnin, A Simple Procedure for Pruning Back-Propagation Trained Neural Networks, IEEE Transactions on Neural Networks, Vol. 1, No. 2, June 1990, pp. 239-242.
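By way of illustration only, such an assessment variable may be sketched in its most direct form: the error function is evaluated once with the complete set of weights and once with the weight w_i set to zero, and the difference is taken. The linear model, the mean squared error function and all names used below are assumptions made for this sketch. The cited documents describe their own, computationally cheaper, estimates of this quantity, which are not reproduced here.

```python
def mean_squared_error(weights, inputs, targets):
    """Error function E = (1/2N) * sum((w.x - y)**2) of a linear model."""
    total = 0.0
    for x, y in zip(inputs, targets):
        prediction = sum(w * xi for w, xi in zip(weights, x))
        total += (prediction - y) ** 2
    return 0.5 * total / len(inputs)


def removal_error_changes(weights, inputs, targets):
    """Return, for each weight w_i, the change in E when w_i is set to zero."""
    base_error = mean_squared_error(weights, inputs, targets)
    changes = []
    for i in range(len(weights)):
        pruned = list(weights)
        pruned[i] = 0.0  # remove weight w_i from the network
        changes.append(mean_squared_error(pruned, inputs, targets) - base_error)
    return changes


# Hypothetical example in which the weights fit the targets exactly.
example_inputs = [[1.0, 2.0], [3.0, 0.0]]
example_targets = [4.0, 6.0]
example_changes = removal_error_changes([2.0, 1.0], example_inputs, example_targets)
```

A weight whose removal changes the error function only slightly, here the second weight, is a candidate for pruning. Evaluating the error function once per weight in this way is expensive for large networks, which is why approximations of this variable are of interest.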