Neural networks are used for image processing, speech processing, and similar tasks.
Neural networks are formed by automatic devices which are interconnected by synapses with associated synaptic coefficients. They make it possible to solve problems which are difficult to solve by means of conventional sequential computers.
In order to carry out a given processing operation, a neural network must learn in advance how to carry out that operation. This so-called learning phase utilizes examples for which, on the basis of the input data, the results to be obtained on the output are known in advance. During a first period, the neural network, which has not yet been adapted to the desired task, will deliver incorrect results. An error $E^p$ between the results obtained and those which should have been obtained is then determined and, on the basis of an adaptation criterion, the synaptic coefficients are modified in order to enable the neural network to learn the chosen example. This step is repeated for the number of examples considered to be necessary for satisfactory learning by the neural network.
A widespread method for carrying out this adaptation is gradient back-propagation. The components $g_{j,L}$ of the gradient of the preceding error $E^p$ (calculated on the last layer $L$) are then determined for each neuron state $x_{j,L}$. These components are subsequently back-propagated in the neural network, starting from its outputs, in order to determine first of all the internal components $g_{j,l}$ ($l \neq L$), followed by the corrections to be applied to the synaptic coefficients $W_{ij,l}$ of the relevant neurons. This method is described, for example, in the following documents:
D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning Internal Representations by Error Propagation", in D. E. Rumelhart and J. L. McClelland (Eds.), "Parallel Distributed Processing: Explorations in the Microstructure of Cognition", Vol. 1, Foundations, MIT Press (1986).
D. J. Burr, "Experiments on neural net recognition of spoken and written text", IEEE Trans. on Acoustics, Speech, and Signal Processing, Vol. 36, No. 7, July 1988, page 1162.
determining the states $x_{j,l}$ of the neurons of a layer $l$ on the basis of output potentials $Y_{i,l-1}$ supplied by the neurons of the preceding layer which are connected thereto by way of synaptic coefficients $W_{ij,l}$, or on the basis of input data $Y_{i,0}$ for the layer $l=1$, so that:
$$x_{j,l} = \sum_i W_{ij,l} \, Y_{i,l-1}$$
determining the potentials $Y_{j,l}$ of the output neurons by application of a non-linear function $F$ so that:
$$Y_{j,l} = F(x_{j,l})$$
where:
$l$: index of the layer considered, $1 \leq l \leq L$
$j$: index of the neuron in the output layer $l$
$i$: index of the neuron in the input layer $l-1$
initialisation of the synaptic coefficient matrix $W_{ij,l}$ of the neural network,
introduction of input data $Y_{j,0}^p$ of each example $p$ intended for learning,
comparison of the results $Y_{j,L}$ obtained in the output layer $L$ with the output $y_j^p$ envisaged for this example $p$ presented to the input in order to define a partial error $E_j^p$,
determination of the sum $E^p$ of all partial errors $E_j^p$ observed for each output neuron and for each example $p$,
determination of the various components of the gradient $g_{j,L} = \partial E^p / \partial x_{j,L}$ of the error $E^p$ with respect to the states $x_{j,L}$ for the output layer $L$,
carrying out the method of back-propagation of the components $g_{j,L}$ of the gradient so that the neural network determines the components $g_{j,l}$ of the gradient for the other layers on the basis of the transposed synaptic coefficient matrix,
determination of the subsequent variations $\Delta x_{j,l}$, having a sign which opposes that of the corresponding component $g_{j,l}$, in order to adapt the neural network,
updating of the synaptic coefficients on the basis of these variations $\Delta x_{j,l}$,
characterized in that, for determining the subsequent variations $\Delta x_{j,l}$ of the neuron states, the method comprises a step for multiplying the components $g_{j,l}$ of the gradient by parameters $\theta_{j,l}$ in order to calculate variations $\Delta x_{j,l}$ which are proportional to $-\theta_{j,l} \, g_{j,l}$, where $\theta_{j,l}$ depends on the state of the neuron $j$ of the layer $l$, where $\theta_{j,l} = 1$ when $-g_{j,l}$ and $x_{j,l}$ have a different sign, and $\theta_{j,l} = \theta_l^+$ when $-g_{j,l}$ and $x_{j,l}$ have the same sign, where $0 \leq \theta_l^+ \leq 1$.
The parameter $\theta_{j,l}$ depends on the sign of the state $x_{j,l}$ of this neuron, and on the sign of the component $g_{j,l}$ of the gradient.
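The steps listed above can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not the implementation described in the source: the choice F = tanh, the squared-error form of the partial errors, the learning rate `eta`, the weight update "variation of the state times the input potential", and the treatment of a zero state or zero gradient component as "different sign" are all assumptions made for the example. The function names are likewise hypothetical.

```python
import math

def theta(g, x, theta_plus):
    """Return theta_plus when -g and x have the same sign, otherwise 1.

    Assumption: a zero product (g == 0 or x == 0) counts as "different sign".
    """
    return theta_plus if (-g) * x > 0 else 1.0

def forward(weights, inputs):
    """Compute the states x and potentials Y layer by layer (F = tanh is assumed)."""
    potentials = [list(inputs)]          # potentials[0] holds the input data
    states = []
    for W in weights:                    # W[j][i]: coefficient from neuron i to neuron j
        x = [sum(w * y for w, y in zip(row, potentials[-1])) for row in W]
        states.append(x)
        potentials.append([math.tanh(v) for v in x])
    return states, potentials

def squared_error(outputs, target):
    """Assumed error: half the sum of the squared output differences."""
    return 0.5 * sum((Y - y) ** 2 for Y, y in zip(outputs, target))

def state_variations(weights, states, potentials, target, theta_plus, eta):
    """Back-propagate the gradient and return the variations -eta * theta * g."""
    def dF(v):
        return 1.0 - math.tanh(v) ** 2   # derivative of tanh
    # Gradient with respect to the states of the output layer.
    g = [(Y - y) * dF(x) for Y, y, x in zip(potentials[-1], target, states[-1])]
    grads = [g]
    # Earlier layers: use the transposed coefficient matrix of the next layer.
    for l in range(len(weights) - 1, 0, -1):
        W = weights[l]
        g = [dF(states[l - 1][i]) * sum(W[j][i] * g[j] for j in range(len(W)))
             for i in range(len(states[l - 1]))]
        grads.insert(0, g)
    return [[-eta * theta(gj, xj, theta_plus) * gj for gj, xj in zip(g_l, x_l)]
            for g_l, x_l in zip(grads, states)]

def train_step(weights, inputs, target, theta_plus=0.5, eta=0.5):
    """Apply one update in place and return the error measured before it."""
    states, potentials = forward(weights, inputs)
    delta_x = state_variations(weights, states, potentials, target, theta_plus, eta)
    for l, W in enumerate(weights):
        for j, row in enumerate(W):
            for i in range(len(row)):
                # Assumed update: state variation times the input potential.
                row[i] += delta_x[l][j] * potentials[l][i]
    return squared_error(potentials[-1], target)

# Usage: one learning step on a small network with 2 inputs, 2 hidden neurons, 1 output.
weights = [[[0.5, -0.3], [0.2, 0.4]], [[0.3, -0.6]]]
before = train_step(weights, [1.0, -1.0], [0.9])
after = squared_error(forward(weights, [1.0, -1.0])[1][-1], [0.9])
print("error before:", before, "error after:", after)
```

Because every parameter lies between 0 and 1, each variation keeps the sign opposite to its gradient component; in this sketch the parameter only shortens the step for neurons whose state is already being pushed further in the direction of its current sign.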
However, when such a method is carried out in a neural network, the learning periods may become very long for given applications. For example, this difficulty has been encountered in the case of the parity problem. The parity problem occurs, for example, in the case of a neural network whose inputs are linked to binary signals 1/0 and whose output is to deliver a state 1 when the number of "1" inputs is odd, and a state 0 in the opposite case. The learning difficulty arises because the output state must change when a single one of the inputs changes its state, whereas the output must remain unchanged when an even number of input state changes occurs.
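The parity target described above can be illustrated with a short Python sketch. The helper names `parity`, `parity_examples`, and `flip` are hypothetical and introduced only for this example; the sketch enumerates the learning examples and makes the single-flip and double-flip behaviour explicit.

```python
from itertools import product

def parity(bits):
    """Return 1 when the number of 1 inputs is odd, otherwise 0."""
    return sum(bits) % 2

def parity_examples(n):
    """Enumerate all 2**n binary input patterns with their parity target."""
    return [(bits, parity(bits)) for bits in product((0, 1), repeat=n)]

def flip(bits, positions):
    """Return a copy of bits with the inputs at the given positions inverted."""
    return tuple(b ^ 1 if k in positions else b for k, b in enumerate(bits))

# Usage: flipping one input always changes the target, flipping two never does.
for bits, target in parity_examples(4):
    assert parity(flip(bits, {0})) != target
    assert parity(flip(bits, {0, 3})) == target
```

Neighbouring input patterns therefore always carry opposite targets, which is one way of seeing why the network cannot rely on input similarity when learning this function.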
Moreover, when the neural network is used for classification problems, for example, it may be difficult to distinguish between classes separated by a small minimum distance, because the neural network requires a long period of time to learn to differentiate between such classes. This obstructs the separation of continuously encoded input data, particularly when some of the examples, relating to different classes, have inputs which differ only very little from one another.
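As a rough illustration of what a small minimum distance between classes means, the following Python sketch computes the smallest distance between two examples that belong to different classes. The use of the Euclidean distance and the function name `min_interclass_distance` are assumptions made for this example; the source does not specify a distance measure.

```python
import math

def min_interclass_distance(examples):
    """Return the smallest Euclidean distance between inputs of different classes.

    examples: list of (input_vector, class_label) pairs.
    Returns math.inf when all examples share the same class.
    """
    best = math.inf
    for a in range(len(examples)):
        for b in range(a + 1, len(examples)):
            (xa, ca), (xb, cb) = examples[a], examples[b]
            if ca != cb:
                best = min(best, math.dist(xa, xb))
    return best

# Usage: two classes whose closest members differ only slightly in one input.
examples = [((0.0, 0.0), "A"), ((0.0, 1.0), "A"),
            ((0.1, 0.0), "B"), ((3.0, 4.0), "B")]
closest = min_interclass_distance(examples)
```

A value of `closest` that is small compared with the spread of the inputs indicates examples of different classes that are nearly identical, which is the situation described above as slow to learn.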
The problem to be solved, therefore, is to reduce the learning time of the neural network while minimizing the supplementary hardware required.