Neural networks may be defined as dynamical systems on networks of processors, where the processors serve as the vertices of the network and pairs of processors that exchange information serve as its links. The dynamical state of a neural network composed of N processors is generally described by quantities Fp(t), p=1, 2, . . . , N, called activations, which account for the level of activation of the processors, and by quantities Dpq(t), p,q=1, 2, . . . , N, called weights, which account for the degree of information exchange between pairs of processors. It is assumed that learning in neural networks can be described by adjustment of the weights, which may be slow compared to adjustment of the activations.
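The separation between fast activations and slow weights described above may be illustrated by a minimal sketch. The text does not specify any update law, so everything below is an assumption made for illustration only: the function name `step`, the explicit Euler step, the tanh squashing function, the Hebbian-style weight rule with decay, and the two step sizes `dt_fast` and `dt_slow`.

```python
import math


def step(F, D, inputs, dt_fast=0.1, dt_slow=0.001):
    """Advance a toy network by one Euler step (illustrative assumptions only).

    F      -- list of N activations F[p]
    D      -- N x N list of weights D[p][q]
    inputs -- list of N external inputs (hypothetical, not from the text)
    """
    n = len(F)

    # Fast dynamics: each activation relaxes toward a squashed weighted sum
    # of the other activations plus its external input.
    new_F = []
    for p in range(n):
        drive = inputs[p] + sum(D[p][q] * F[q] for q in range(n))
        new_F.append(F[p] + dt_fast * (-F[p] + math.tanh(drive)))

    # Slow dynamics: an assumed Hebbian-style rule with decay, computed from
    # the updated activations and scaled by the much smaller step dt_slow.
    new_D = [
        [D[p][q] + dt_slow * (new_F[p] * new_F[q] - D[p][q]) for q in range(n)]
        for p in range(n)
    ]
    return new_F, new_D


if __name__ == "__main__":
    F = [0.0, 0.0]
    D = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(100):
        F, D = step(F, D, inputs=[1.0, 0.0])
    print("activations:", F)
    print("weights:", D)
```

With `dt_slow` much smaller than `dt_fast`, the activations settle well before the weights have moved appreciably, which is the sense in which learning may be slow compared to the adjustment of activations.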
Applications of neural networks to pattern recognition require the networks to discriminate input information with regard to the presence or absence of certain features. One of the most important unsolved problems in pattern recognition, both within the neural network approach and within other approaches such as Artificial Intelligence, is automatic feature extraction.
Various methods are known in the prior art for feature extraction and pattern recognition. One method, known as error back propagation, involves minimization of an error functional which is the sum of the squared differences between the desired and actual outputs of the output processors. The disadvantages of this method include the inability to prove convergence to a global extremum and the slowness of convergence. The slowness arises because the error functional is global, i.e., it is a very complicated expression in which each term of the sum depends on the activations and the weights of all or a large part of the neural network; this, in turn, makes it very difficult to build hardware implementations of the method. In addition, this method requires knowledge of the desired output. Another means of implementing pattern recognition through neural networks, known in the art as the adaptive resonance model, involves a judicious choice of the rates of adjustment of the weights so that the rates resemble the physiology of the brain. The obvious disadvantages of this method are the arbitrariness in the choice of the rates, the lack of predictability of the behavior of the model, and the difficulty of hardware implementation, since the method is presented as a set of abstract differential equations. Methods and apparatus for information processing using analog characteristics of inputs were described in prior patent application Ser. Nos. 026,479 filed 03/16/87 and 113,636 filed 10/27/87.
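The global character of the error functional used in error back propagation may be seen in a small sketch. The functional itself, a sum of squared differences between desired and actual outputs, is taken from the description above. The two-layer network, the tanh squashing function, and the names `forward` and `squared_error` are assumptions introduced purely for illustration and do not represent any particular prior-art implementation.

```python
import math


def forward(x, W1, W2):
    """Outputs of a hypothetical two-layer network with tanh processors."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return [math.tanh(sum(w * h for w, h in zip(row, hidden))) for row in W2]


def squared_error(desired, actual):
    """Sum over output processors of (desired - actual) squared."""
    if len(desired) != len(actual):
        raise ValueError("desired and actual must have the same length")
    return sum((d - a) ** 2 for d, a in zip(desired, actual))


if __name__ == "__main__":
    x = [1.0, 0.5]
    W1 = [[0.2, -0.1], [0.4, 0.3]]   # first-layer weights (arbitrary values)
    W2 = [[0.5, -0.5], [0.1, 0.2]]   # second-layer weights (arbitrary values)
    desired = [1.0, 0.0]             # the method needs this to be known

    print("error:", squared_error(desired, forward(x, W1, W2)))

    # Perturbing a single first-layer weight changes the error contributed by
    # every output processor, which is what makes the functional global.
    W1[0][0] += 0.1
    print("error after perturbation:", squared_error(desired, forward(x, W1, W2)))
```

Because every output passes through every hidden processor, no term of the sum can be evaluated from information local to one pair of processors, and the desired outputs must be supplied from outside the network.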