Neural networks may be described as dynamical systems defined on networks of processors, in which the processors serve as the vertices of the network and the pairs of processors that exchange information serve as its links. The dynamical state of a neural network composed of processors P is generally described by quantities Fp, p = 1, 2, ..., N, called activations, which account for the level of activation of the processors, and by quantities Dpq, p, q = 1, 2, ..., N, called weights, which account for the degree of information exchange between pairs of processors. It is generally assumed that learning in neural networks can be described by an adjustment of the weights that is slow compared to the adjustment of the activations.
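The two-time-scale description above (fast activations Fp, slow weights Dpq) can be illustrated with a minimal sketch. The update rules, the function name `step`, and the rates `fast_rate` and `slow_rate` are hypothetical and are chosen only for illustration; the text does not prescribe any particular dynamics.

```python
import math

def step(F, D, fast_rate=0.5, slow_rate=0.01):
    """One update of a toy network state (F, D).

    F is a list of N activations Fp; D is an N x N list of weights Dpq.
    Both update rules below are illustrative assumptions, not taken
    from the text.
    """
    n = len(F)
    # Fast variable: each activation relaxes toward a squashed weighted input.
    new_F = [
        F[p] + fast_rate * (math.tanh(sum(D[p][q] * F[q] for q in range(n))) - F[p])
        for p in range(n)
    ]
    # Slow variable: each weight drifts toward the product of the two
    # activations it links, at a much smaller rate.
    new_D = [
        [D[p][q] + slow_rate * (new_F[p] * new_F[q] - D[p][q]) for q in range(n)]
        for p in range(n)
    ]
    return new_F, new_D

# Three processors in a chain: 0 <-> 1 <-> 2.
F = [1.0, 0.0, -1.0]
D = [[0.0, 0.5, 0.0], [0.5, 0.0, 0.5], [0.0, 0.5, 0.0]]
F1, D1 = step(F, D)
```

With these assumed rates, a single step moves the activations far more than the weights, which is the sense in which weight adjustment is slow relative to activation adjustment.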
Applications of neural networks to pattern recognition require the networks to discriminate input information with regard to the presence or absence of certain features. One of the most important unsolved problems in pattern recognition, both within the neural network approach and within other approaches such as artificial intelligence, is automatic feature extraction.
Various methods are known in the prior art for neural network feature extraction and pattern recognition. One method, known as back propagation of error, involves minimization of an error functional which is the sum of the squared differences between the desired and actual outputs of the output processors. The disadvantages of this method include the inability to prove convergence to a global extremum and the slowness of the convergence, owing to the fact that the error functional is global; i.e., for practical systems it is a very complicated expression in which each term of the sum depends on most of the activations and weights of the neural network, which, in turn, makes it very difficult to build hardware implementations of the method. In addition, this method requires knowledge of the desired output. Another means of implementing pattern recognition through neural networks, known in the art as the adaptive resonance model, involves a judicious choice of the rates of adjustment of the weights so that the resultant neural network mimics the behavior of the brain. The obvious disadvantages of this method are the arbitrariness in the choice of the rates of adjustment, due to a general lack of understanding of the physiology of the brain; the lack of predictability of the behavior of the model; and the difficulty of hardware implementation, since the method is presented as a set of abstract differential equations. Methods and devices for information processing using analog characteristics of inputs were described in prior patent application Ser. Nos. 026,479, filed 03/16/87, and 113,636, filed 10/27/87, and in application Ser. No. 203,463, filed 6-7-88.
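The error functional minimized in back propagation, as described above, can be sketched as follows. The two-layer network, the tanh nonlinearity, the numerical values, and the names `forward` and `squared_error` are assumptions made for illustration; only the sum-of-squared-differences form comes from the text.

```python
import math

def forward(x, W_hidden, W_out):
    """Outputs of a toy two-layer network (an assumed architecture)."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W_hidden]
    return [math.tanh(sum(w * h for w, h in zip(row, hidden))) for row in W_out]

def squared_error(desired, actual):
    """Sum over output processors of (desired - actual) squared."""
    if len(desired) != len(actual):
        raise ValueError("desired and actual outputs must have the same length")
    return sum((d - a) ** 2 for d, a in zip(desired, actual))

x = [1.0, -1.0]
W_hidden = [[0.5, -0.5], [0.25, 0.75]]
W_out = [[1.0, -1.0], [0.5, 0.5]]
desired = [1.0, 0.0]  # the method requires the desired output to be known

base = squared_error(desired, forward(x, W_hidden, W_out))

# Perturb a single hidden-layer weight and recompute the error.
W_perturbed = [row[:] for row in W_hidden]
W_perturbed[0][0] += 0.1
perturbed = squared_error(desired, forward(x, W_perturbed, W_out))
```

In this sketch, changing one hidden-layer weight alters every output and therefore every term of the sum, which is the sense in which the error functional is global.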