1. Field of the Invention
The present invention relates to signal processors, and in particular, to neural networks used in signal processors for pattern recognition.
2. Description of the Related Art
In recent years, neural networks have become more important in the implementation of intelligent and real time systems. Neural networks have a large number of simple processing modules among which processing is distributed and performed simultaneously. The large number of interconnections between the processing modules serves as the memory for the neural network.
The importance of neural networks is particularly evident in the field of pattern recognition, and still more particularly in the field of recognition of handwritten digits. Pattern recognition, and especially recognition of handwritten digits, is important in a number of fields and disciplines. As one very important example, the United States Postal Service has a critical need for a fast and highly accurate handwritten digit recognition apparatus or method for automated recognition of handwritten zip codes on pieces of mail. Neural networks are particularly attractive in this area due to their self-organization, i.e., learning capabilities.
One neural network which is useful for implementing an adaptive pattern recognition network is structured according to a "competitive learning model" ("CLM") and is illustrated in FIG. 1. The CLM network 10 has two sets of nodes, or "neurons," coupled together via an adaptive weight matrix.
The first set 12 of neurons is referred to as the "F1 layer" and contains the input neurons 14a-14m. The second set 13 of neurons is referred to as the "F2 layer" and contains the output neurons 22a-22n. Each of the input neurons 14a-14m respectively receives an input signal I.sub.1 -I.sub.M. These input signals I.sub.1 -I.sub.M represent the image or pattern information sought to be recognized. For example, the input signals I.sub.1 -I.sub.M can be signals representing individual pixels from the image or pattern sought to be recognized (where M=the number of pixels).
The input subject pattern information I.sub.i is typically pixel-mapped information. The subject pattern sought to be recognized is typically first captured as a video image by a conventional video camera and frame grabber (not shown), wherein signal thresholds are used to adapt any gray level information into black or white information. This black and white information is scaled both horizontally and vertically to fit within a video frame of a preselected size. The video frame can be of virtually any size, with common sizes being 16.times.16 or 8.times.8 pixels. The horizontal scaling left justifies the pattern information and is proportioned to the vertical scaling to prevent excessive distortion of the scaled, pixel-mapped subject pattern information.
The scaled, pixel-mapped subject pattern information is "skeletonized" to eliminate unnecessary pixels within the subject pattern until only absolutely necessary subject pattern lines, i.e., its "skeleton," remain. This reduces broad lines or strokes to thin lines or strokes. This skeletonized, pixel-mapped subject pattern information is outputted as the pattern information I.sub.i.
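The thresholding portion of the preprocessing described above can be sketched as follows. This is a minimal Python illustration only; the function name binarize, the 8-bit gray scale, the threshold value and the polarity (dark ink mapped to one) are all assumptions, and the scaling and skeletonizing steps are not shown.

```python
def binarize(frame, threshold=128):
    """Map each gray level in a 2-D frame to 1 (black) or 0 (white).

    Assumption: 8-bit gray levels in which darker ink has a lower value.
    The threshold and the polarity are illustrative choices only.
    """
    return [[1 if pixel < threshold else 0 for pixel in row] for row in frame]
```

The flattened rows of the resulting frame would then supply the values I.sub.1 -I.sub.M, one per pixel.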
The first input neuron 14a divides or replicates its input signal I.sub.1 into multiple pattern signals 16aa-16an for coupling to the adaptive weight matrix 18. Each pattern signal 16aa-16an is individually coupled to its own respective matrix element, or "weight," 18aa-18an. Each pattern signal 16aa-16an is weighted, e.g., multiplied, by its respective matrix element 18aa-18an to produce weighted pattern signals 20aa-20an. These weighted pattern signals 20aa-20an are coupled respectively to output neurons 22a-22n.
The input signal I.sub.1 can be an analog voltage or current, with the pattern signals 16aa-16an also being analog voltages or currents. Each of the analog pattern signals 16aa-16an can be non-scaled, i.e., equal to the value of the input signal I.sub.1 (e.g., by using voltage followers or current mirrors); or each can be scaled, e.g., have a value equal to the value of the input signal I.sub.1 divided by K, where K is a number which can be arbitrarily selected as a scaling factor. Alternatively, the input signal I.sub.1 can be a digital signal (e.g., a single binary bit), with each of the pattern signals 16aa-16an being an identically valued digital signal.
As will be recognized by one of ordinary skill in the art, if the input signal I.sub.1 and the pattern signals 16aa-16an are analog voltages or currents, the matrix elements 18aa-18an can consist of resistors or appropriately biased transistors, or combinations thereof. Such components can be coupled together by means known in the art to perform amplification or attenuation of the voltages or currents for the weighting of the pattern signals 16aa-16an, as described above.
As will further be recognized by one of ordinary skill in the art, if the input signal I.sub.1 and the pattern signals 16aa-16an are digital signals, the matrix elements 18aa-18an can consist of appropriate digital logic circuits (e.g., digital adders, digital dividers). Such components can be coupled together by means known in the art to perform multiplication or division of the digital signals for the weighting of the pattern signals 16aa-16an, as described above.
The first output neuron 22a receives and sums together its weighted pattern signals 20aa-20ma to produce one output signal V.sub.1. If these weighted pattern signals 20aa-20ma are analog electrical currents, the output neuron 22a can simply consist of a current summing node. If these weighted pattern signals 20aa-20ma are digital bits, the output neuron 22a can be a digital adder circuit with the output signal V.sub.1 being a digital signal representing the summation thereof.
The foregoing signal processing is similarly performed with the remaining input neurons 14b-14m, matrix elements 18ba-18mn within the adaptive weight matrix 18 and output neurons 22b-22n.
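The weighting and summing described above amount to a matrix-vector product. The following Python sketch illustrates this for a digital implementation; the function name output_signals and the nested-list layout weights[i][j] (input neuron i, output neuron j) are assumptions made for illustration.

```python
def output_signals(inputs, weights):
    """Compute one summed output per output neuron.

    inputs     -- the M input signals I_1..I_M
    weights    -- weights[i][j] is the matrix element coupling input
                  neuron i to output neuron j (hypothetical layout)
    """
    n_outputs = len(weights[0])
    return [
        sum(signal * row[j] for signal, row in zip(inputs, weights))
        for j in range(n_outputs)
    ]
```

For example, with three input neurons and two output neurons, output_signals([1, 0, 1], [[0.5, 0.25], [1.0, 1.0], [0.25, 0.125]]) sums only the weights in the first and third rows, since the second input signal is zero.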
In the CLM network 10 only the largest of the output neuron output signals V.sub.1 -V.sub.N is used, hence the name "competitive learning." The output neurons 22a-22n compete to produce an output signal V.sub.j to be selected for use in classifying the input pattern information as represented by the input pattern signals I.sub.1 -I.sub.M.
The matrix elements, or adaptive weights, within the adaptive weight matrix 18 which abut the winning output neuron are modified, i.e., they "learn." For example, if the "jth" output neuron 22j is the winning node, as described above, the adaptive weight vector Z.sub.j =(Z.sub.1j, Z.sub.2j, . . . , Z.sub.mj) representing the adaptive weights within the adaptive weight matrix 18 which abut the winning node 22j is modified. This modification is done to reduce the error between itself and the input pattern information signals I.sub.1 -I.sub.M (i.e., pattern signals 16aj-16mj). For example, this modification for each matrix element can be represented by the following formula: ##EQU1## where: .epsilon.=error between the adaptive weight vector Z.sub.j and the pattern signal vector X.sub.j. ##EQU2##
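The formulas referenced above are not reproduced in this text. The following Python sketch therefore assumes the commonly used competitive learning rule, in which each adaptive weight abutting the winning node moves a fixed fraction of the way toward its pattern signal. The function name clm_step and the parameter rate are hypothetical, and the rule shown may differ in detail from the formula intended above.

```python
def clm_step(inputs, weights, rate=0.5):
    """Select the winning output neuron and adapt only its weights.

    weights[i][j] couples input neuron i to output neuron j.
    Assumed rule: Z_ij <- Z_ij + rate * (X_i - Z_ij) for the winner j.
    """
    n_outputs = len(weights[0])
    scores = [
        sum(x * row[j] for x, row in zip(inputs, weights))
        for j in range(n_outputs)
    ]
    # Ties go to the lowest-numbered neuron (an arbitrary choice).
    winner = max(range(n_outputs), key=lambda j: scores[j])
    for i, x in enumerate(inputs):
        weights[i][winner] += rate * (x - weights[i][winner])
    return winner
```

Only the column of the weight matrix belonging to the winning neuron changes; all other columns are left as they were.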
A CLM network 10 works well to classify, or "cluster," input patterns if the input patterns do not form too many clusters relative to the number of output neurons 22a-22n in the F2 layer 13. If the number of clusters does not exceed the number of output neurons 22a-22n, the adaptive weight matrix 18 eventually stabilizes with respect to its learning process and produces a good distribution of adaptive weight vectors for classifying input patterns.
However, a CLM network 10 does not always learn and maintain a temporally stable adaptive weight matrix 18. Changes over time in the probabilities of input patterns or in the sequencing of input patterns can "wash away" prior learning by the adaptive weight matrix 18, thereby producing memory washout.
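Memory washout can be illustrated with a single adaptive weight vector. The Python sketch below assumes the same hypothetical update rule as a conventional competitive learning network (each weight moves a fraction of the way toward the current input); the patterns and the learning rate are invented for illustration.

```python
def update(weights, pattern, rate=0.5):
    """Assumed rule: move each weight a fraction of the way to the input."""
    return [z + rate * (x - z) for z, x in zip(weights, pattern)]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5


pattern_a = [1.0, 1.0, 0.0, 0.0]
pattern_b = [0.0, 0.0, 1.0, 1.0]

z = list(pattern_a)          # weight vector that has fully learned pattern A
for _ in range(10):          # the input statistics then shift to pattern B
    z = update(z, pattern_b)

# z now lies very close to pattern B; what was learned about A is gone.
```

After the loop the weight vector is nearly indistinguishable from pattern B, so a later presentation of pattern A would no longer be matched by this neuron.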
To overcome these limitations of a CLM network, another network has been developed according to a model suggested by a theory referred to as "adaptive resonance theory" ("ART"). An embodiment of such a network is illustrated in FIG. 2.
The ART network 40, similar to the CLM network 10, has an F1 layer 42 and an F2 layer 44 coupled via an adaptive weight matrix 46. This adaptive weight matrix 46, having matrix coefficients Z.sub.ij, is referred to as a "bottom-up" adaptive weight matrix.
Input pattern information I.sub.i is received by the F1 layer 42 and is transformed to pattern information signals X.sub.i 50 which are coupled to the bottom-up adaptive weight matrix 46, a top-down adaptive weight matrix 52 (having coefficients Z.sub.ji) and a pattern signal summer 54. The F2 layer 44 receives the weighted pattern signals 56 and transforms them to output signals V.sub.j 58, as described above for the CLM network 10 in FIG. 1. These output signals 58 are coupled into a pattern classifier 60 wherein the highest valued output signal V.sub.jm is selected and used (as described below) to select pattern data corresponding thereto which is stored in a pattern memory 62. For a digital implementation the selected output signal V.sub.jm is set to a logical one and all other output signals V.sub.j are set to logical zeroes.
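For the digital implementation mentioned above, the selection performed by the pattern classifier can be sketched in Python as follows. The function name classify is hypothetical, and resolving ties in favor of the lowest-numbered output neuron is an assumption not stated above.

```python
def classify(outputs):
    """Return a one-hot list: 1 for the highest valued output, 0 elsewhere."""
    # Assumption: ties are resolved in favor of the lowest index.
    winner = max(range(len(outputs)), key=lambda j: outputs[j])
    return [1 if j == winner else 0 for j in range(len(outputs))]
```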
The pattern signals X.sub.i 50 are multiplied by the coefficients Z.sub.ji of the top-down adaptive weight matrix 52 and summed together by the top-down summer 64. The result 66 of this summation and the result 68 of the summation of the pattern signals X.sub.i 50 by the pattern signal summer 54 are coupled into a vigilance tester 70. The vigilance tester 70 divides the result 66 produced by the top-down summer 64 by the result 68 produced by the pattern signal summer 54 to produce (internally to the vigilance tester 70) a computed vigilance parameter P.sub.cjm. This computed vigilance parameter P.sub.cjm corresponds to the output neuron within the F2 layer 44 producing the highest valued output signal V.sub.jm. This is assured since the pattern classifier 60 dictates, via an interface 72, that the coefficients Z.sub.ji which correspond to the output neuron producing the maximum output signal V.sub.jm are used.
The vigilance tester 70 compares the computed vigilance parameter P.sub.cjm with a reference vigilance parameter P.sub.r. If the computed vigilance parameter P.sub.cjm does not equal or exceed the reference vigilance parameter P.sub.r, the vigilance tester 70 outputs a disablement command as its F2 layer interface signal 74 to disable the output neuron within the F2 layer 44 corresponding to that particular computed vigilance parameter P.sub.cjm. The second highest output signal V.sub.j is then selected by the pattern classifier 60 and a new computed vigilance parameter P.sub.cjm is computed, using the appropriate coefficients Z.sub.ji of the top-down adaptive weight matrix 52.
When a computed vigilance parameter P.sub.cjm results which equals or exceeds the reference vigilance parameter P.sub.r, the vigilance tester 70 causes the coefficients Z.sub.ij of the bottom-up adaptive weight matrix 46 and the coefficients Z.sub.ji of the top-down adaptive weight matrix 52 to change, or "learn," in accordance with the pattern information signals X.sub.i 50 (as explained more fully below). This "learning" is effected via learning enablement signals 76, 78 outputted by the vigilance tester 70 to the bottom-up 46 and top-down 52 adaptive weight matrices.
When a computed vigilance parameter P.sub.cjm which equals or exceeds the reference vigilance parameter P.sub.r has been computed as described above, the output signal V.sub.jm corresponding thereto is selected by the pattern classifier 60 and used to select pattern data from the pattern memory 62 for outputting therefrom. This selected pattern data represents the pattern recognized as corresponding to the input pattern information I.sub.i. The vigilance tester 70 then outputs an enablement command as its F2 layer interface signal 74 to re-enable any of the output neurons within the F2 layer 44 which had been previously disabled, and the foregoing process is repeated for the next pattern.
A simplified flow chart illustrating the foregoing operational description of the ART network 40 is illustrated in FIG. 3. The first step 80 is to initialize the values of: L (a preselected convergence parameter, as discussed below); M (equal to the number of input neurons and the number of pixels representing the input pattern); N (equal to the number of output neurons and the number of patterns sought to be recognized); P.sub.r (reference vigilance parameter); Z.sub.ij (0) (bottom-up matrix coefficients at time t=0); Z.sub.ji (0) (top-down matrix coefficients at time t=0).
The value for L can be any arbitrary value greater than one, i.e., L>1 (described more fully below). The values for M and N are dependent upon and selected to be equal to the numbers of pixels representing the input pattern and patterns sought to be recognized (e.g., clusters), respectively. The value for the reference vigilance parameter P.sub.r is initialized to have a value between zero and one (i.e., 0<P.sub.r <1) as described more fully below. The values for the top-down matrix coefficients Z.sub.ji (0) are all initialized to have a value of one (i.e., Z.sub.ji (0)=1). The values for the bottom-up matrix coefficients Z.sub.ij (0) are initialized to have values between zero and the quotient L/(L-1+M), i.e., according to the following formula:
$$0 < Z_{ij}(0) < \frac{L}{L-1+M}$$
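The initialization step 80 can be sketched in Python as follows. The function name init_weights, the default L=2 and the parameter fraction (which places every bottom-up coefficient at the same point inside the permitted interval) are assumptions; the text above requires only that each bottom-up coefficient lie between zero and L/(L-1+M).

```python
def init_weights(m, n, l=2.0, fraction=0.5):
    """Return (bottom_up, top_down) coefficient matrices at time t=0.

    bottom_up[i][j] corresponds to Z_ij(0): input neuron i, output neuron j.
    top_down[j][i] corresponds to Z_ji(0): output neuron j, input neuron i.
    """
    if l <= 1:
        raise ValueError("L must be greater than one")
    if not 0 < fraction < 1:
        raise ValueError("fraction must lie strictly between 0 and 1")
    bound = l / (l - 1 + m)
    # Assumption: every bottom-up coefficient starts at the same value.
    bottom_up = [[fraction * bound for _ in range(n)] for _ in range(m)]
    top_down = [[1 for _ in range(m)] for _ in range(n)]
    return bottom_up, top_down
```

For an 8.times.8 video frame, M=64, so with L=2 the upper bound on each bottom-up coefficient is 2/65.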
The next step 82 is to input the input pattern information I.sub.i. These values I.sub.i are inputted into the F1 layer 42 as described above.
The next step 84 is to compute matching scores, namely, the values for the signals V.sub.j outputted by the output neurons. The values for the respective outputs V.sub.j, as described above, are computed by summing the respective inputs into each one of the output neurons. Mathematically, this operational step 84 may be written according to the following formula:
$$V_j = \sum_{i=1}^{M} Z_{ij} X_i$$
The next step 86 is to select the best match exemplar, namely, the highest valued output signal V.sub.jm from the output neurons. Mathematically, this step may be summarized according to the following formula:
$$V_{jm} = \max_{j} \{ V_j \}$$
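Steps 84 and 86 can be sketched together in Python as follows. The function names matching_scores and best_match, the nested-list layout bottom_up[i][j] and the disabled argument (a set of output neuron indices to skip) are assumptions made for illustration.

```python
def matching_scores(x, bottom_up):
    """Step 84: V_j is the sum over i of Z_ij * X_i."""
    n_outputs = len(bottom_up[0])
    return [
        sum(row[j] * xi for row, xi in zip(bottom_up, x))
        for j in range(n_outputs)
    ]


def best_match(scores, disabled=frozenset()):
    """Step 86: index of the highest score among enabled output neurons.

    Returns None when every output neuron is disabled.
    """
    candidates = [j for j in range(len(scores)) if j not in disabled]
    if not candidates:
        return None
    return max(candidates, key=lambda j: scores[j])
```

Passing the indices of disabled output neurons lets the stored scores be reused when the next best exemplar must be selected.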
The next step 88 is to compute the vigilance parameter P.sub.cjm. This computation is performed by summing the products of the pattern signals X.sub.i and the corresponding top-down matrix coefficients Z.sub.ji and dividing the result thereof by the summation of the pattern signals X.sub.i. Mathematically, this operational step may be summarized according to the following formula:
$$P_{cjm} = \frac{\sum_{i=1}^{M} Z_{ji} X_i}{\sum_{i=1}^{M} X_i}$$
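The computation of step 88 can be sketched in Python as follows. The function name vigilance is hypothetical, and raising an error for an all-zero pattern is an assumption, since the text above does not address that case.

```python
def vigilance(x, top_down_row):
    """Step 88: sum of Z_ji * X_i divided by the sum of X_i.

    x            -- pattern signals X_1..X_M
    top_down_row -- coefficients Z_ji for the selected output neuron j
    """
    total = sum(x)
    if total == 0:
        # Assumption: a blank pattern is treated as an error.
        raise ValueError("pattern has no active pixels")
    return sum(z * xi for z, xi in zip(top_down_row, x)) / total
```

For binary patterns and binary top-down coefficients, the result is the fraction of active input pixels that the selected exemplar also contains.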
The next step 90 is to compare the computed vigilance parameter P.sub.cjm against the reference vigilance parameter P.sub.r. If the computed vigilance parameter P.sub.cjm does not equal or exceed the reference vigilance parameter P.sub.r, the output neuron corresponding to the computed vigilance parameter P.sub.cjm is disabled, i.e., its output signal V.sub.jm is temporarily set to zero (i.e., V.sub.jm =0), and the operation resumes by repeating several of the foregoing steps.
If the computed matching scores V.sub.j were stored, operation resumes with the step 86 of selecting the next best matching exemplar. If the computed matching scores V.sub.j had not been saved, operation resumes with the step 84 of re-computing the matching scores V.sub.j, while omitting the output V.sub.j from the output neuron disabled in the disabling step 92.
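The search through successively lower matching scores, for the case in which the scores have been stored, can be sketched in Python as follows. The function name search and the use of a local set to record disabled output neurons are assumptions made for illustration.

```python
def search(x, scores, top_down, p_ref):
    """Return the first output neuron, in descending score order, whose
    computed vigilance parameter equals or exceeds p_ref, else None.

    scores      -- stored matching scores V_j (step 84)
    top_down    -- top_down[j][i] corresponds to Z_ji
    """
    total = sum(x)
    disabled = set()
    while len(disabled) < len(scores):
        # Step 86: best remaining exemplar.
        j = max(
            (k for k in range(len(scores)) if k not in disabled),
            key=lambda k: scores[k],
        )
        # Steps 88 and 90: compute and compare the vigilance parameter.
        p = sum(z * xi for z, xi in zip(top_down[j], x)) / total
        if p >= p_ref:
            return j
        disabled.add(j)  # step 92: disable and try the next best match
    return None
```

Because the disabled set is local to the function, every output neuron is enabled again on the next call.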
When the computed vigilance parameter P.sub.cjm equals or exceeds the reference vigilance parameter P.sub.r, the next step 94 is to adapt the best matching exemplar. This is done by modifying the bottom-up Z.sub.ij and top-down Z.sub.ji matrix coefficients in accordance with the pattern signals X.sub.i representing the input pattern. This modifying (i.e., "learning") is performed in accordance with the following mathematical formulas: ##EQU7##
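The formulas referenced above are not reproduced in this text. The following Python sketch therefore assumes the fast-learning update commonly given for adaptive resonance networks with binary inputs: each top-down coefficient is multiplied by its pattern signal, and each bottom-up coefficient becomes L times the new top-down coefficient divided by (L-1) plus the sum of the products of the prior top-down coefficients and the pattern signals. This assumption is consistent with the initial bound L/(L-1+M) stated earlier but may differ from the formulas intended above. The function name adapt is hypothetical.

```python
def adapt(x, bottom_up, top_down, j, l=2.0):
    """Step 94 under an ASSUMED update rule (see the note above).

    top-down:  Z_ji <- Z_ji * X_i
    bottom-up: Z_ij <- L * Z_ji * X_i / (L - 1 + sum_i Z_ji * X_i)
    Only the coefficients of output neuron j are modified, in place.
    """
    # The sum uses the top-down coefficients as they were before the update.
    overlap = sum(z * xi for z, xi in zip(top_down[j], x))
    for i, xi in enumerate(x):
        top_down[j][i] = top_down[j][i] * xi
        bottom_up[i][j] = l * top_down[j][i] / (l - 1 + overlap)
```

Under this assumed rule the exemplar stored in the top-down coefficients can only lose pixels over time, never gain them.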
The last step 96 is to re-enable any output neurons which may have been disabled in the disabling step 92 following the step 90 of comparing the computed vigilance parameter P.sub.cjm with the reference vigilance parameter P.sub.r. Once any disabled output neurons have been re-enabled, operation resumes with the step 82 of inputting new pattern information I.sub.i.
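The flow of steps 84 through 96 for a single input pattern can be sketched in Python as follows. The function name art_present is hypothetical, and the learning rule in step 94 is an assumed fast-learning update commonly given for adaptive resonance networks with binary inputs, not a formula taken from this text.

```python
def art_present(x, bottom_up, top_down, p_ref, l=2.0):
    """Present one binary pattern; return the adapted neuron or None.

    bottom_up[i][j] corresponds to Z_ij, top_down[j][i] to Z_ji.
    Both matrices are modified in place when a match is accepted.
    """
    m, n = len(x), len(top_down)
    total = sum(x)
    if total == 0:
        raise ValueError("pattern has no active pixels")
    # Step 84: matching scores, stored for reuse during the search.
    scores = [sum(bottom_up[i][j] * x[i] for i in range(m)) for j in range(n)]
    disabled = set()
    while len(disabled) < n:
        # Step 86: best match among the enabled output neurons.
        j = max((k for k in range(n) if k not in disabled),
                key=lambda k: scores[k])
        # Steps 88 and 90: compute and test the vigilance parameter.
        overlap = sum(top_down[j][i] * x[i] for i in range(m))
        if overlap / total >= p_ref:
            # Step 94 (ASSUMED rule): adapt the best matching exemplar.
            for i in range(m):
                top_down[j][i] = top_down[j][i] * x[i]
                bottom_up[i][j] = l * top_down[j][i] / (l - 1 + overlap)
            # Step 96 is implicit: the disabled set is discarded on return.
            return j
        disabled.add(j)  # step 92
    return None  # no match and no uncommitted neuron: learning is inhibited
```

Calling art_present once per input pattern, with the same two matrices each time, corresponds to the return from step 96 to step 82.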
Summarizing, the input pattern information I.sub.i activates the F1 layer 42. The information or signals X.sub.i produced by the F1 layer 42, in turn, activate the F2 layer 44, producing an output signal V.sub.j corresponding to the output neuron therein receiving the largest total signal from the F1 layer 42, as described above. The top-down matrix coefficients Z.sub.ji corresponding to this output neuron represent a learned "expectation" with respect to the input pattern information I.sub.i. If this expectation, as represented by the computed vigilance parameter P.sub.cjm, is not met, i.e., does not equal or exceed the reference vigilance parameter P.sub.r, then that particular output neuron's signal is ignored and the remaining output neurons' signals are examined to see if a match exists. If the expectation is met, i.e., the computed vigilance parameter P.sub.cjm equals or exceeds the reference vigilance parameter P.sub.r, then the corresponding bottom-up Z.sub.ij and top-down Z.sub.ji matrix coefficients are adjusted in accordance with the input information I.sub.i, as represented by the pattern signals X.sub.i, which has thereby been found to closely match the expectation represented by the top-down matrix coefficients Z.sub.ji.
Thus, the ART network 40 allows "learning" (i.e., alteration of its adaptive weight matrices' coefficients) to occur only if the input pattern information I.sub.i is sufficiently similar to any of its learned expectations. If the input information I.sub.i is not sufficiently similar, no "learning" takes place.
However, if examination of the input information I.sub.i results in the selection of an uncommitted output neuron within the F2 layer 44, the bottom-up Z.sub.ij and top-down Z.sub.ji matrix coefficients corresponding to this previously uncommitted output neuron "learn" accordingly, as described above. Further however, if the full capacity of the ART network 40 has been exhausted, i.e., no further uncommitted output neurons exist within the F2 layer 44, and no match exists with any committed output neurons, learning is inhibited.
Thus, the ART network 40 prevents memory washout by not allowing its adaptive weight matrices to be altered unless the input pattern information closely matches the learned expectations already represented thereby. However, because the ART network 40 has a fixed reference vigilance parameter, only "perfect" input pattern information, i.e., input pattern information producing a computed vigilance parameter equaling or exceeding the reference vigilance parameter, will be recognized and allowed to instigate further learning by the adaptive weight matrices. Input pattern information which is less than "perfect" will be clustered via new, previously uncommitted output neurons until no further uncommitted output neurons exist.
An example of this operation of an ART network 40 resulting in clusters, or "exemplars," of patterns is shown in FIG. 4. Three patterns ("C," "E" and "F") inputted for recognition are shown in the left column. The resulting output patterns, after each input pattern has been applied, are shown in the right column. These output patterns resulted when an 8.times.8 video frame (i.e., 64 pixels) was used with a reference vigilance parameter P.sub.r of 0.9. This was sufficient to create separate exemplar patterns for each letter.
The network 40 correctly recognized "C," "E" and "F" as they were inputted sequentially. The fourth input pattern, a "noisy" "F" with a missing pixel in its upper line, was correctly classified as an "F." However, the fifth input pattern, another noisy "F" with a missing pixel in its left line, was considered different and a new exemplar was created. Creation of further exemplars for noisy "F" inputs will occur, leading to a growth of noisy "F" exemplars.
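The behavior described above, in which one noisy "F" is accepted and the next is rejected, can be reproduced in a small Python sketch. The pixel patterns of FIG. 4 are not available here, so the sketch assumes a hypothetical "F" made of ten active pixels, represents each pattern as a set of active pixel indices, and assumes that an accepted pattern reduces the stored exemplar to the pixels the two have in common.

```python
def vigilance(pattern, exemplar):
    """Fraction of the pattern's active pixels also present in the exemplar."""
    return len(pattern & exemplar) / len(pattern)


P_REF = 0.9                        # reference vigilance parameter

clean_f = set(range(10))           # hypothetical 10-pixel "F"
noisy_f_1 = clean_f - {0}          # one pixel missing
noisy_f_2 = clean_f - {5}          # a different pixel missing

exemplar = set(clean_f)

# First noisy pattern: all nine of its pixels are in the exemplar.
first = vigilance(noisy_f_1, exemplar)       # 9/9
accepted_1 = first >= P_REF
if accepted_1:
    exemplar &= noisy_f_1                    # assumed learning: keep common pixels

# Second noisy pattern: the exemplar has now lost pixel 0, so only eight
# of this pattern's nine pixels are matched.
second = vigilance(noisy_f_2, exemplar)      # 8/9
accepted_2 = second >= P_REF
```

Under these assumptions the first noisy pattern passes the vigilance test and erodes the exemplar, after which the second noisy pattern falls just short of the reference value of 0.9 and would be assigned to a new exemplar.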
Therefore, although the ART network 40 prevents the memory washout associated with CLM networks, its associated problems include recognition of only "perfect" patterns and overloading of the system with excessive clustering of less than perfect input patterns. As illustrated by the example of FIG. 4, even a small amount of noise can cause serious problems, and even with no noise the reference vigilance parameter P.sub.r can be set such that two patterns which are most similar will be "recognized" as different.
Thus, it would be desirable to have a neural network capable of preventing memory washout, allowing recognition of closely matching, albeit less than perfect, input pattern information and preventing system overload by excessive clustering of non-perfect input pattern information.
Further background information on neural networks suggested by the adaptive resonance theory may be found in: "Neural Networks and Natural Intelligence," by Stephen Grossberg and Gail A. Carpenter, Library of Congress, 1988, pages 251-312; "The ART of Adaptive Pattern Recognition By a Self-Organizing Neural Network," by Gail A. Carpenter and Stephen Grossberg, I.E.E.E. Computer Magazine, March 1988, pages 77-88; and "An Introduction to Computing With Neural Nets," by R. P. Lippmann, I.E.E.E. A.S.S.P. Magazine, April 1987, pages 4-22.