Adaptive Resonance Theory (ART) architectures are neural networks that carry out stable self-organization of recognition codes for arbitrary sequences of input patterns. Adaptive Resonance Theory first emerged from an analysis of the instabilities inherent in feedforward adaptive coding structures (Grossberg, 1976a). More recent work has led to the development of two classes of ART neural network architectures, specified as systems of differential equations. The first class, ART 1, self-organizes recognition categories for arbitrary sequences of binary input patterns (Carpenter and Grossberg, 1987a and U.S. patent application Ser. No. 07/086,732, filed Jul. 23, 1987). A second class, ART 2, does the same for either binary or analog inputs (Carpenter and Grossberg, 1987b and U.S. Pat. No. 4,914,708).
Both ART 1 and ART 2 use a maximally compressed, or choice, pattern recognition code. Such a code is a limiting case of the partially compressed recognition codes that are typically used in explanations by ART of biological data (Grossberg, 1982a, 1987). Partially compressed recognition codes have been mathematically analysed in models for competitive learning, also called self-organizing feature maps, which are incorporated into ART models as part of their bottom-up dynamics (Grossberg 1976a, 1982a; Kohonen, 1984). Maximally compressed codes were used in ART 1 and ART 2 to enable a rigorous analysis to be made of how the bottom-up and top-down dynamics of ART systems can be joined together in a real-time self-organizing system capable of learning a stable pattern recognition code in response to an arbitrary sequence of input patterns. These results provide a computational foundation for designing ART systems capable of stably learning partially compressed recognition codes. The present invention contributes to such a design.
The main elements of a typical ART 1 module are illustrated in FIG. 1. F.sub.1 and F.sub.2 are fields of network nodes. An input is initially represented as a pattern of activity across the nodes of feature representation field F.sub.1. The pattern of activity across category representation field F.sub.2 corresponds to the category representation. Because patterns of activity in both fields may persist after input offset (termination of the input) yet may also be quickly inhibited, these patterns are called short term memory, or STM, representations. The two fields, linked by bottom-up adaptive filter 22 and top-down adaptive filter 24, constitute the Attentional Subsystem. Because the connection weights defining the adaptive filters may be modified by inputs and may persist for very long times after input offset, these connection weights are called long term memory, or LTM, variables.
Each node of F.sub.1 is coupled to each node of F.sub.2 through a weighted connection in the adaptive filter 22. Those weights change with learning. Thus, selection of a category node in F.sub.2 is determined by the nodes which are activated by an input pattern and the weights from those nodes to F.sub.2. Each node of F.sub.2 is in turn connected to each node of F.sub.1 through weighted connections of the adaptive filter 24. Those weights are also learned. The learned weights define a template pattern from a selected category, and that pattern is received at the nodes of F.sub.1 through the adaptive filter 24. The intersection of the input pattern from input 20 and the template received through the adaptive filter 24 is activated as a matching pattern in F.sub.1. The norm of the matching pattern is compared to the norm of the input pattern at 26. If the comparison exceeds a threshold vigilance parameter .rho., the system is allowed to resonate and the adaptive filters 22 and 24 adjust their weights in accordance with the matching pattern. On the other hand, if the comparison does not exceed the vigilance parameter threshold, F.sub.2 is reset and a different category is selected. Prior to receipt of the template pattern through the adaptive filter 24, a gain control signal, gain 1, activates all nodes which receive the input pattern. This is an implementation of the 2/3 Rule. Similarly, gain 2 activates F.sub.2. Offset of the input pattern triggers offset of gain 2 and causes rapid decay of short term memory at F.sub.2. F.sub.2 is thereby prepared to encode the next input pattern without bias.
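The matching test performed at 26 can be illustrated with a minimal Python sketch. It is not the differential-equation system of the cited references. It assumes binary patterns stored as lists of 0 and 1, takes the norm of a binary pattern to be its count of active nodes, and treats a ratio equal to .rho. as a pass. The function names match_ratio and passes_vigilance are hypothetical and chosen only for illustration.

```python
def match_ratio(input_pattern, template):
    """Return |I AND template| / |I| for two equal-length binary lists.

    Assumption: the norm of a binary pattern is its number of active nodes.
    """
    # The matching pattern is active only where both input and template are active.
    matched = [i & t for i, t in zip(input_pattern, template)]
    norm_input = sum(input_pattern)
    if norm_input == 0:
        # No active input nodes: the ratio is undefined, so report no match.
        return 0.0
    return sum(matched) / norm_input


def passes_vigilance(input_pattern, template, rho):
    """True when the match is good enough to allow resonance.

    Assumption: a ratio exactly equal to rho counts as a pass.
    """
    return match_ratio(input_pattern, template) >= rho


I = [1, 1, 0, 1]
V = [1, 0, 0, 1]
print(match_ratio(I, V))            # two of the three active input nodes match
print(passes_vigilance(I, V, 0.6))  # resonance allowed at low vigilance
print(passes_vigilance(I, V, 0.8))  # reset at high vigilance
```

In this sketch, raising the hypothetical rho argument makes the same input and template pair fail the test, which corresponds to the reset of F.sub.2 described above.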
FIG. 2 illustrates a typical ART search cycle. An input pattern I at 20 (FIG. 1) registers itself as a pattern X of activity across F.sub.1 (FIG. 2a). The F.sub.1 output signal vector S is then transmitted through the multiple converging and diverging weighted adaptive filter pathways 22 emanating from F.sub.1, sending a net input signal vector T to F.sub.2. The internal competitive dynamics of F.sub.2 contrast-enhance T. The F.sub.2 activity vector Y therefore registers a compressed representation of the filtered F.sub.1 .fwdarw.F.sub.2 input and corresponds to a category representation for the input active at F.sub.1. Vector Y generates a signal vector U that is sent top-down through the second adaptive filter 24, giving rise to a net top-down signal vector V to F.sub.1 (FIG. 2b). F.sub.1 now receives two input vectors, I and V. An ART system is designed to carry out a matching process whereby the original activity pattern X due to input pattern I may be modified by the template pattern V that is associated with the currently active category. If I and V are not sufficiently similar according to a matching criterion established by a dimensionless vigilance parameter .rho. at 26, a reset signal quickly and enduringly shuts off the active category representation (FIG. 2c), allowing a new category to become active. Search ensues (FIG. 2d) until either an adequate match is made or a new category is established.
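The search cycle of FIG. 2 can be approximated, for binary inputs, by the following Python sketch. It is an algorithmic simplification under stated assumptions rather than the real-time system itself: the choice made at F.sub.2 is reduced to picking the single highest-scoring category, the score overlap / (beta + template size) is an assumed form of the bottom-up filter, learning is taken to be fast (the template jumps directly to the matching pattern), and the names art1_search, templates and beta are hypothetical.

```python
def art1_search(input_pattern, templates, rho, beta=0.5):
    """Present one binary input; return the index of the category that codes it.

    `templates` is a list of binary lists and is modified in place.
    Assumptions (not from the source text): winner-take-all choice at F2,
    choice score overlap / (beta + |template|), and fast learning.
    """
    norm_input = sum(input_pattern)

    def overlap(template):
        return sum(i & w for i, w in zip(input_pattern, template))

    reset = set()  # categories shut off for the rest of this search
    while len(reset) < len(templates):
        candidates = [j for j in range(len(templates)) if j not in reset]
        # Bottom-up choice: highest score among categories not yet reset.
        j = max(candidates,
                key=lambda k: overlap(templates[k]) / (beta + sum(templates[k])))
        # Top-down match: intersection of the input with the chosen template.
        matched = [i & w for i, w in zip(input_pattern, templates[j])]
        if norm_input > 0 and sum(matched) / norm_input >= rho:
            templates[j] = matched  # resonance: template learns the match
            return j
        reset.add(j)  # mismatch: enduring reset, search continues
    # Every existing category was reset, so a new category is established.
    templates.append(list(input_pattern))
    return len(templates) - 1


templates = []
print(art1_search([1, 1, 0, 0], templates, rho=0.7))  # first input founds a category
print(art1_search([1, 1, 1, 0], templates, rho=0.7))  # mismatch, reset, new category
print(art1_search([1, 1, 0, 0], templates, rho=0.7))  # matches its original category
```

The set of reset indices stands in for the enduring shut-off of a rejected category representation; it is cleared on each new call, just as F.sub.2 is prepared to encode the next input without bias.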
In earlier treatments (e.g., Carpenter and Grossberg, 1987a), we proposed that the enduring shut-off of erroneous category representations by a nonspecific reset signal could occur at F.sub.2 if F.sub.2 were organized as a gated dipole field, whose dynamics depend on depletable transmitter gates. Though the new search process does not use a gated dipole field, it does retain and extend the core idea that transmitter dynamics can enable a robust search process when appropriately embedded in an ART system.
FIG. 3 shows the principal elements of a typical ART 2 module. It shares many characteristics of the ART 1 module, having both an input representation field F.sub.1 and a category representation field F.sub.2, as well as Attentional and Orienting Subsystems. FIG. 3 also illustrates one of the main differences between the examples of ART 1 and ART 2 modules so far explicitly developed; namely, the ART 2 examples all have three processing layers within the F.sub.1 field. These three processing layers allow the ART 2 system to stably categorize sequences of analog input patterns that can, in general, be arbitrarily close to one another. Unlike in models such as back propagation, this category learning process is stable even in the fast learning situation, in which the LTM variables are allowed to go to equilibrium on each learning trial.
In FIG. 3, one F.sub.1 layer w,x reads in the bottom-up input, one layer p,q reads in the top-down filtered input from F.sub.2, and a middle layer v,u matches patterns from the top and bottom layers before sending a composite pattern back through the F.sub.1 feedback loop. Each of the nodes i of F.sub.1 in ART 1 is now replaced by a set of nodes w.sub.i, x.sub.i, v.sub.i, u.sub.i, p.sub.i and q.sub.i. Like sets of those nodes span F.sub.1 for processing of elements of the incoming pattern from an input stage I. Each of the large circles of FIG. 3 represents the computation of the L.sub.2 -norm of all the signals of a particular subfield, such as all of the signals w.sub.i across the F.sub.1 field. Each of the smaller circles denotes a computation to generate each of the subfield signals. Both F.sub.1 and F.sub.2 are shunting competitive networks that contrast-enhance and normalize their activation patterns (Grossberg, 1982a).
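The L.sub.2 -norm computation denoted by each large circle can be sketched in Python as follows. The text above states only that the norm of a subfield is computed; dividing each signal by that norm to obtain a normalized subfield, and the small constant eps that guards against division by zero, are assumptions made for illustration, as are the function names.

```python
import math


def l2_norm(signals):
    """L2-norm of all the signals of one subfield, e.g. all w_i across F1."""
    return math.sqrt(sum(s * s for s in signals))


def normalize(signals, eps=1e-9):
    """Scale a subfield by its L2-norm.

    Assumption: the norm is used as a divisor, with a hypothetical small
    constant eps so that an all-zero subfield does not divide by zero.
    """
    norm = l2_norm(signals)
    return [s / (eps + norm) for s in signals]


w = [3.0, 4.0]
print(l2_norm(w))    # 5.0
print(normalize(w))  # approximately [0.6, 0.8]
```

Under this assumption, two analog patterns that differ only in overall amplitude map to nearly the same normalized subfield, so they are compared by shape rather than by intensity.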