1. Field of the Invention
The present invention relates to a parallel associative memory. Such a parallel associative memory can be used in systems for data retrieval or for pattern recognition by selecting one or more stored patterns that best match an input pattern. In particular, the present invention relates to a memory which stores a large number of patterns and carries out the selection operation in a parallel manner. The parallel nature of the memory increases the operational speed and efficiency of the memory. The increased speed broadens the field of the memory's application.
2. Description of Related Art
In an associative memory, a data pattern is compared to a plurality of stored patterns and an exact match is sought. The pattern acts as an address to an additional portion of the memory where information relating to the stored pattern is stored. The result is the retrieval of data associated with the data pattern.
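The exact-match behavior described above can be illustrated in software. The following is a minimal sketch only, assuming patterns are short bit sequences and the associated information is a text string; the class name `ExactMatchMemory` and its methods are hypothetical and do not represent any particular prior device, which would perform the comparison in hardware.

```python
from typing import Dict, Optional, Sequence, Tuple


class ExactMatchMemory:
    """Toy associative store: the pattern itself serves as the address."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[int, ...], str] = {}

    def store(self, pattern: Sequence[int], data: str) -> None:
        # The stored pattern becomes the key for its associated data.
        self._store[tuple(pattern)] = data

    def recall(self, pattern: Sequence[int]) -> Optional[str]:
        # Only an exact match retrieves data; any other input returns None.
        return self._store.get(tuple(pattern))


memory = ExactMatchMemory()
memory.store([1, 0, 1, 1], "record for pattern 1011")
print(memory.recall([1, 0, 1, 1]))  # exact match: data is retrieved
print(memory.recall([1, 0, 1, 0]))  # one bit differs: nothing is retrieved
```

The second lookup fails even though the input differs from a stored pattern in only one position, which is the limitation that "best" match schemes are intended to address.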
In pattern recognition systems, a plurality of known patterns is stored in a storage medium, and unidentified observed patterns are compared to the stored known patterns. If an unidentified observed pattern matches one of the stored patterns, an identification occurs. If there is no exact match, the recognition system is usually required to find the "best" match for the unidentified observed pattern from among the stored patterns, typically by employing similarity measures or distance measures. Identifying the observed pattern by an exact match or by selecting a "best" match associates the observed pattern with the label or class identification of the matching stored pattern.
Associative memories and pattern recognition systems have a number of potential applications. A first potential application is module diagnosis, where it is desired to determine the particular fault that is associated with a particular failure of a board of a module. A second potential application is optical character recognition, where one desires to quickly recognize particular characters from existing character fonts so that an entire page of text can be input through a scanner and later edited. A third potential application is speech recognition, where one desires to quickly associate a sound pattern with particular phonemes in a manner that is independent of the speaker of the sound pattern. A fourth potential application is in chemical or biological laboratory situations, where it may be desired, for example, to identify a chemical from particular measurements or to identify a virus from blood or tissue samples. A fifth potential application is in database systems, where it is desired to recall useful information even when a recall key is in error. This list of potential applications for a recognition system or an associative memory is not exhaustive.
Prior associative memories and pattern recognition systems consist of either general purpose processors, which are slow when used for pattern recognition, or very fast special purpose processors which are only capable of use in a narrow set of applications. Moreover, associative memories generally are designed for fixed size patterns, perform exact matching, have limited storage capacity and are fairly expensive.
In many of the above-identified applications there are many possible patterns which can be used to represent a particular classification with which an observed pattern may be associated. The number of known patterns necessary for identification of an unidentified pattern determines the storage requirement of the system or memory. In order to reduce the potential storage requirement, it is necessary to attempt to determine one or more "typical" patterns that are representative of the available classifications. This can be done using parametric statistical techniques (if large samples of representative patterns have been collected) or by non-parametric approaches (applicable when only small samples of representative patterns are available). The procedures that attempt to find "typical" patterns are really procedures that divide or partition the known patterns into groups or classes. The partitioning process has been studied extensively in the literature under various topics including Pattern Recognition, Cluster Analysis, Decision Theory, Neural Networks, and Parametric and Non-Parametric Statistics.
Once "typical" patterns are selected, these "typical" patterns can represent groups or classes of other similar patterns. Ideally classes are disjoint. The ability to label a group of patterns is an important task in associative memory and pattern recognition applications. As an example, a pattern which was observed but not previously identified as belonging to a particular class and is similar to a group of already stored patterns can be identified as belonging to that particular group or class of patterns.
The process of identifying an observed pattern or retrieving data associated with a pattern and associating it with a particular class or group of patterns can be broken down into two stages, a first stage called the storage or partition stage and a second stage called the recall or recognition/classification stage.
In the storage or partitioning stage, data is collected by obtaining observed patterns. The data is manipulated to get pattern features into a desired form for pattern recognition. Then, "typical" patterns are selected in accordance with a particular partitioning algorithm. This algorithm may be as simple as using the first pattern in each pattern class or group as the "typical" pattern or it may be as complex as using an averaging procedure for identifying a "typical" pattern from the plurality of patterns associated with a given class or group of patterns. Once the "typical" patterns are selected the second stage of the identification process can be initiated.
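The two partitioning extremes mentioned above, taking the first pattern of each class or averaging the patterns of each class, can be sketched as follows. This is an illustrative sketch only, assuming each observation is a `(label, pattern)` pair with numeric features of equal length; the function names are hypothetical.

```python
def first_pattern_prototypes(samples):
    """Use the first pattern seen for each class as its "typical" pattern."""
    prototypes = {}
    for label, pattern in samples:
        # setdefault keeps the earliest pattern and ignores later ones.
        prototypes.setdefault(label, list(pattern))
    return prototypes


def mean_prototypes(samples):
    """Use the feature-wise average of each class as its "typical" pattern."""
    sums = {}
    counts = {}
    for label, pattern in samples:
        if label not in sums:
            sums[label] = [0.0] * len(pattern)
            counts[label] = 0
        for i, value in enumerate(pattern):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [total / counts[label] for total in feature_sums]
        for label, feature_sums in sums.items()
    }


samples = [("A", [0, 0]), ("A", [2, 4]), ("B", [1, 1])]
print(first_pattern_prototypes(samples))
print(mean_prototypes(samples))
```

Either function reduces the storage requirement to one pattern per class; the averaging variant costs more computation in the storage stage but uses every collected sample.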
In the recall stage (for data retrieval) or recognition/classification stage, an unidentified observed pattern is obtained. This unidentified observed pattern is then compared against the "typical" patterns which have been produced in the storage or partitioning stage. Based upon the comparison of the unidentified observed pattern to these "typical" patterns, a nearest pattern class to which this observed pattern belongs or the k nearest pattern classes can be identified.
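The recall or recognition/classification stage can be sketched as a ranking of the "typical" patterns by their distance to the observed pattern. The sketch below is illustrative only: it assumes binary patterns of equal length, uses Hamming distance as the measure, and compares the patterns one after another rather than in parallel; the function names are hypothetical.

```python
def hamming(a, b):
    """Number of positions in which two equal-length patterns differ."""
    return sum(x != y for x, y in zip(a, b))


def k_nearest_classes(observed, prototypes, k=1):
    """Return the k (label, distance) pairs closest to the observed pattern."""
    scored = [(label, hamming(observed, p)) for label, p in prototypes.items()]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:k]


prototypes = {
    "A": [0, 0, 0, 0, 0],
    "B": [1, 1, 1, 1, 1],
    "C": [1, 1, 0, 0, 0],
}
observed = [1, 1, 1, 1, 0]
print(k_nearest_classes(observed, prototypes, k=1))
print(k_nearest_classes(observed, prototypes, k=2))
```

With k=1 the sketch returns the single nearest class; larger values of k return a ranked short list of candidate classes.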
There are a large number of procedures for selecting "typical" patterns and for partitioning the pattern space. A large number of these procedures use a Euclidean measure of distance, which can be shown to be similar to a Hamming distance, that is, the number of positions in which two patterns differ. Hamming distance is a common distance metric in the field of pattern recognition.
The problem with most of these pattern recognition systems and associative memories is the extent of the computational efforts involved in identifying an observed pattern. The most significant aspect of these computational efforts involves the amount of effort required for obtaining the distance measurements. The computational efforts can be broken down into three major areas that consume time and effort. The first major area is obtaining the "typical" patterns representative of the various classes or groups of patterns. The second major area is the partitioning of the typical patterns for storage. The third major area is the actual recalling or recognition process.
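One way to see the relationship between the two measures is the special case of binary patterns, where each differing position contributes exactly 1 to the sum of squared differences, so the squared Euclidean distance equals the Hamming distance. The following sketch demonstrates only this special case; the restriction to binary patterns and the function names are assumptions made for illustration.

```python
def hamming_distance(a, b):
    """Count the positions in which two equal-length patterns differ."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same length")
    return sum(x != y for x, y in zip(a, b))


def squared_euclidean_distance(a, b):
    """Sum of squared feature differences (no square root taken)."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


a = [1, 0, 1, 1, 0]
b = [0, 0, 1, 0, 1]
# For 0/1 patterns, (x - y) ** 2 is 1 where the bits differ and 0 elsewhere.
print(hamming_distance(a, b), squared_euclidean_distance(a, b))
```

Because the square root is monotonic, ranking binary patterns by Euclidean distance and ranking them by Hamming distance give the same order.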
An example of a prior pattern recognition system is disclosed in U.S. Pat. No. 4,326,259 entitled "Self Organizing General Pattern Class Separator and Identifier" by Leon Cooper. This system has limited capabilities as its implementation in hardware would be limited to a particular adaptive recognition algorithm which the system supports.
A threshold or radius of attraction has been used for deciding whether an observed pattern is a member of a predefined class. G. S. Sebestyen published adaptive pattern recognition algorithms in 1962, and the algorithms used the concept of class thresholds. See G. S. Sebestyen, Decision-Making Processes in Pattern Recognition, The Macmillan Company, New York (1962).
In chapter 4 of Sebestyen's book, he describes several approximate and adaptive techniques of classification which are said to inherently result in rapid processing and updating of input information. A simple algorithm for constructing decision regions is introduced in which the region is constructed by storing representative samples of the classes. Classification of a new input is based on distance to the nearest stored sample. An input is said to belong to class A because it is closer to at least one typical member of class A than to any of the typical members of class B. In a refinement of this method described on pages 97-98, a finite number of given samples are used in a decision procedure that approximates a likelihood ratio computation, since it bases decisions on a local majority rule; an input is said to belong to A if, within a radius r, more A than B type samples were observed in the past.
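The two decision rules summarized above, classification by the nearest stored sample and classification by a local majority within a radius r, can be sketched as follows. This is a loose illustration under stated assumptions, not a reproduction of Sebestyen's algorithms: the samples are assumed to be `(label, point)` pairs, Euclidean distance is assumed as the measure, and a tie or an empty neighborhood is assumed to yield no decision.

```python
import math
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_sample_class(x, samples):
    """Assign x to the class of the single closest stored sample."""
    label, _ = min(samples, key=lambda s: euclidean(x, s[1]))
    return label


def local_majority_class(x, samples, r):
    """Assign x to the class with the most stored samples within radius r."""
    votes = Counter(label for label, p in samples if euclidean(x, p) <= r)
    ranked = votes.most_common(2)
    if not ranked:
        return None  # no stored sample lies within the radius
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tie between the two leading classes
    return ranked[0][0]


samples = [
    ("A", (0.0, 0.0)), ("A", (1.0, 0.0)), ("A", (0.0, 1.0)),
    ("B", (0.4, 0.4)), ("B", (5.0, 5.0)),
]
x = (0.5, 0.5)
print(nearest_sample_class(x, samples))
print(local_majority_class(x, samples, r=1.0))
```

In this example the two rules disagree: the single closest sample belongs to class B, while three of the four samples within the radius belong to class A.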
Sebestyen further describes a number of machines that represent and store knowledge about classes by selectively sampling the set of known members. The selected samples are distributed in the same way as the set of known members and they cover the same region of the vector space. Classificatory decisions are based on the distance of the input to the closest selected sample. It is said that this procedure is inherently fast and can be readily implemented by a machine that performs sequential comparisons with stored samples. An adaptive technique of performing selective sampling is described that uses input information as it occurs and, as it is influenced by each new sample, updates the stored information concerning the nature of the decision regions. A cascaded or multilayered machine is also described in which learned concepts are introduced as new dimensions of the pattern space.
On pages 119-121 Sebestyen discloses that an often used procedure in pattern recognition is to correlate an input vector to be classified with each of several stored references that represent the different classes to which the input may belong. The stored references are often the means of the set of samples of the different classes. Decisions are made by comparing the correlation between the input and each of the stored references and by deciding that the input is a member of the class corresponding to the largest correlation coefficient. Another method of making classifications is based on a comparison of the Euclidean distance between the input and the stored references. Although decisions based on maximum correlation and minimum Euclidean distance are very similar, the similarity ceases when thresholds are used to avoid decision making in uncertain cases.
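The difference between the two decision methods once thresholds are introduced can be illustrated with a small sketch. The sketch assumes that "correlation" means the Pearson correlation coefficient and that the stored references are non-constant vectors; both are assumptions made for illustration, as are the function names and the threshold parameters `min_corr` and `max_dist`.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two non-constant vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_by_correlation(x, refs, min_corr=None):
    """Pick the class with the largest correlation; reject if below min_corr."""
    label = max(refs, key=lambda name: pearson(x, refs[name]))
    if min_corr is not None and pearson(x, refs[label]) < min_corr:
        return None
    return label


def classify_by_distance(x, refs, max_dist=None):
    """Pick the class with the smallest distance; reject if above max_dist."""
    label = min(refs, key=lambda name: euclidean(x, refs[name]))
    if max_dist is not None and euclidean(x, refs[label]) > max_dist:
        return None
    return label


refs = {"A": [1, 2, 3, 4], "B": [4, 3, 2, 1]}
x = [10, 20, 30, 40]  # same shape as reference A, but ten times larger
print(classify_by_correlation(x, refs), classify_by_distance(x, refs))
print(classify_by_correlation(x, refs, min_corr=0.9))
print(classify_by_distance(x, refs, max_dist=5.0))
```

Without thresholds both methods select class A. With thresholds, the correlation method still accepts the input because correlation ignores scale, while the distance method rejects it because the input lies far from every stored reference.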
The prior pattern recognition systems fail to provide the combined advantages of speed and flexibility.