The present invention relates to pattern recognition and, in particular, to a pattern recognition method and apparatus that use partitioned categories to provide improved recognition accuracy.
Pattern recognition is the process by which a handwritten or printed pattern is converted to a pattern signal representing the pattern, a determination is made of what pattern the pattern signal represents, and a code indicating the determined pattern is generated. For example, a pattern representing an unknown character, such as the letter "A," may be scanned by an electronic scanner. The scanner generates a pattern signal composed of an array of a few thousand bytes that represents a bit map of the unknown character. The pattern signal is then analyzed to determine the identity of the unknown character represented by the pattern and to generate a code identifying the character. For example, the ASCII code 65 representing the letter A may be generated.
Most pattern recognition devices can only recognize pattern signals representing patterns that are members of a predetermined pattern set. For example, a simple pattern recognition device may only be capable of recognizing pattern signals representing the numerals 0 through 9. A more complex pattern recognition device may be capable of recognizing pattern signals representing the letters and numerals that are members of a pattern set corresponding to the ASCII character set. Pattern recognition devices typically include a recognition dictionary divided into categories, one for each of the patterns in the pattern set that the device is capable of recognizing. The recognition dictionary stores a typical pattern signal for each category. For example, the above-mentioned simple pattern recognition device may include a recognition dictionary divided into ten categories, one for each of the numerals 0 through 9, with a typical pattern signal stored for the numeral corresponding to each category. To recognize the pattern represented by an unknown pattern signal, the pattern recognition device performs a matching operation between the unknown pattern signal and the pattern signals stored in the recognition dictionary. The matching operation determines which of the pattern signals stored in the recognition dictionary is closest to the unknown pattern signal. The unknown pattern signal is then deemed to represent the pattern corresponding to the category whose typical pattern signal is closest to the unknown pattern signal.
Many pattern recognition devices reduce the complexity of the matching operation by expressing the unknown pattern signal and the typical pattern signals stored in the recognition dictionary as vectors. To prevent confusion, a vector extracted from an unknown pattern signal will be called a feature vector, and a vector representing a pattern signal stored in the recognition dictionary will be called a reference vector.
In many pattern recognition devices, the matching operation involves determining the distance between the unknown pattern signal and the typical pattern signal of each of the categories by matching the feature vector extracted from the unknown pattern signal with the reference vector of each category stored in the recognition dictionary. The unknown pattern signal is deemed to belong to the category for which the distance is the smallest. Known functions for representing the distance between two vectors include the Euclidean distance, the weighted Euclidean distance, and the Mahalanobis distance.
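The matching operation described above can be illustrated with a minimal Python sketch. It assumes that feature vectors and reference vectors are plain lists of numbers, that the weighting vector and the inverse covariance matrix are supplied by the caller, and that squared distances are compared (which does not change which category is nearest). The function names and the toy dictionary are hypothetical and are not taken from any particular device.

```python
def squared_euclidean(x, r):
    """Squared Euclidean distance between feature vector x and reference vector r."""
    return sum((xi - ri) ** 2 for xi, ri in zip(x, r))


def weighted_squared_euclidean(x, r, w):
    """Squared Euclidean distance with one weight per component (weighting vector w)."""
    return sum(wi * (xi - ri) ** 2 for xi, ri, wi in zip(x, r, w))


def squared_mahalanobis(x, r, inv_cov):
    """Squared Mahalanobis distance; inv_cov is the inverse covariance matrix (list of rows)."""
    d = [xi - ri for xi, ri in zip(x, r)]
    size = len(d)
    return sum(d[i] * inv_cov[i][j] * d[j] for i in range(size) for j in range(size))


def nearest_category(x, dictionary, distance=squared_euclidean):
    """Return the category whose reference vector has the smallest distance to x."""
    return min(dictionary, key=lambda category: distance(x, dictionary[category]))


# Toy recognition dictionary: one reference vector per category (illustrative values).
toy_dictionary = {"0": [0.0, 0.0], "1": [1.0, 1.0]}
closest = nearest_category([0.9, 0.8], toy_dictionary)
```

Note that the Mahalanobis form needs a full matrix per category, while the weighted Euclidean form needs only one weight per component, which is consistent with the difference in memory and processing cost between the two.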
To improve the accuracy of the pattern recognition operation performed by a pattern recognition device, the pattern recognition device may be trained using known patterns of the same style as the unknown pattern. For example, a pattern recognition device for handwriting recognition may require the writer to write a set of known patterns or characters for training purposes. Alternatively, training may be performed using handwritten training pattern sets written by many writers.
One way of training the distance function is called Learning by Discriminant Analysis (LDA), and is described by the inventor in "Handprinted Numeral Recognition by Learning Distance Function," IEICE TRANS. D-II, vol. J76-D-II, no. 9, pp. 1851-59 (1993) (in Japanese). LDA uses a weighted Euclidean distance or a quadratic discriminant function as an original distance function and superposes a discriminant function onto the original distance function. In this way it trains parameters of the distance function such as the reference vector, the weighting vector, and the constant term. The discriminant function is determined by applying linear discriminant analysis between the pattern set of each category and a rival pattern set for the category. The rival pattern set for each category is composed of patterns that are mis-recognized as belonging to the category when the original distance function is used.
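The definition of the rival pattern set lends itself to a short sketch. The Python below only illustrates how rival pattern sets could be collected from labeled training vectors; it does not implement LDA itself. The names `collect_rival_sets` and `toy_classify`, and the one-dimensional toy data, are hypothetical.

```python
def collect_rival_sets(samples, classify):
    """Group mis-recognized training vectors by the category they were wrongly assigned to.

    samples  -- list of (feature_vector, true_category) pairs
    classify -- callable that returns the category chosen by the original distance function
    """
    rivals = {}
    for x, true_category in samples:
        predicted = classify(x)
        if predicted != true_category:
            # x is a rival pattern for the category it was mistaken for.
            rivals.setdefault(predicted, []).append(x)
    return rivals


# Toy original classifier: nearest reference vector under squared Euclidean distance.
toy_references = {"a": [0.0], "b": [1.0]}


def toy_classify(x):
    def dist(r):
        return sum((xi - ri) ** 2 for xi, ri in zip(x, r))
    return min(toy_references, key=lambda c: dist(toy_references[c]))


toy_samples = [([0.1], "a"), ([0.6], "a"), ([0.9], "b")]
rival_sets = collect_rival_sets(toy_samples, toy_classify)
```

In this toy data, the vector `[0.6]` belongs to category "a" but lies closer to the reference vector of "b", so it becomes a member of the rival pattern set of "b".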
A feature of LDA is that it uses not only the linear terms of the feature vector but also its quadratic terms as the linear terms of a linear discriminant. Test results show that this type of training provides a marked improvement in recognition accuracy. Using the weighted Euclidean distance, a recognition accuracy comparable to that obtained using the Mahalanobis distance function or a quadratic discriminant function was achieved. The Mahalanobis distance function and the quadratic discriminant function both require a great deal of processing and memory to implement, whereas the Euclidean distance requires little memory and has a short processing time.
Satisfactory as present results are, the recognition accuracy of pattern recognition devices using LDA is not perfect. Improving the recognition accuracy of pattern recognition devices that use LDA involves increasing the discrimination ability of the discriminant function to separate a given category pattern set from its rival pattern set. The discriminant function is expressed by a quadratic equation of each component of the feature vector in which the products between the components are omitted. Consequently, the boundary between the given category pattern set and its rival pattern set, that is, the locus where the value of the discriminant function is 0, lies parallel to the feature axis and lies in a quadratic plane symmetric with respect to the center, as shown in FIG. 1.
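The symmetry of the boundary follows from the form of the discriminant function. As a sketch, assume an M-dimensional feature vector with components x_m and hypothetical coefficients a_m, b_m and c (these symbols do not appear in the source), with every a_m nonzero. A quadratic without cross-products can then be rewritten by completing the square in each component:

```latex
F(x) = \sum_{m=1}^{M} \left( a_m x_m^2 + b_m x_m \right) + c
     = \sum_{m=1}^{M} a_m \left( x_m - \mu_m \right)^2 + c',
\qquad
\mu_m = -\frac{b_m}{2 a_m},
\qquad
c' = c - \sum_{m=1}^{M} \frac{b_m^2}{4 a_m}.
```

Because each term depends only on the squared offset of x_m from \mu_m, the locus F(x) = 0 is unchanged when any component is reflected about \mu_m. Under these assumptions the boundary is therefore aligned with the feature axes and symmetric about the center (\mu_1, ..., \mu_M), which is why it cannot follow an asymmetric distribution.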
It can be seen from FIG. 1 that, although the category pattern set 1 of the category lies safely within the boundary 2 defined by the quadratic function, the spacing between the boundary and the perimeter of the category pattern set is such that the boundary encompasses small parts of the rival pattern sets 3 of the category. The discriminant function has a positive value outside the boundary, a negative value inside the boundary, and a value of zero at the boundary. The polarity of the discriminant function inside and outside the boundary 2 is indicated by the circled − and the circled +, respectively, in FIG. 1. A pattern belonging to the part of the rival pattern set encompassed by the boundary will have a smaller distance from the category after the discriminant function is superposed onto the original function than the distance obtained using the original function. The pattern may therefore be incorrectly recognized as belonging to the category.
In pattern recognition, the category pattern set of each category often has an asymmetric distribution in feature space. Moreover, since the patterns that belong to the rival pattern set of a category usually belong to many categories, the rival patterns are considered to enclose the asymmetric pattern set of the given category in feature space. When the category pattern set and the rival pattern set of a given category have the distribution described above, completely separating these sets by a symmetric quadratic plane is difficult, and unknown patterns located near the periphery of the category pattern set are easily mis-recognized.
Accordingly, an inability to adequately handle the asymmetry of the pattern distribution imposes limitations on improving the recognition accuracy. Therefore, what is needed is an apparatus and method that can handle asymmetric pattern distributions so that a high recognition accuracy can be attained notwithstanding asymmetric pattern distributions.
The invention provides a pattern recognition method that determines the category of an unknown pattern. The category is one of a set of categories corresponding to a set of known patterns. In the method, a subcategory-level recognition dictionary is provided that stores reference information for each one of plural subcategories obtained by partitioning the categories constituting the category set. A pattern signal representing the unknown pattern is received and is processed to extract a feature vector from it. The reference information of one subcategory of each category is selected from the recognition dictionary in response to the feature vector. Finally, a distance between the feature vector and the reference information of the subcategory selected for each category in the selecting step is determined, and the category of the unknown pattern is determined from these distances.
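The flow of the method can be sketched in Python under several assumptions that are not specified in this summary: the subcategory is selected by comparing projections of the feature vector onto per-category axes with thresholds, the reference information of a subcategory consists of a reference vector, a weighting vector and a constant term, and the distance is a weighted squared Euclidean distance plus the constant. All names and numeric values are hypothetical.

```python
def select_subcategory(x, axes, thresholds):
    """Hypothetical selector: bit k of the index is set when x projected on axis k exceeds threshold k."""
    index = 0
    for k, (axis, threshold) in enumerate(zip(axes, thresholds)):
        projection = sum(a * xi for a, xi in zip(axis, x))
        if projection > threshold:
            index |= 1 << k
    return index


def distance_to(x, info):
    """Weighted squared Euclidean distance to a subcategory's reference vector, plus its constant term."""
    terms = zip(x, info["reference"], info["weights"])
    return sum(w * (xi - r) ** 2 for xi, r, w in terms) + info["constant"]


def recognize(x, dictionary):
    """Pick one subcategory per category, then return the category with the smallest distance."""
    best_category, best_distance = None, float("inf")
    for category, entry in dictionary.items():
        sub = select_subcategory(x, entry["axes"], entry["thresholds"])
        d = distance_to(x, entry["subcategories"][sub])
        if d < best_distance:
            best_category, best_distance = category, d
    return best_category


def make_info(reference):
    # Illustrative reference information with unit weights and a zero constant term.
    return {"reference": reference, "weights": [1.0, 1.0], "constant": 0.0}


# Toy dictionary: two categories, each split into two subcategories along the first axis.
toy_dictionary = {
    "A": {"axes": [[1.0, 0.0]], "thresholds": [0.0],
          "subcategories": [make_info([-1.0, 0.0]), make_info([1.0, 0.0])]},
    "B": {"axes": [[1.0, 0.0]], "thresholds": [0.0],
          "subcategories": [make_info([-1.0, 3.0]), make_info([1.0, 3.0])]},
}
```

With this toy dictionary, `recognize([0.8, 0.5], toy_dictionary)` compares the input only with the right-hand subcategory of each category, so only one distance per category is evaluated even though each category stores two sets of reference information.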
The subcategory-level recognition dictionary is generated in response to feature vectors extracted from training pattern signals representing respective training patterns. The recognition dictionary stores reference information for each one of plural categories constituting a category set. In the method, each category is partitioned into 2^n subcategories, where n is an integer greater than zero, and learning by discriminant analysis is applied to each subcategory to generate reference information for the subcategory.
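One way to picture the partitioning step is to apply n binary tests to each training vector of a category: the n resulting bits form an index, so the vectors fall into at most 2^n groups. The Python sketch below assumes that each test compares a projection of the feature vector onto a hypothetical axis with a threshold; this summary does not say how the partition is defined. The per-subcategory mean is only a stand-in starting point for a reference vector and is not learning by discriminant analysis.

```python
def subcategory_index(x, axes, thresholds):
    """Build an n-bit index: bit k is 1 when the projection of x on axis k exceeds threshold k."""
    index = 0
    for k, (axis, threshold) in enumerate(zip(axes, thresholds)):
        if sum(a * xi for a, xi in zip(axis, x)) > threshold:
            index |= 1 << k
    return index


def partition_category(vectors, axes, thresholds):
    """Split the training vectors of one category into 2**n lists, where n = len(axes)."""
    subcategories = [[] for _ in range(2 ** len(axes))]
    for x in vectors:
        subcategories[subcategory_index(x, axes, thresholds)].append(x)
    return subcategories


def mean_vector(vectors):
    """Component-wise mean, used here only as a placeholder reference vector (not LDA)."""
    if not vectors:
        return None
    count = len(vectors)
    return [sum(component) / count for component in zip(*vectors)]


# Toy category with n = 2 binary tests, hence 2**2 = 4 subcategories.
toy_vectors = [[1, 1], [-1, 1], [1, -1], [-1, -1], [2, 2]]
toy_axes = [[1, 0], [0, 1]]
toy_thresholds = [0, 0]
toy_subcategories = partition_category(toy_vectors, toy_axes, toy_thresholds)
toy_references = [mean_vector(group) for group in toy_subcategories]
```

In a full dictionary generator, the placeholder mean would be replaced by whatever training procedure produces the reference information for each subcategory.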
The invention further provides a pattern recognition apparatus that determines a category of an unknown pattern represented by a feature vector extracted from a pattern signal. The category is one of a set of categories corresponding to a set of known patterns. The apparatus comprises a subcategory-level recognition dictionary, a selecting module, a distance determining module and a module that determines the category of the unknown pattern. The subcategory-level recognition dictionary stores reference information for each one of plural subcategories obtained by partitioning the categories constituting the category set. The selecting module selects from the recognition dictionary the reference information of one subcategory of each category in response to the feature vector. The distance determining module determines a distance between the feature vector and the reference information of the subcategory selected by the selecting module for each of the categories. The module that determines the category of the unknown pattern makes this determination from the distance generated by the distance determining module for each of the categories.
Finally, the invention provides an apparatus for generating a subcategory-level recognition dictionary from feature vectors extracted from training pattern signals representing training patterns to generate reference information for each of plural categories constituting a category set. The apparatus comprises a partitioning section that partitions each category into 2^n subcategories, where n is an integer greater than zero, and a section that applies learning by discriminant analysis to each subcategory defined by the partitioning section to generate the reference information for the subcategory.