The present invention relates to pattern recognition using a dictionary, as applied to various types of pattern recognition including character reading. More particularly, it relates to a method of composing a dictionary for use when a distance function is employed as the discriminant function, and to a pattern recognition method using that dictionary.
A brief description will be given of the most commonly used procedure for recognizing handprinted or handwritten characters, which is a typical example of pattern recognition. For each class (also referred to as a category) of characters likely to be used (0 through 9 when the characters to be recognized are numerals, for example), a number of handprinted characters are collected as learning character patterns (also called training patterns or learning samples). Features of the respective training patterns in each class are extracted, and each pattern is expressed as an M-dimensional feature vector x=(x.sub.1, . . . , x.sub.M). Next, the means .mu..sub.1, .mu..sub.2, . . . , .mu..sub.M of the corresponding components x.sub.1, x.sub.2, . . . , x.sub.M are calculated, for example, over the feature vectors of all training patterns in each class, and the average vector .mu.=(.mu..sub.1, . . . , .mu..sub.M) thus obtained is used as the reference pattern vector of that class. In this way, a reference pattern vector is predetermined for every class of characters.
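The averaging step described above can be sketched as follows. This is only a minimal illustration in plain Python; the function name reference_vectors and the toy two-dimensional training data are assumptions made for the example and do not appear in the procedure itself.

```python
from collections import defaultdict


def reference_vectors(samples):
    """Average the feature vectors of each class component by component.

    samples: iterable of (class_label, feature_vector) pairs.
    Returns a dict mapping each class label to its mean vector.
    """
    sums = {}
    counts = defaultdict(int)
    for label, x in samples:
        if label not in sums:
            sums[label] = [0.0] * len(x)
        for m, value in enumerate(x):
            sums[label][m] += value
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


# Toy training patterns (hypothetical values, M = 2).
training = [
    ("0", [1.0, 2.0]),
    ("0", [3.0, 4.0]),
    ("1", [10.0, 0.0]),
]
refs = reference_vectors(training)
```

Each entry of refs plays the role of the reference pattern vector .mu. for one class.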
To recognize an arbitrary handwritten character pattern, its feature vector x=(x.sub.1, . . . , x.sub.M) is obtained first. The distance between the feature vector x and the reference pattern vector .mu. of every class to be recognized is then calculated using a distance function as the discriminant function, and the character of the class whose reference pattern vector is closest to the feature vector x is selected and output as the recognition result for the input pattern. A variety of parameters have been proposed to express features of character patterns, but as long as the features are represented by vectors, the principles of the present invention are independent of the kinds of feature parameters used and of the way they are determined. The present invention instead concerns how the distance function, which defines the distance between the feature vector of the input pattern and the reference pattern vector, should be modified to increase the character recognition accuracy.
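As an illustration of this recognition step, the sketch below selects the class whose reference pattern vector is nearest to the input feature vector. The squared Euclidean distance is used only as one possible distance function, and the names squared_euclidean and classify as well as the reference values are assumptions made for the example.

```python
def squared_euclidean(x, mu):
    """Squared Euclidean distance between feature vector x and reference mu."""
    return sum((xm - mum) ** 2 for xm, mum in zip(x, mu))


def classify(x, references, distance=squared_euclidean):
    """Return the label of the class whose reference vector is closest to x."""
    return min(references, key=lambda label: distance(x, references[label]))


# Hypothetical reference pattern vectors for two classes (M = 2).
references = {"0": [2.0, 3.0], "1": [10.0, 0.0]}
result = classify([9.0, 1.0], references)
```

Because the distance function is passed in as a parameter, any of the distance functions discussed in this background could be substituted without changing the selection logic.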
Character recognition uses a distance function as the discriminant function in many cases. Known distance functions include, for instance, the Euclidean distance, the weighted Euclidean distance, the quadratic discriminant function (or Bayesian discriminant function), the modified quadratic discriminant function and the projected distance. To provide increased recognition accuracy when a distance function is used, it is customary to represent faithfully the distribution of the features (x.sub.1, . . . , x.sub.M) in each class of the characters to be read. In contrast to the Euclidean distance, the weighted Euclidean distance uses as the weight for each feature component the inverse of the variance of that component of the feature vectors within the respective class. The weighted Euclidean distance therefore provides higher recognition accuracy than the Euclidean distance. The quadratic discriminant function, the modified quadratic discriminant function and the projected distance use the covariance matrix of the features in each class, together with its eigenvectors and eigenvalues, for discrimination, and hence provide high accuracy even if the features are correlated. However, these methods, which all rest on representing the feature distribution within each class, have technical limitations, and further improvement in accuracy cannot be expected from them.
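The weighted Euclidean distance mentioned above can be sketched as follows. The function names, the use of the population variance (division by the number of samples) and the small floor that guards against a zero variance are all assumptions made for this illustration, not details taken from the description.

```python
def class_statistics(vectors, floor=1e-6):
    """Return (mean, weights) for one class.

    weights[m] is the inverse of the variance of component m; the variance
    is clamped to `floor` (an assumed safeguard) to avoid division by zero.
    """
    n = len(vectors)
    dim = len(vectors[0])
    mean = [sum(v[m] for v in vectors) / n for m in range(dim)]
    var = [sum((v[m] - mean[m]) ** 2 for v in vectors) / n for m in range(dim)]
    weights = [1.0 / max(s, floor) for s in var]
    return mean, weights


def weighted_euclidean(x, mean, weights):
    """Sum over components of weight * squared difference from the mean."""
    return sum(w * (xm - mum) ** 2 for xm, mum, w in zip(x, mean, weights))


# Toy class whose second component varies far more than the first.
vectors = [[1.0, 10.0], [3.0, 30.0]]
mean, weights = class_statistics(vectors)
d = weighted_euclidean([3.0, 30.0], mean, weights)
```

In this toy class the second component has a variance one hundred times that of the first, so its deviations are down-weighted accordingly and both components contribute equally to the distance.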
Another important approach to increasing the accuracy of character recognition is to emphasize the differences between each class and the others. One method that has been proposed to implement this is to modify the distance between the input pattern and each class, obtained when the input pattern is discriminated using the distance function, so that the distance between the input pattern and the class to which it belongs becomes short, whereas the distance between the input pattern and any class to which it does not belong becomes long.
To make such a modification to the distance between the input pattern and each class, it is necessary to employ a function which produces negative values for patterns belonging to the class concerned and positive values for patterns not belonging to that class. To this end, it is possible to use the method disclosed in the literature [Kawatani, et al., "Improvement of Discriminant Function by Superposition of Distance Function and Linear Discriminant Function," '89 Autumn National Conference of the Institute of Electronics, Information and Communication Engineers of Japan, D-166, pp. 6-166 (1989)].
According to this method, the weighted Euclidean distance, the quadratic discriminant function or the like is used as the distance function. The training patterns are first discriminated with this original distance function, which yields, for each noticed class, the pattern set of that class and a set of patterns misread or nearly misread as that class (a rival pattern set). These two sets are then subjected to a discriminant analysis that uses as a variable either the difference between each component of the feature vector of the training pattern and the corresponding component of the reference pattern vector, or the square of that difference, whereby the intended function (called a discriminant function) is obtained. In the actual discrimination, the distance value of each class obtained with the original distance function is added to the value obtained with the discriminant function for that class, and the resulting sum is used to determine the class of the input pattern. A method for obtaining the rival patterns will be described later.
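The superposition used in the actual discrimination can be sketched as follows. This is a minimal illustration only: the discriminant analysis itself is not shown, the coefficients and biases in the models dictionary are invented values standing in for its result, and all function and key names are assumptions made for the example.

```python
def weighted_distance(x, mean, weights):
    """Original distance function (weighted Euclidean form, as an example)."""
    return sum(w * (xm - mum) ** 2 for xm, mum, w in zip(x, mean, weights))


def discriminant(x, mean, coeffs, bias, squared=False):
    """Linear function of the differences (or of their squares) from the mean."""
    variables = [(xm - mum) ** 2 if squared else (xm - mum)
                 for xm, mum in zip(x, mean)]
    return bias + sum(a * y for a, y in zip(coeffs, variables))


def combined_score(x, model):
    """Distance value plus discriminant function value for one class."""
    return (weighted_distance(x, model["mean"], model["weights"])
            + discriminant(x, model["mean"], model["coeffs"], model["bias"]))


def classify(x, models, score):
    """Select the class with the smallest score."""
    return min(models, key=lambda label: score(x, models[label]))


# Hypothetical per-class parameters; coeffs and bias are NOT fitted values.
models = {
    "A": {"mean": [0.0, 0.0], "weights": [1.0, 1.0],
          "coeffs": [0.0, -1.0], "bias": 0.0},
    "B": {"mean": [4.0, 0.0], "weights": [1.0, 1.0],
          "coeffs": [0.0, 1.0], "bias": 0.0},
}
x = [2.2, 3.0]
original_choice = classify(
    x, models, lambda v, m: weighted_distance(v, m["mean"], m["weights"]))
combined_choice = classify(x, models, combined_score)
```

With these invented parameters the original distance alone selects class B, while the added discriminant value lowers the score of class A and raises that of class B, so the combined value selects class A.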
In this conventional method, adding the value obtained with the discriminant function for each class to the distance obtained with the original distance function produces the same effect as modifying or correcting the weight vector or the reference pattern vector of the original distance function. That is, when the difference between each component of the feature vector of the training pattern and the corresponding component of the reference pattern vector is used as the variable in the discriminant analysis, the reference pattern vector is corrected, and when the square of the difference is used, the weight vector is corrected.
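This equivalence can be checked for the weighted Euclidean case with a short derivation. The symbols w_m (weights), a_m and b_m (discriminant coefficients) and c (constant term) are notation assumed here for illustration; only the weighted Euclidean distance is considered.

```latex
% Original distance for one class
D(x) = \sum_{m=1}^{M} w_m (x_m - \mu_m)^2

% Case 1: the squared difference is the variable
F(x) = \sum_{m=1}^{M} b_m (x_m - \mu_m)^2 + c
\;\Longrightarrow\;
D(x) + F(x) = \sum_{m=1}^{M} (w_m + b_m)(x_m - \mu_m)^2 + c

% Case 2: the difference itself is the variable (completing the square)
F(x) = \sum_{m=1}^{M} a_m (x_m - \mu_m) + c
\;\Longrightarrow\;
D(x) + F(x) = \sum_{m=1}^{M} w_m \left( x_m - \Bigl( \mu_m - \frac{a_m}{2 w_m} \Bigr) \right)^2
              + c - \sum_{m=1}^{M} \frac{a_m^2}{4 w_m}
```

In the first case each weight w_m is replaced by w_m + b_m while the reference pattern vector is unchanged; in the second case each reference component is shifted from \mu_m to \mu_m - a_m/(2 w_m) while the weights are unchanged, apart from a constant offset for the class.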
With this method, however, the weight vector and the reference pattern vector are corrected independently of each other. They therefore cannot be corrected in the optimum combination, which limits the improvement in recognition capability that can be achieved.