1. Field of the Invention
The present invention relates to technology for recognizing patterns such as characters by selecting features, that is, by efficiently reducing the number of dimensions of the feature vectors that represent a pattern.
2. Description of the Related Art
Recently, there has been strong demand for document recognizing technology for electronically filing documents, efficiently following an office work flow, and encoding data as necessary. In particular, character recognizing technology, which is one aspect of document recognizing technology, is essential for encoding character string information. To put character recognizing technology into practical use in various fields, a method of quickly estimating the character type while maintaining recognition precision is required. Selecting features by statistically reducing the number of dimensions of the feature vectors of input characters is effective in reducing the amount of computation needed for collation with a recognition dictionary. Character recognizing technology using such a feature selection method is therefore an important element in producing a practical document recognizing apparatus. It is also essential for producing devices that recognize various patterns other than characters.
Described first below is the general concept of pattern recognition, taking characters as an example of a pattern.
First, when a character pattern is input, its size is normalized.
Then, the rectangular character area obtained by the normalization is divided into a plurality of blocks. For example, a single rectangular character area is equally divided into 9 blocks in 3 rows by 3 columns or 36 blocks in 6 rows by 6 columns.
Next, the picture elements indicating the contour of a character (contour picture elements) existing in each block are extracted. For each picture element, the direction of the contour containing the picture element is determined. A contour picture element either corresponds directly to the character area or is obtained by applying a fine-line (thinning) process to the character area. The direction can be one of 8 directions (up, down, left, right, and 4 diagonal directions), or one of 36 more finely divided directions. Then, the number of contour picture elements is obtained for each direction in each block. As a result, a partial feature vector is obtained for each block. Its number of dimensions equals the number of directions, and each element value is the number of contour picture elements in the direction of that element. For each input character pattern, a feature vector comprising all elements of the partial feature vectors of all blocks contained in the corresponding rectangular character area is thus obtained.
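The block-wise direction counting described above can be sketched as follows. This is a minimal illustration, not the method of any particular apparatus: the function name `direction_histogram`, the input format (a list of `(row, col, direction)` tuples for contour picture elements in an already normalized square area), and the block-major ordering of the output are all assumptions made for the example.

```python
def direction_histogram(contour_pixels, size, blocks_per_side, num_directions):
    """Count contour picture elements per direction in each block.

    contour_pixels: iterable of (row, col, direction) tuples with
        0 <= row, col < size and 0 <= direction < num_directions
        (hypothetical input format).
    Returns a flat feature vector of length
        blocks_per_side * blocks_per_side * num_directions.
    """
    if size % blocks_per_side != 0:
        raise ValueError("size must be divisible by blocks_per_side")
    block_size = size // blocks_per_side
    features = [0] * (blocks_per_side * blocks_per_side * num_directions)
    for row, col, direction in contour_pixels:
        block_row = row // block_size
        block_col = col // block_size
        block_index = block_row * blocks_per_side + block_col
        # Each block owns num_directions consecutive elements.
        features[block_index * num_directions + direction] += 1
    return features


# A 6x6 area split into 3x3 blocks with 8 directions gives 9 * 8 = 72 dimensions.
pixels = [(0, 0, 2), (0, 1, 2), (5, 5, 7)]
vector = direction_histogram(pixels, size=6, blocks_per_side=3, num_directions=8)
print(len(vector), vector[2], vector[71])  # prints: 72 2 1
```

The first two picture elements fall in the top-left block with direction 2, and the third falls in the bottom-right block with direction 7, so only two elements of the 72-dimensional vector are non-zero.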
If the feature vectors obtained in this way for the character patterns are classified by object character type, the feature vectors of character patterns of the same character type group together and form a cluster for each character type in a multi-dimensional space whose number of dimensions corresponds to the number of elements of the feature vector. Based on this characteristic, the feature vectors of learning character patterns are classified, and an average feature vector representing the character type corresponding to each of the resultant clusters is computed from the feature vectors contained in the cluster. The average feature vector is computed by calculating an average value for each element of the feature vectors. The average feature vector for each character type is entered in a dictionary.
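The dictionary construction just described amounts to an element-wise average per character type. The sketch below assumes the learning samples are already labeled and held in a plain mapping; the name `build_dictionary` and the sample values are hypothetical.

```python
def build_dictionary(samples_by_type):
    """Return {character_type: average feature vector}.

    samples_by_type maps a character type to a non-empty list of
    equal-length feature vectors (hypothetical input format).
    """
    dictionary = {}
    for char_type, vectors in samples_by_type.items():
        count = len(vectors)
        # zip(*vectors) groups the values found at each element position.
        dictionary[char_type] = [sum(column) / count for column in zip(*vectors)]
    return dictionary


learning_samples = {
    "A": [[2.0, 0.0], [4.0, 2.0]],
    "B": [[0.0, 6.0], [0.0, 8.0], [3.0, 7.0]],
}
dictionary = build_dictionary(learning_samples)
print(dictionary)  # prints: {'A': [3.0, 1.0], 'B': [1.0, 7.0]}
```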
When a character is actually recognized, the feature vector of an input character pattern is computed, and then the distance between the feature vector and each of the average feature vectors entered in the dictionary is computed. The character type corresponding to the average feature vector at the shortest distance is estimated to be the type of the input character. The distance can be a Euclidean distance or a city block distance.
Computing the distance over the entire feature space takes time in proportion to the number of dimensions of the feature space. A well-known method of performing such computation at a high speed is to compute the distance after reducing the number of dimensions of the feature space (for example, reducing from 384 dimensions to 64 dimensions). The method of reducing the number of dimensions in a feature space is referred to as feature selection. A concrete method for the feature selection can be a canonical determination analysis (canonical discriminant analysis) or a main component analysis (principal component analysis). Experiments have confirmed that the recognition ratio is hardly reduced even when the feature selection leaves only about one-eighth of the original number of dimensions. On the contrary, the recognition ratio can even be enhanced, because the feature selection removes undesired noisy features.
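A minimal sketch of this nearest-average decision, with both distance definitions mentioned above, might look as follows. The function names and the two-dimensional dictionary values are illustrative assumptions.

```python
import math


def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def city_block_distance(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))


def recognize(feature_vector, dictionary, distance=euclidean_distance):
    """Return the character type whose average feature vector is nearest."""
    return min(dictionary, key=lambda t: distance(feature_vector, dictionary[t]))


# Hypothetical dictionary of average feature vectors.
dictionary = {"A": [3.0, 1.0], "B": [1.0, 7.0]}
print(recognize([2.0, 2.0], dictionary))                                # prints: A
print(recognize([2.0, 5.0], dictionary, distance=city_block_distance))  # prints: B
```

Both distance functions visit every element once, so the cost of one comparison grows with the number of dimensions of the feature vector.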
First, the conventional technology of recognizing a character through a canonical determination analysis for feature selection is described below by referring to the configuration shown in FIG. 1.
A feature extracting unit 101 extracts $n_i$ sets of learning feature vectors $\underline{x}_j^{(i)}$ ($1 \le j \le n_i$) represented by the following equation 1, as described above, from $n_i$ samples of character patterns contained in each character type $i$ ($1 \le i \le g$) of $g$ types. The superscript $T$ indicates the transposition of a matrix (or vector).
$$\underline{x}_j^{(i)} = (x_{jk}^{(i)}) = (x_{j1}^{(i)}, \ldots, x_{jN}^{(i)})^T \tag{1}$$
The subscript $k$ indicates the element number of a feature vector in the range of $1 \le k \le N$.
In the following description, an underlined symbol indicates a vector quantity, and a symbol that has an element number and no underline, as described above, indicates an element value of a vector.
A learning unit 102 computes an average feature vector $\underline{m}^{(i)}$ represented by the following equation 2, corresponding to each character type $i$ ($1 \le i \le g$) of $g$ types, from the above described feature vectors $\underline{x}_j^{(i)}$ ($1 \le j \le n_i$) corresponding to the character type $i$.
$$\underline{m}^{(i)} = (m_k^{(i)}) = (m_1^{(i)}, \ldots, m_N^{(i)})^T \tag{2}$$
The learning unit 102 computes an average feature vector (entire average feature vector) $\underline{m}$ for all character types, represented by the following equation 3, from the number $n_i$ of samples for each character type $i$ and the above described average feature vector $\underline{m}^{(i)}$.
$$\underline{m} = (m_k) = (m_1, \ldots, m_N)^T \tag{3}$$
Then, the learning unit 102 computes the inter-character-type variance matrix $S_b$ and the inner-character-type variance matrix $S_w$ based on the feature vector $\underline{x}_j^{(i)}$ for each character type $i$, the number of samples $n_i$, the above described average feature vector $\underline{m}^{(i)}$, and the entire average feature vector $\underline{m}$, as indicated by the following equations 4 through 7. The subscripts $p$ and $q$ indicate element numbers of the feature vector in the range of $1 \le p, q \le N$. ##EQU1##
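Equations 4 through 7 are not reproduced in this text, so the following sketch falls back on the common textbook definitions as an assumption: the inter-character-type matrix sums the outer products of (type average minus entire average) weighted by $n_i/n$, and the inner-character-type matrix sums the outer products of (sample minus type average) weighted by $1/n$, where $n$ is the total number of samples. The exact weighting in the original equations may differ, and the function names are hypothetical.

```python
def mean_vector(vectors):
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def add_outer(matrix, diff, weight):
    """Add weight * diff * diff^T to matrix in place."""
    for p, dp in enumerate(diff):
        for q, dq in enumerate(diff):
            matrix[p][q] += weight * dp * dq


def variance_matrices(samples_by_type):
    """Return (S_b, S_w) as N x N nested lists (assumed normalization)."""
    class_means = {t: mean_vector(v) for t, v in samples_by_type.items()}
    total = sum(len(v) for v in samples_by_type.values())
    dims = len(next(iter(class_means.values())))
    # Entire average: type averages weighted by their sample counts n_i.
    overall = [
        sum(len(samples_by_type[t]) * class_means[t][k] for t in class_means) / total
        for k in range(dims)
    ]
    s_b = [[0.0] * dims for _ in range(dims)]
    s_w = [[0.0] * dims for _ in range(dims)]
    for t, vectors in samples_by_type.items():
        mean = class_means[t]
        add_outer(s_b, [a - b for a, b in zip(mean, overall)], len(vectors) / total)
        for x in vectors:
            add_outer(s_w, [a - b for a, b in zip(x, mean)], 1.0 / total)
    return s_b, s_w


samples = {
    "A": [[1.0, 0.0], [3.0, 0.0]],
    "B": [[1.0, 4.0], [3.0, 4.0]],
}
s_b, s_w = variance_matrices(samples)
print(s_b)  # variance between types lies along the second element
print(s_w)  # variance within types lies along the first element
```

In this toy data the two types differ only in the second element and scatter only in the first, so each matrix has a single non-zero diagonal entry.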
Furthermore, using the above described inter-character-type variance matrix $S_b$ and inner-character-type variance matrix $S_w$, the learning unit 102 computes the $N$ characteristic vectors $\underline{\phi}_k$ (each having $N$ dimensions and a length of 1) and the corresponding eigenvalues $\lambda_k$ ($1 \le k \le N$) that satisfy the following equation 8.
$$S_b \underline{\phi}_k = \lambda_k S_w \underline{\phi}_k \quad (1 \le k \le N), \qquad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_N \tag{8}$$
Next, the learning unit 102 selects, from among the computed $N$ characteristic vectors $\underline{\phi}_k$, the $M$ ($M < N$) characteristic vectors $\underline{\phi}_h$ ($1 \le h \le M$) corresponding to the $M$ largest eigenvalues $\lambda_k$, and stores them in a characteristic vector storage unit 103.
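The selection step can be sketched on its own, assuming the eigenvalues and characteristic vectors have already been produced by some eigen-solver. The function name and the numeric values below are hypothetical and serve only to show the ordering.

```python
def select_characteristic_vectors(eigenvalues, vectors, m):
    """Return the m vectors with the largest eigenvalues, largest first."""
    if not 0 < m < len(eigenvalues):
        raise ValueError("m must satisfy 0 < m < N")
    # Indices of the eigenvalues sorted in descending order.
    order = sorted(range(len(eigenvalues)), key=lambda k: eigenvalues[k], reverse=True)
    return [vectors[k] for k in order[:m]]


# Hypothetical solver output for N = 3.
eigenvalues = [0.5, 3.0, 1.2]
vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
selected = select_characteristic_vectors(eigenvalues, vectors, m=2)
print(selected)  # prints: [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
```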
The learning unit 102 computes an $M$-dimensional average selection feature vector $\underline{m}^{(i)\prime}$ for each character type $i$ by computing the inner product of each of the above described $M$ characteristic vectors $\underline{\phi}_h$ ($1 \le h \le M$) and the $N$-dimensional average feature vector $\underline{m}^{(i)}$ for each character type $i$, as represented by the following equation 9. Then, it stores them in a recognition dictionary unit 104.
$$\underline{m}^{(i)\prime} = (m_h^{(i)\prime}) = (\underline{\phi}_1^T \underline{m}^{(i)}, \underline{\phi}_2^T \underline{m}^{(i)}, \ldots, \underline{\phi}_M^T \underline{m}^{(i)}) \tag{9}$$
where the subscript $h$ indicates an element number of a selection feature vector in the range of $1 \le h \le M$. Thus, a feature selection process is performed by reducing $N$ dimensions to $M$ dimensions for the average feature vector of each character type stored in the dictionary. This feature selection process is equivalent to projecting an $N$-dimensional average feature vector onto the $M$ coordinate axes defined by the $M$ characteristic vectors $\underline{\phi}_h$ so that the inter-character-type variance is increased and the inner-character-type variance is reduced. That is, in the space after the feature selection defined by the $M$ characteristic vectors $\underline{\phi}_h$, different character types are separated from each other and samples of the same character type are gathered together. In other words, in the canonical determination analysis, all clusters in the original feature space corresponding to all object character types are converted into a new space.
When a character is actually recognized, the feature extracting unit 101 extracts the $N$-dimensional feature vector $\underline{x}$ represented by the following equation 10 from an input character pattern whose character type is unknown.
$$\underline{x} = (x_k) = (x_1, \ldots, x_N)^T \tag{10}$$
where the subscript $k$ indicates the element number of the feature vector before the feature selection in the range of $1 \le k \le N$.
A feature selecting unit 105 computes the $M$-dimensional selection feature vector $\underline{y}$ by computing the inner product of each of the $M$ characteristic vectors $\underline{\phi}_h$ ($1 \le h \le M$) stored in the characteristic vector storage unit 103 and the $N$-dimensional feature vector $\underline{x}$, according to the following equation 11.
$$\underline{y} = (y_h) = (\underline{\phi}_1^T \underline{x}, \underline{\phi}_2^T \underline{x}, \ldots, \underline{\phi}_M^T \underline{x}) \tag{11}$$
Thus, the feature selection process is performed by reducing N dimensions into M dimensions for an input feature vector.
Finally, a collating unit 106 computes, for each character type $i$, the Euclidean distance $d^{(i)}$ between the $M$-dimensional selection feature vector $\underline{y}$ and the average selection feature vector $\underline{m}^{(i)\prime}$ stored in the recognition dictionary unit 104, using the following equation 12. ##EQU2##
Then, the collating unit 106 outputs the character type $i$ corresponding to the average selection feature vector $\underline{m}^{(i)\prime}$ having the shortest distance $d^{(i)}$ as the estimated character type.
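The recognition path through the feature selecting unit and the collating unit can be sketched as below. Equation 12 is not reproduced in this text; the sketch assumes it is the ordinary Euclidean distance $d^{(i)} = \{\sum_{h=1}^{M} (y_h - m_h^{(i)\prime})^2\}^{1/2}$. The function names, the characteristic vectors, and the average feature vectors are hypothetical values chosen so the result is easy to check by hand.

```python
import math


def project(characteristic_vectors, vector):
    """Equations 9 and 11: one inner product per characteristic vector."""
    return [sum(p * v for p, v in zip(phi, vector)) for phi in characteristic_vectors]


def collate(y, selected_dictionary):
    """Return (character type, distance) of the nearest average selection vector."""
    def distance(char_type):
        mean = selected_dictionary[char_type]
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, mean)))

    best = min(selected_dictionary, key=distance)
    return best, distance(best)


# Hypothetical characteristic vectors (M = 2, N = 3) and average feature vectors.
phis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
averages = {"A": [0.0, 0.0, 5.0], "B": [4.0, 3.0, 1.0]}

# Learning side: project each average once and keep only M values per type.
selected_dictionary = {t: project(phis, m) for t, m in averages.items()}

# Recognition side: project the unknown input, then compare in M dimensions.
y = project(phis, [3.0, 3.0, 9.0])
print(collate(y, selected_dictionary))  # prints: ('B', 1.0)
```

The dictionary is projected once at learning time, so at recognition time only the input vector is projected and every comparison uses $M$ terms instead of $N$.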
According to the above described conventional technology using the canonical determination analysis as feature selection, the number of elements used in computing the distance is reduced from N terms to M terms. Therefore, the recognition speed can be greatly increased by setting M approximately to one-eighth of N.
However, according to the conventional technology using the canonical determination analysis as feature selection, it is not guaranteed that the $M$ characteristic vectors $\underline{\phi}_h$ ($1 \le h \le M$) are orthogonal to one another. Suppose that a new feature space is defined based on these characteristic vectors $\underline{\phi}_h$, that the feature vector $\underline{x}$ of an object character is projected onto the $M$ coordinate axes corresponding to the $M$ characteristic vectors $\underline{\phi}_h$, and that the Euclidean distance between the projection result, that is, the selection feature vector $\underline{y}$, and the average selection feature vector $\underline{m}^{(i)\prime}$ for each character type $i$ is calculated. The resulting distance may then be considerably different from the distance in the original $N$-dimensional feature space.
For ease of understanding, assume that the number of dimensions before feature selection is 3 and the number of dimensions after the feature selection is 2 as shown in FIG. 2.
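Why orthogonality is not guaranteed can be seen from equation 8 itself. The short derivation below is a standard linear-algebra argument rather than part of the original description. It assumes only that $S_b$ and $S_w$ are symmetric, as variance matrices are. Multiplying equation 8 for index $k$ on the left by $\underline{\phi}_l^T$, and equation 8 for index $l$ on the left by $\underline{\phi}_k^T$, gives

$$\underline{\phi}_l^T S_b \underline{\phi}_k = \lambda_k \, \underline{\phi}_l^T S_w \underline{\phi}_k, \qquad \underline{\phi}_k^T S_b \underline{\phi}_l = \lambda_l \, \underline{\phi}_k^T S_w \underline{\phi}_l$$

By symmetry the two left-hand sides are equal, and so are the two products with $S_w$, so subtracting yields

$$(\lambda_k - \lambda_l) \, \underline{\phi}_l^T S_w \underline{\phi}_k = 0$$

For distinct eigenvalues this gives $\underline{\phi}_l^T S_w \underline{\phi}_k = 0$. The characteristic vectors are therefore orthogonal with respect to $S_w$, which coincides with ordinary orthogonality $\underline{\phi}_l^T \underline{\phi}_k = 0$ only in special cases, for example when $S_w$ is a multiple of the identity matrix.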
The Euclidean distance $d_{\mathrm{org}}$ between the feature vector $\underline{x}_2$ and the feature vector $\underline{x}_1$ in the 3-dimensional feature space before the feature selection is obtained by the following equation 13.
$$d_{\mathrm{org}} = \|\underline{x}_2 - \underline{x}_1\| = \left\{ (\underline{a}_1^T \underline{x}_2 - \underline{a}_1^T \underline{x}_1)^2 + (\underline{a}_2^T \underline{x}_2 - \underline{a}_2^T \underline{x}_1)^2 + (\underline{a}_3^T \underline{x}_2 - \underline{a}_3^T \underline{x}_1)^2 \right\}^{1/2} \tag{13}$$
The approximate Euclidean distance $d_{\mathrm{new}}$ between the feature vector $\underline{x}_2$ and the feature vector $\underline{x}_1$ in the 2-dimensional feature space after the feature selection is obtained by the following equation 14.
$$d_{\mathrm{new}} = \left\{ (\underline{\phi}_1^T \underline{x}_2 - \underline{\phi}_1^T \underline{x}_1)^2 + (\underline{\phi}_2^T \underline{x}_2 - \underline{\phi}_2^T \underline{x}_1)^2 \right\}^{1/2} \tag{14}$$
The quantity represented by each term on the right-hand side of equation 14 is shown in FIG. 2. As shown in FIG. 2, equation 14 is not based on the Pythagorean theorem, because the axes defined by $\underline{\phi}_1$ and $\underline{\phi}_2$ are not necessarily orthogonal to each other. The Euclidean distance $d_{\mathrm{new}}$ in the 2-dimensional feature space after the feature selection can therefore be quite different from the Euclidean distance $d_{\mathrm{org}}$ in the 3-dimensional feature space before the feature selection.
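A small numeric check makes the distortion concrete. The vectors below are hypothetical and are not taken from FIG. 2. The sketch evaluates the form of equation 14 once with two orthonormal axes and once with two unit-length axes that are not orthogonal.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def projected_distance(axes, x1, x2):
    """Equation 14 form: root of the summed squared projection differences."""
    return math.sqrt(sum((dot(a, x2) - dot(a, x1)) ** 2 for a in axes))


x1 = [0.0, 0.0, 0.0]
x2 = [1.0, 0.0, 0.0]
d_org = math.sqrt(sum((b - a) ** 2 for a, b in zip(x1, x2)))

orthonormal_axes = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
oblique_axes = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0]]  # unit length, not orthogonal

d_orthonormal = projected_distance(orthonormal_axes, x1, x2)
d_oblique = projected_distance(oblique_axes, x1, x2)
print(d_org, d_orthonormal, d_oblique)  # 1.0, 1.0, and about 1.166
```

With orthonormal axes the projected distance can never exceed the original distance. With the oblique pair, the difference along the first element is counted by both axes, so the computed distance comes out larger than the true distance between the two vectors.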
According to the conventional technology using the canonical determination analysis as feature selection, the set of projections of the feature vector $\underline{x}$ onto the $M$ characteristic vectors $\underline{\phi}_h$ ($1 \le h \le M$) obtained by the canonical determination analysis does not constitute a projection of the feature vector $\underline{x}$ onto a partial eigenspace of the original feature space, and recognition precision is degraded as a result. Therefore, it is very hard to realize a character recognizing apparatus, etc. having practical recognition precision.
In the main component analysis, which is another method for feature selection, a set of main component vectors that separates the character types from one another is computed for each character type. This analysis method is used, not to classify characters for the purpose of determining plural types of characters, but to obtain a correct recognition result by projecting the feature vector of an object character onto the main component vectors corresponding to each character type when similar character types exist in areas close to each other in a feature space. That is, in the main component analysis, a new individual space having the main component vectors as coordinate axes is generated for each character type. Since the distance between an object character and each character type is computed after the feature vector of the object character is projected onto the main component vectors of each character type, a large amount of computation is required when a large number of character types are involved. Therefore, the analysis method is used mainly for discriminating among a small number of character types, for example, numerical characters.
Other definitions of the distance between feature vectors include the Mahalanobis distance and the Bayes distance. When a recognizing apparatus is designed using these distances, the distance is computed under the constraint that the coordinate axes used in computing the distance are orthogonal to each other. Therefore, the problems with the canonical determination analysis do not arise with this method, but there is the problem that the computation is complicated and a large amount of computation is required.
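The growth in computation can be illustrated with a rough count of multiplications. The cost model below is an assumption made for this sketch: it counts only the inner products for projection and the squared terms of the distance, and it supposes that every character type keeps its own $M$ main component vectors. The function names and the number of character types are hypothetical.

```python
def shared_space_multiplications(n, m, g):
    """One projection into a common M-dimensional space, then g distances."""
    return m * n + g * m


def per_type_multiplications(n, m, g):
    """A separate projection and distance for each of the g character types."""
    return g * (m * n + m)


n, m = 384, 64  # dimension counts from the reduction example given earlier
g = 3000        # hypothetical number of character types

shared = shared_space_multiplications(n, m, g)
per_type = per_type_multiplications(n, m, g)
print(shared, per_type)  # prints: 216576 73920000
```

Under this model the per-type cost grows with the product of the number of character types and $N$, which is consistent with restricting the method to a small number of character types.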
The above described problems are not necessarily related only to a character recognizing unit, but are common to the technology for recognizing various patterns such as image patterns, voice patterns, etc. through feature vectors.