1. Field of the Invention
The present invention relates to an apparatus for recognizing patterns, such as characters, included in the information of a recognition target, and to a method thereof.
2. Description of the Related Art
A typical conventional character recognition method is as follows. First, a feature is extracted from an input character pattern by a specific predetermined method as a set of numeric values, that is, a vector. In other words, feature extraction maps the input pattern to a point in a feature vector space. Then, the distance between the feature vector of the input pattern and the representative point of each category in the vector space is calculated, and the closest category is designated as the recognition result.
The representative point of each category in the vector space is the average of the feature vectors of the sample patterns prepared for that category. As the distance measure, a city block distance, a Euclidean distance and the like are used.
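The conventional procedure described above can be illustrated with a minimal sketch. The function names, the toy two-dimensional feature vectors and the category labels "A" and "B" are assumptions made for illustration only; feature extraction itself is not shown.

```python
from math import sqrt


def mean_vector(samples):
    """Representative point: component-wise average of the sample feature vectors."""
    count = len(samples)
    return [sum(column) / count for column in zip(*samples)]


def city_block(a, b):
    """City block (L1) distance between two feature vectors."""
    return sum(abs(p - q) for p, q in zip(a, b))


def euclidean(a, b):
    """Euclidean (L2) distance between two feature vectors."""
    return sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def classify(x, representatives, distance=euclidean):
    """Return the category whose representative point is closest to x."""
    return min(representatives, key=lambda c: distance(x, representatives[c]))


# Hypothetical sample feature vectors for two categories (illustrative values).
training = {
    "A": [[0.0, 0.0], [2.0, 0.0], [1.0, 3.0]],
    "B": [[10.0, 10.0], [12.0, 10.0], [11.0, 13.0]],
}
representatives = {c: mean_vector(s) for c, s in training.items()}

result_euclidean = classify([2.0, 2.0], representatives)
result_city_block = classify([9.0, 9.0], representatives, distance=city_block)
```

Because only one representative point is stored per category, this scheme ignores how the sample vectors of a category are spread around their average.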
“Hand-written Kanji/Hiragana Recognition by a Weighted Directional Index Histogram Method” by Tsuruoka et al. (Paper Journal D of The Institute of Electronics, Information and Communication Engineers, Vol. J70-D, No. 7, pp. 1390–1397, July 1987) proposes a method that uses, instead of a simple distance, a modified Bayes discriminant function reflecting the distribution of each category in the feature vector space.
This method is obtained by modifying the Bayes discriminant function, which is an optimal discriminant function when sample patterns follow a normal distribution and both the average and the covariance matrix are already known, so as to solve theoretical and implementation problems. These problems include the fact that the higher the order of an eigenvector of the covariance matrix, the larger its estimation error, and the fact that a huge amount of calculation and a huge memory capacity are needed. If the (n-dimensional) feature vector of an input pattern is assumed to be x, the Bayes discriminant function f_c(x) for a category C and the modified Bayes discriminant function g_c(x) are defined as follows.

(1) Bayes Discriminant Function

\[
f_c(x) = (x - m_c)^{t} \, \Sigma_c^{-1} \, (x - m_c) + \log \lvert \Sigma_c \rvert \tag{1}
\]

m_c: Average vector of category C
Σ_c: Covariance matrix of category C

(2) Modified Bayes Discriminant Function

\[
g_c(x) = \frac{1}{\alpha_c^{k+1}} \left\{ \lVert x - m_c \rVert^{2} - \sum_{i=1}^{k} \left( 1 - \frac{\alpha_c^{k+1}}{\alpha_c^{i}} \right) \left( (x - m_c) \cdot v_c^{i} \right)^{2} \right\} + \log \left( \prod_{i=1}^{k} \alpha_c^{i} \cdot \prod_{i=k+1}^{n} \alpha_c^{k+1} \right) \tag{2}
\]

α_c^i: i-th eigenvalue of Σ_c (the superscript is an index, not an exponent)
v_c^i: Eigenvector corresponding to the i-th eigenvalue of Σ_c
k: Integer between 1 and n, including 1 and n
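As an illustration of equations (1) and (2), the following minimal sketch evaluates both functions from the eigenvalues and eigenvectors of Σ_c. It rests on several assumptions not stated in the text: the eigenvalues are supplied in descending order with orthonormal eigenvectors, equation (1) is evaluated through the eigendecomposition rather than an explicit matrix inverse, and k is restricted to 1 ≤ k < n so that the (k+1)-th eigenvalue exists. All function and variable names are hypothetical.

```python
from math import log


def dot(a, b):
    """Inner product of two vectors."""
    return sum(p * q for p, q in zip(a, b))


def bayes_discriminant(x, mean, eigvals, eigvecs):
    """Equation (1), using the eigendecomposition of the covariance matrix.

    Assumes orthonormal eigenvectors, so the quadratic form becomes
    sum_i ((x - m) . v_i)^2 / alpha_i and log|Sigma| = sum_i log(alpha_i).
    """
    d = [xi - mi for xi, mi in zip(x, mean)]
    quad = sum(dot(d, v) ** 2 / a for a, v in zip(eigvals, eigvecs))
    return quad + sum(log(a) for a in eigvals)


def modified_bayes_discriminant(x, mean, eigvals, eigvecs, k):
    """Equation (2): only the first k eigenpairs are used individually;
    eigenvalues k+1 .. n are all replaced by the (k+1)-th eigenvalue."""
    n = len(x)
    if not 1 <= k < n:
        # Assumption of this sketch: alpha^{k+1} must exist.
        raise ValueError("this sketch requires 1 <= k < n")
    d = [xi - mi for xi, mi in zip(x, mean)]
    a_k1 = eigvals[k]  # alpha^{k+1} (0-based index k)
    correction = sum(
        (1.0 - a_k1 / eigvals[i]) * dot(d, eigvecs[i]) ** 2 for i in range(k)
    )
    quad = (dot(d, d) - correction) / a_k1
    log_term = sum(log(eigvals[i]) for i in range(k)) + (n - k) * log(a_k1)
    return quad + log_term


# Illustrative values: a diagonal covariance matrix with eigenvalues 4, 2, 1.
mean = [0.0, 0.0, 0.0]
eigvals = [4.0, 2.0, 1.0]
eigvecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
x = [2.0, 1.0, 1.0]

f_value = bayes_discriminant(x, mean, eigvals, eigvecs)
g_full = modified_bayes_discriminant(x, mean, eigvals, eigvecs, k=2)
g_truncated = modified_bayes_discriminant(x, mean, eigvals, eigvecs, k=1)
```

Under these assumptions, choosing k = n − 1 makes equation (2) coincide with equation (1), while a smaller k needs only the first k eigenvectors and k + 1 eigenvalues per category, which is where the savings in calculation and memory come from.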
However, the conventional pattern recognition described above has the following problems.
Even if a modified Bayes discriminant function is used, recognition accuracy is poor both for fonts that are greatly deformed compared with the Mincho style, which is the most popular font for Japanese, and for characters in a document that is greatly degraded depending on input/output conditions. If a greatly deformed font is also degraded, the recognition accuracy decreases further.