1. Field of the Invention
This invention relates to a mathematical expression recognizing device and a mathematical expression recognizing method, as well as to a character recognizing device and a character recognizing method, that can be used for recognizing a document image containing mathematical expressions.
2. Description of the Related Art
Reports on character recognition for printed documents containing mathematical expressions, and on recognition of the structures of mathematical expressions, have been produced for some time, although the number of such reports is not very large. The characters to be recognized in a mathematical expression are not necessarily arranged one-dimensionally. Rather, they are more often than not arranged two-dimensionally, as is ordinary practice for indexes, exponents, fractions and so on. Therefore, means must be provided for recognizing (determining) not only the characters included in or relating to mathematical expressions but also the structures (positional information) of the mathematical expressions, in order to know whether each character is placed there as an index, an exponent, a denominator, a numerator or something else. Thus, recognizing a mathematical expression by means of a computer requires a much longer processing time than recognizing ordinary characters.
Reports on achievements that have made it possible to recognize the structure of a mathematical expression within a practical processing time include documents [1], [2] and [3] listed below. According to these documents, a rule is defined for determining the positional relationships of the characters in a mathematical expression, including that between upper and lower characters. Each character is then judged to be an ordinary character, an index, an exponent, a denominator, a numerator or something else according to its position by referring to the rule, and the structure of the mathematical expression is recognized in this way.
Document [1]: Masayuki Okamoto, Hashim Msafiri Twaayondo, “Structure Recognition of Mathematical Expressions Using Peripheral Distribution Features”, Transactions of the Institute of Electronics, Information and Communication Engineers, D-II, Vol. J78-D-II, No. 2, pp. 366–370 (1995).
Document [2]: Masayuki Okamoto, Hiroyuki Azuma, “Recognition of Mathematical Expressions with Emphasis on the Layout of Signs”, Transactions of the Institute of Electronics, Information and Communication Engineers, D-II, Vol. J78-D-II, No. 3, pp. 474–482 (1995).
Document [3]: R. J. Fateman, T. Tokuyasu, B. P. Berman and N. Mitchell, “Optical Character Recognition and Parsing of Typeset Mathematics”, Journal of Visual Communication and Image Representation, Vol. 7, No. 1, pp. 2–15 (1995).
However, with the prior art, including the known techniques of the above listed documents, each character is judged to be an ordinary character, an index, an exponent, a denominator, a numerator or something else based only on local characteristics. Therefore, if the position of one character is misjudged, the error significantly affects all subsequent judgments. For example, if an ordinary character is misjudged to be an index, all the ordinary characters that follow it on the same level as the misjudged character are also misjudged to be indexes. In short, a local misrecognition within a mathematical expression can greatly damage the recognition of its entire structure.
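The propagation of a local misjudgment described above can be illustrated with a minimal sketch. It is a hypothetical toy rule, not the method of documents [1] to [3]: the names `Glyph` and `classify_locally`, the 0.3 threshold and the sample coordinates are all assumptions made for illustration. Each character is judged only against the character immediately before it, so one borderline vertical offset changes the level assigned to every character that follows.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Glyph:
    """Hypothetical character box: only the fields the toy rule needs."""
    char: str
    y_center: float  # vertical center; larger values are lower on the page
    height: float    # box height, used to normalize vertical offsets


def classify_locally(glyphs: List[Glyph], threshold: float = 0.3) -> List[int]:
    """Assign a level to each glyph using only its predecessor.

    Level 0 is the ordinary line, -1 an index (subscript), +1 an exponent.
    The threshold is an assumed value, not taken from any cited document.
    """
    levels: List[int] = []
    level = 0
    prev = None
    for g in glyphs:
        if prev is not None:
            # Vertical offset relative to the previous glyph only.
            shift = (g.y_center - prev.y_center) / prev.height
            if shift > threshold:
                level -= 1  # noticeably lower: step down toward an index
            elif shift < -threshold:
                level += 1  # noticeably higher: step up toward an exponent
            # Otherwise the glyph inherits the previous glyph's level.
        levels.append(level)
        prev = g
    return levels


# All four characters are really ordinary characters on one line.
clean = [Glyph("x", 10.0, 10.0), Glyph("y", 12.5, 10.0),
         Glyph("+", 11.5, 10.0), Glyph("z", 11.0, 10.0)]
# Same line, but the center of "y" is measured slightly lower.
noisy = [Glyph("x", 10.0, 10.0), Glyph("y", 13.2, 10.0),
         Glyph("+", 11.5, 10.0), Glyph("z", 11.0, 10.0)]

print(classify_locally(clean))  # [0, 0, 0, 0]
print(classify_locally(noisy))  # [0, -1, -1, -1]
```

In the second case only "y" crosses the threshold, yet "+" and "z" inherit its level and are labelled as indexes as well, because neither moves far enough back up relative to its predecessor to be stepped up again.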
Additionally, the known techniques of the above listed documents relate only to character recognition within a mathematical expression and do not disclose any technique for detecting a mathematical expression in a text.