1. Field of the Invention
The present invention relates to a technology for recognizing a character overlapped with a pattern.
2. Description of the Related Art
In an electronic document such as a spreadsheet or a checklist, a character image sometimes overlaps or touches a line in the document. The character image must still be recognized in such a case, and Japanese Patent No. 2871590, for example, discloses a technology for this purpose. In general, character recognition is performed by removing the line from a character image in which one or more characters overlap or touch the line, and then compensating for the removed portion.
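The general remove-and-compensate approach can be illustrated with a minimal sketch. It is not the method of the cited patent. It assumes a binary image held as a list of rows (1 for black, 0 for white) and a single horizontal line, and the function name `remove_horizontal_line`, the `min_fill` threshold, and the bridging rule are all hypothetical choices made for illustration.

```python
def remove_horizontal_line(image, min_fill=0.9):
    """Erase mostly-black rows, then restore pixels that bridge a stroke.

    Assumption: a row counts as part of a line when at least `min_fill`
    of its pixels are black. This is an illustrative rule only.
    """
    height = len(image)
    width = len(image[0])

    # Step 1: find the rows that look like a horizontal line.
    line_rows = [y for y, row in enumerate(image) if sum(row) >= min_fill * width]
    line_set = set(line_rows)

    # Step 2: remove the line by clearing those rows in a copy.
    result = [row[:] for row in image]
    for y in line_rows:
        result[y] = [0] * width

    # Step 3: compensate. A removed pixel is restored only when the nearest
    # non-line rows above and below are both black in the same column,
    # i.e. a stroke appears to pass straight through the line.
    for y in line_rows:
        above = y - 1
        while above in line_set:
            above -= 1
        below = y + 1
        while below in line_set:
            below += 1
        if above < 0 or below >= height:
            continue  # no stroke evidence on one side; leave the gap empty
        for x in range(width):
            if image[above][x] and image[below][x]:
                result[y][x] = 1
    return result


# A vertical stroke crossing a horizontal line in row 2.
sample = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
]
restored = remove_horizontal_line(sample)
```

Under this rule, a stroke that merely ends on the line, or that crosses it diagonally, would not be restored, so the result depends strongly on how the character meets the line.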
With the conventional technology, however, it is difficult to compensate for the removed portion, because a character image can touch a line in various ways, and because character images include not only letters of the alphabet and numerals, which have simple shapes, but also characters with complicated shapes, such as Chinese characters (see FIG. 15, for example).
For this reason, Japanese Patent No. 3455649, for example, discloses a technology for recognizing a character image in a document. According to this conventional technology, features of line-touching character images are stored in advance as a dictionary, and character recognition is performed based on the dictionary without removing the line from the line-touching character image.
In this conventional technology, however, in which a character image is recognized without removing the line from the line-touching character image, a large memory capacity is required for the dictionary. Specifically, the information stored as the dictionary must cover the types of lines, such as a bold line, a thin line, a broken line, and a dashed-dotted line, as well as the possible crossing positions of a line and a character image, such as the top, bottom, left, right, and inside of the character, which increases the size of the dictionary to be stored.
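The growth in dictionary size can be made concrete with a rough count. The sketch below assumes, purely for illustration, that the dictionary holds one feature entry per character for every combination of line type and crossing position named above. The function name `dictionary_entries` and the character-set size of 3000 are hypothetical and do not come from the cited patent.

```python
from itertools import product

# Line types and crossing positions listed in the description above.
LINE_TYPES = ["bold", "thin", "broken", "dashed-dotted"]
CROSSING_POSITIONS = ["top", "bottom", "left", "right", "inside"]


def dictionary_entries(num_characters):
    """Count entries, assuming one entry per character per combination."""
    variants = list(product(LINE_TYPES, CROSSING_POSITIONS))
    return num_characters * len(variants)


# Hypothetical character set of 3000 characters.
entries = dictionary_entries(3000)
```

Under these assumptions, each character needs 20 entries instead of one, so the dictionary grows in proportion to the number of line types multiplied by the number of crossing positions.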