The present invention relates to an apparatus for recognizing characters, and particularly to an improvement in recognition accuracy thereof.
One conventional apparatus for recognizing characters of this type is the template matching system. This apparatus converts characters printed or typewritten on paper into electric binary video signals through a photoelectric converter element, separates successive characters from one another, normalizes them so that each character has a predetermined size, compares and collates them with a reference dictionary which stores reference figures, selects from the dictionary the category whose reference figure is most similar to the normalized character, and regards that category as the result of character recognition.
According to this apparatus for recognizing characters, however, characters having small widths, such as "l", "i", "j", "r" and the like, are not normalized in the direction of width. This is because, if these narrow characters were normalized in width, they might be recognized as the category "l". Consequently, when narrow characters other than the above-mentioned "l", "i", "j", "r" are input, they too are not normalized in width, and may be rejected or erroneously recognized. These inconveniences will now be described in conjunction with the drawings.
FIG. 1A shows examples of input figures (numerals 1, 2, 3) in meshes. Prior to recognizing the characters (figures), the input figures are cut out one character at a time, and the height and width are detected for each of the characters. In FIG. 1A, the numeral "1" has a height of 21 mesh and a width of 6 mesh, the numeral "2" has a height of 21 mesh and a width of 12 mesh, and the numeral "3" has a height of 22 mesh and a width of 8 mesh. Means for cutting out the characters or for detecting the shapes of characters are widely known in the art and are not described here. FIG. 1B shows the projection result employed for detecting the width, i.e., it shows figures in which logical products are found in the vertical direction.
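The detection of height and width by projection can be illustrated with a minimal sketch. It is not the cutting-out means of the apparatus, which the text leaves to the known art. The sketch assumes a character is given as a list of rows of 0 (white) and 1 (black) meshes, and assumes that a column or row counts as occupied when at least one of its meshes is black. The names column_projection, row_projection, extent and measure are hypothetical.

```python
def column_projection(grid):
    """One flag per column: 1 if any mesh in that column is black (assumed rule)."""
    return [1 if any(row[x] for row in grid) else 0 for x in range(len(grid[0]))]


def row_projection(grid):
    """One flag per row: 1 if any mesh in that row is black (assumed rule)."""
    return [1 if any(row) else 0 for row in grid]


def extent(projection):
    """Return (first, last, length) of the occupied span, or None if it is empty."""
    occupied = [i for i, flag in enumerate(projection) if flag]
    if not occupied:
        return None
    return occupied[0], occupied[-1], occupied[-1] - occupied[0] + 1


def measure(grid):
    """Return (height, width) of the character in mesh units; (0, 0) if blank."""
    rows = extent(row_projection(grid))
    cols = extent(column_projection(grid))
    if rows is None or cols is None:
        return 0, 0
    return rows[2], cols[2]
```

For a figure such as the numeral "1" of FIG. 1A, measure would be expected to report a height of 21 and a width of 6, provided the input grid contains only that character.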
If input characters having widths of 10 mesh or less are not normalized in the lateral direction but are normalized in the direction of height only, the normalized figures are as shown in FIGS. 2A, 2B and 2C. The figures after normalization have a size of 16×16 mesh.
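The conventional normalization rule described above can be sketched as follows. This is an illustration under stated assumptions, not the circuit of the apparatus: scaling uses nearest-neighbour sampling, and a narrow character is centred horizontally in the 16-mesh frame. Neither choice is specified in the text. The names normalize, SIZE and NARROW_MAX_WIDTH are hypothetical; the values 16 and 10 are taken from the description.

```python
SIZE = 16              # side of the normalized figure, in mesh
NARROW_MAX_WIDTH = 10  # widths at or below this are not scaled laterally


def normalize(char, size=SIZE, narrow_max=NARROW_MAX_WIDTH):
    """Normalize a tightly cropped character (list of rows of 0/1) to size x size.

    Assumes narrow_max <= size, so an unscaled narrow character fits the frame.
    """
    height, width = len(char), len(char[0])
    out = [[0] * size for _ in range(size)]
    if width <= narrow_max:
        # Narrow character: scale the height only and keep the original width.
        # Centring in the frame is an assumption made for this sketch.
        offset = (size - width) // 2
        for y in range(size):
            src_y = y * height // size
            for x in range(width):
                out[y][offset + x] = char[src_y][x]
    else:
        # Ordinary character: scale both height and width to the frame.
        for y in range(size):
            src_y = y * height // size
            for x in range(size):
                out[y][x] = char[src_y][x * width // size]
    return out
```

Under this rule the numerals "1" (width 6) and "3" (width 8) keep their original widths, while the numeral "2" (width 12) is stretched to the full 16 mesh.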
FIGS. 3A to 3F illustrate examples of references stored in the reference dictionary in the template matching system. FIGS. 3A and 3B illustrate reference figures of the numeral "1", wherein marks "X" in FIG. 3A denote points that should be black and marks "X" in FIG. 3B denote points that should be white. Similarly, FIGS. 3C, 3D and 3E, 3F illustrate reference figures of the numerals "2" and "3", respectively.
Here, if FIGS. 2 and 3 are compared and collated, the non-coincident numbers (the numbers of points that disagree with the reference figures) for "1" and "2" are 1 and 0, respectively. For "3", however, the non-coincident number is 30 or more. Namely, although "1" and "2" are properly recognized, the reading of "3" is rejected.
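The collation against pairs of reference figures can be sketched as below. The sketch assumes each category is stored as two masks of the same size as the normalized figure, one marking points that should be black and one marking points that should be white, and that the non-coincident number is the count of marked points that disagree. The rejection limit reject_at is a hypothetical parameter; the text states only that a count of 30 or more led to rejection, not the threshold actually used. The names mismatch_count and recognize are likewise hypothetical.

```python
def mismatch_count(figure, must_be_black, must_be_white):
    """Count points where the figure disagrees with either reference mask."""
    count = 0
    for y, row in enumerate(figure):
        for x, pixel in enumerate(row):
            if must_be_black[y][x] and not pixel:
                count += 1  # point should be black but is white
            elif must_be_white[y][x] and pixel:
                count += 1  # point should be white but is black
    return count


def recognize(figure, dictionary, reject_at=30):
    """Return (category, count) for the best match, or (None, count) on rejection.

    dictionary maps a category to (must_be_black, must_be_white).
    reject_at is an assumed threshold, not a value given in the text.
    """
    best_category, best_count = None, None
    for category, (black, white) in dictionary.items():
        count = mismatch_count(figure, black, white)
        if best_count is None or count < best_count:
            best_category, best_count = category, count
    if best_count is None or best_count >= reject_at:
        return None, best_count
    return best_category, best_count
```

Points left unmarked in both masks do not contribute to the count, which gives each reference figure some tolerance to variations in stroke position.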