1. Field of the Invention
The present invention relates to a system and method for optical character recognition (OCR) that normalizes character sizes. As a pre-processing step for character recognition, characters of various sizes are converted into characters of a predetermined size in order to improve the character recognition rate.
2. Prior Art
For character recognition (particularly pattern matching), characters are normalized by some method. However, when every character is simply scaled to fill a predefined result buffer, a period ".", the numeral one "1" and a hyphen "-" all appear as black blocks. Characters that are written especially small must therefore remain smaller than ordinary characters after normalization is performed. Further, when the aspect ratio (the value obtained by dividing the character height by the character width) is held constant, characters of the same type may be difficult to match depending on whether they are written bold or thin. When characters are normalized to a center location, an underscore "_" and a hyphen "-" are indistinguishable.
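The black-block problem can be illustrated with a minimal sketch. It assumes a character is a list of rows of 0/1 pixels and that "full" normalization means stretching the character's bounding box over the entire result buffer by nearest-neighbor sampling. The function names, the 8x8 buffer size and the sample bitmaps are hypothetical and are not taken from any described system.

```python
def crop_to_bounding_box(bitmap):
    """Return the smallest sub-bitmap containing every black (1) pixel.

    Assumes the bitmap is rectangular and contains at least one black pixel.
    """
    rows = [y for y, row in enumerate(bitmap) if any(row)]
    cols = [x for x in range(len(bitmap[0])) if any(row[x] for row in bitmap)]
    return [row[cols[0]:cols[-1] + 1] for row in bitmap[rows[0]:rows[-1] + 1]]


def full_normalize(bitmap, size=8):
    """Stretch the bounding box to fill a size x size buffer (nearest neighbor)."""
    glyph = crop_to_bounding_box(bitmap)
    h, w = len(glyph), len(glyph[0])
    return [[glyph[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]


# Hypothetical 3x3 inputs: a dot, a thin vertical stroke, a thin horizontal stroke.
period = [[0, 0, 0], [0, 0, 0], [0, 1, 0]]
one = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
hyphen = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]

black_block = [[1] * 8 for _ in range(8)]
print(full_normalize(period) == black_block)  # True
print(full_normalize(one) == black_block)     # True
print(full_normalize(hyphen) == black_block)  # True
```

Because the bounding box of each of these characters is entirely black, stretching it to fill the buffer discards both the size and the shape information that would distinguish them.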
According to a conventional normalization method, when the height or the width of a character exceeds a specific threshold value, the longer axis of the character is scaled to the maximum size of the result buffer, and the other axis is scaled so as to maintain the aspect ratio and is centered in the buffer. With this method, a difference in aspect ratio between like characters causes pattern mismatches, so that like characters tend to be recognized as different characters, while different characters that have the same shape but are written at different locations tend to be recognized as the same character type.
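The conventional method can be sketched roughly as follows. Only the placement of the scaled character in the result buffer is computed, since that is enough to show both failure modes. The function name, the 16x16 buffer and the threshold of 8 are hypothetical values chosen for illustration. The text does not say how characters at or below the threshold are handled, so the sketch assumes they are left unscaled and centered.

```python
def conventional_placement(width, height, buffer_size=16, threshold=8):
    """Return (new_width, new_height, x_offset, y_offset) in the result buffer.

    The longer axis is scaled to buffer_size, the other axis is scaled by the
    same factor (aspect ratio kept) and centered. Integer arithmetic with
    round-half-up is used to avoid floating-point surprises.
    """
    longer = max(width, height)
    if longer <= threshold:
        # Assumption: small characters are left unscaled (not stated in the text).
        new_w, new_h = width, height
    else:
        new_w = max(1, (width * buffer_size + longer // 2) // longer)
        new_h = max(1, (height * buffer_size + longer // 2) // longer)
    x_off = (buffer_size - new_w) // 2
    y_off = (buffer_size - new_h) // 2
    return new_w, new_h, x_off, y_off


# The same letter written thin (20x40) and bold (30x40) ends up with
# different widths in the buffer, so the two patterns do not overlap well.
print(conventional_placement(20, 40))  # (8, 16, 4, 0)
print(conventional_placement(30, 40))  # (12, 16, 2, 0)

# A 10x2 horizontal bar is centered identically whether it was written at
# mid-height (hyphen) or on the baseline (underscore).
print(conventional_placement(10, 2))   # (16, 3, 0, 6)
```

The placement depends only on the character's own width and height, so the original vertical position on the line is lost, which is why a hyphen and an underscore produce the same normalized pattern.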
In order to recognize characters ranging from normal sizes to small sizes, a "character recognition system" was proposed in Japanese Patent No. 1817562 owned by International Business Machines Corporation. A portion of that system is illustrated by the Figures in this application.