This invention relates to a method for character reading which permits character recognition to be carried out continuously without necessitating any segmentation of characters being read and to an apparatus used for practicing this method.
In the art of character recognition, a good many methods have been proposed, and they share a common processing operation: segmenting a multiplicity of characters, manually written or mechanically printed in a string, into individual characters. Generally, the reading of characters presupposes a preprocessing step which comprises positioning the characters to be read and segmenting the string of characters into individual characters in preparation for character recognition.
In the established art of optical character recognition (hereinafter abbreviated as OCR), the operation of character segmentation is accepted as an indispensable step of processing. For this reason, typical OCR forms used for recognition of manually written characters invariably have frames printed in fixed positions, and their users are expected to heed the rule that the characters written on the forms should be entirely contained within the respective frames. Even for printed characters, conventional OCR specifies the shapes and sizes of the characters and symbols used in printing and also a printing pitch, such as 10 characters per inch.
The only method so far developed to provide the operation of character recognition without necessitating the aforementioned preprocessing of character segmentation is that which makes use of standard masks prepared for all the character categories and effects the desired character recognition by continuously correlating (matching) the input character, as it is successively brought into a fixed character frame, with the standard masks. This method necessitates the calculation of a two-dimensional correlation each time one input character is brought into the character frame and requires the two-dimensional correlation to be carried out with respect to all the character categories on each character being correlated. In view of the volume of calculation involved in the correlation and the speed of calculation required, the actual practice of this method would entail a huge operation. Even when the input characters are printed ones, characters of the same size and category are not totally free of pattern variation due to differences in the font used. To absorb this particular pattern variation, the number of standard masks prepared for each character category must be amply increased, requiring a proportional increase in the work of mask matching. Thus, many difficult problems stand in the way of practical use of this method.
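The volume of calculation described above can be illustrated by a minimal sketch, given here only as an assumption-laden example and not as any particular prior-art implementation. The function names `correlate` and `sliding_mask_match`, the binary pixel representation, and the use of a simple count of agreeing pixels as the correlation measure are all hypothetical choices made for illustration.

```python
def correlate(window, mask):
    """Count agreeing pixels between a binary window and a same-sized binary mask.

    Hypothetical correlation measure chosen for illustration only.
    """
    return sum(
        1
        for w_row, m_row in zip(window, mask)
        for w, m in zip(w_row, m_row)
        if w == m
    )


def sliding_mask_match(line_image, masks, frame_width):
    """Slide a fixed frame across a binary line image and score every mask at every shift.

    line_image: list of rows, each a list of 0/1 pixels.
    masks: dict mapping a category label to a list of frame-sized 0/1 masks
           (several masks per category would be needed to absorb font variation).
    Returns (best_labels, pixel_comparisons); best_labels[i] is the best label at shift i.
    """
    height = len(line_image)
    width = len(line_image[0])
    best_labels = []
    comparisons = 0
    for x in range(width - frame_width + 1):
        # The full two-dimensional window must be available at every shift.
        window = [row[x:x + frame_width] for row in line_image]
        best_label, best_score = None, -1
        for label, variants in masks.items():
            for mask in variants:
                score = correlate(window, mask)
                comparisons += height * frame_width
                if score > best_score:
                    best_label, best_score = label, score
        best_labels.append(best_label)
    return best_labels, comparisons
```

In this sketch the number of pixel comparisons equals the number of frame positions multiplied by the total number of masks and by the number of pixels in the frame, so that each additional mask per category adds a proportional amount of work at every frame position.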
Even if there exists a method which obviates the aforementioned simple principle of correlation and permits exact extraction of important geometrical features indicative of character shapes, the common underlying problems of OCR remain yet to be solved because characters given in a string are not always segmented with required accuracy.
In any event, the extraction of features of characters is desired to be effected by a simple method which enjoys as much freedom from the segmentation problem as possible. This leads to the conclusion that additive feature extraction is ideal for the purpose. Unlike the method which effects the character recognition by the steps of segmenting a given character pattern, extracting the entire real image of that character in the form of geometrical features and matching the features with standard masks throughout the entire area involved, the additive feature extraction represents a method which, by means of one scanning column for example, analyzes character patterns being continuously passed through that scanning column and accumulates and updates the resultant character information for each of the columns. The two methods are now compared briefly from the standpoint of processing devices. The former method, which resorts to extraction of features in the entire area, requires a memory capable of storing the character information from all the scanning columns involved, whereas the latter method functions effectively with an accumulating counter capable of storing the character information from only one scanning column. Thus, the latter method entails decisively less redundancy than the former method. In addition, since the former method relies for its operation upon matching of entire characters at one time, the probability of recognition being obstructed by a slight deformation, defect, noise, etc. in a given character is greater for this method than for the latter method.
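The difference in storage between the two methods can be illustrated by a minimal sketch of column-by-column accumulation. The sketch is given only by way of example: the function name `accumulate_columns` and the particular feature accumulated (the length of the current horizontal run of black pixels in each row) are assumptions made for illustration and are not features specified in this description.

```python
def accumulate_columns(columns):
    """Yield the per-row run-length accumulator after each scanned column.

    columns: iterable of scanning columns, each a list of 0/1 pixels (top to bottom).
    Only one counter per row is kept; no earlier column is stored.
    The run-length feature is a hypothetical example of accumulated information.
    """
    counters = None
    for column in columns:
        if counters is None:
            counters = [0] * len(column)
        for row, pixel in enumerate(column):
            # Extend the run on a black pixel, reset it on a white pixel.
            counters[row] = counters[row] + 1 if pixel else 0
        # Emit a copy so that callers cannot alter the running state.
        yield list(counters)
```

In this sketch the storage required is one counter per row of the scanning column regardless of how many columns have passed, whereas matching over the entire area would require the pixels of every column within the character frame to be retained.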
An object of this invention is to provide a method for character reading capable of effecting extraction of character features continuously at a high speed with high reliability without necessitating any processing for character segmentation, and an apparatus for practicing the method.