The present invention generally relates to recognition of characters and more particularly to a method for recognizing alphanumeric characters printed or hand-written on a sheet.
Conventionally, recognition of alphanumeric characters printed on a sheet is performed by extracting a feature of the characters that changes little with respect to the statistical distribution or the moment of the character. For example, Kahan et al. describe a method of character recognition using the statistical distribution of the skeleton line of a character (Kahan, S., Pavlidis, T., Baird, H. S., IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-9, pp. 274-288, March 1987). On the other hand, Cash et al. describe a method of character recognition using the moment (Cash, G. L. and Hatamian, M., Computer Vision, Graphics and Image Processing, vol. 39, pp. 291-310, 1987).
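As an illustration of a moment-based feature, the following minimal sketch computes a central moment of a binary character image. It is not taken from Cash et al.; the function name `central_moment`, the list-of-rows image representation, and the sample shapes are assumptions made for this example only. It shows one reason such features are attractive: a central moment does not change when the character is shifted on the sheet.

```python
def central_moment(image, p, q):
    """Central moment mu_pq of a binary image given as a list of rows (1 = ink).

    Hypothetical helper for illustration; assumes the image has at least one ink pixel.
    """
    m00 = sum(sum(row) for row in image)  # total ink
    # Centroid of the ink pixels
    cx = sum(x * v for row in image for x, v in enumerate(row)) / m00
    cy = sum(y * v for y, row in enumerate(image) for v in row) / m00
    # Sum of (x - cx)^p * (y - cy)^q over the ink pixels
    return sum(((x - cx) ** p) * ((y - cy) ** q) * v
               for y, row in enumerate(image)
               for x, v in enumerate(row))


# The same L-shaped blob at two different positions
shape = [[1, 1],
         [1, 0]]
shifted = [[0, 0, 0],
           [0, 1, 1],
           [0, 1, 0]]

mu20_shape = central_moment(shape, 2, 0)
mu20_shifted = central_moment(shifted, 2, 0)
```

In this sketch `mu20_shape` and `mu20_shifted` agree up to floating-point rounding, because the moment is measured relative to the centroid rather than to the image origin.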
In each of the foregoing methods, the characters read from the sheet are first segmented into individual characters, and recognition is then performed on the basis of the feature extracted from each character. In actual documents, however, adjacent characters frequently contact or overlap each other, so the foregoing methods are often ineffective for recognizing the characters on such documents. To handle such cases, empirical processes have had to be used. For example, Kahan et al. propose to treat separately the case of the so-called "serif-join", wherein the characters contact each other at the ends of their serifs, and the case of the so-called "double-join", wherein convex contours of the characters overlap each other. According to this process, when a character image that remains after the first segmentation does not match any of the characters, a perspective histogram analysis is applied to that character image in order to segment it further. However, this process is generally not successful for actual documents, and thus a serious problem remains with respect to the segmentation of alphanumeric characters.
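The histogram-based segmentation mentioned above can be sketched roughly as follows. The source does not specify how the histogram is formed or how the cut is chosen; this example assumes a column-wise count of ink pixels and a cut at the interior column with the least ink. The names `column_histogram` and `split_at_valley` and the sample image are hypothetical.

```python
def column_histogram(image):
    """Count ink pixels (value 1) in each column of a binary image (list of rows)."""
    width = len(image[0])
    return [sum(row[x] for row in image) for x in range(width)]


def split_at_valley(image):
    """Split a merged blob at the interior column containing the least ink.

    Assumes the image is at least three columns wide, so that an interior
    column exists and both halves are non-empty.
    """
    hist = column_histogram(image)
    interior = range(1, len(hist) - 1)
    cut = min(interior, key=lambda x: hist[x])  # first minimum wins on ties
    left = [row[:cut] for row in image]
    right = [row[cut:] for row in image]
    return cut, left, right


# Two solid blocks joined by a one-pixel bridge in the middle row
joined = [[1, 1, 1, 0, 1, 1, 1],
          [1, 1, 1, 1, 1, 1, 1],
          [1, 1, 1, 0, 1, 1, 1]]

hist = column_histogram(joined)
cut, left, right = split_at_valley(joined)
```

Here the histogram has a single valley at the bridge column, so the cut is unambiguous. When characters overlap rather than merely touch, no such clear valley need exist, which is consistent with the difficulty described above.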
In order to eliminate this segmentation problem, Japanese Patent Publication No. 58-47064 proposes a method wherein a character string is scanned by a slit from the left to extract a feature of the character, and a self-correlation analysis is applied to the feature thus extracted in order to recognize the character. However, this approach suffers from a problem inherent in self-correlation analysis, namely that detailed information on the character tends to be lost.
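One way in which self-correlation analysis can discard detail is illustrated by the sketch below. It is not the method of the cited publication; it assumes a one-column slit whose feature is the ink count per column, and an unnormalized self-correlation of that feature sequence. The names `slit_features` and `self_correlation` and the sample images are hypothetical.

```python
def slit_features(image):
    """Scan a binary image left to right with a one-column slit.

    The feature at each slit position is assumed to be the ink count in that column.
    """
    return [sum(row[x] for row in image) for x in range(len(image[0]))]


def self_correlation(seq, max_lag):
    """Unnormalized self-correlation r[k] = sum over i of seq[i] * seq[i + k]."""
    n = len(seq)
    return [sum(seq[i] * seq[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


# A rising stroke profile and its left-right mirror image
rising = [[1, 1, 1],
          [0, 1, 1],
          [0, 0, 1]]
falling = [row[::-1] for row in rising]

features_rising = slit_features(rising)      # [1, 2, 3]
features_falling = slit_features(falling)    # [3, 2, 1]

corr_rising = self_correlation(features_rising, 2)
corr_falling = self_correlation(features_falling, 2)
```

The two feature sequences differ, yet their self-correlations are identical, because the self-correlation of a sequence is unchanged when the sequence is reversed. Under the assumptions of this sketch, a shape and its mirror image therefore cannot be told apart from the self-correlation alone.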