1. Field of the Invention
The present invention relates to a character reader for entering characters into a word processor or the like, and particularly to an optical character reader (hereinafter referred to as an OCR) for optically reading and entering characters written on paper or the like.
2. Description of the Related Art
As a means of entering characters into a word processor or the like, an OCR is used in some cases to read and enter characters written on paper or the like automatically, without typing on a keyboard. Input of characters by means of an OCR involves the steps of reading the characters on the paper or the like with a scanner to convert them into image data, analyzing the layout of the image data to discern the character portions, and recognizing the characters by pattern recognition. In entering characters by means of an OCR, however, if the image input portion of the scanner is tilted relative to the surface of the paper on which the characters are written at the time of reading, the readout image is tilted as well. Accordingly, there have been problems in that the characters cannot be correctly discerned by the layout analysis, and in that recognition performance decreases because recognition processing is performed on tilted characters.
As countermeasures against these problems, there have been various conventional techniques that improve recognition performance by detecting the tilt of the characters on the paper and correcting the detected tilt. As one conventional technique of this kind for use in an OCR, for example, "A Character String Direction Discrimination Device" (Article 1) is disclosed in Japanese Patent Laid-Open No. 61-160180. The character reader described in that publication comprises a photoelectric converter for photoelectrically converting characters from analog data into digital data and delivering quantized signals, an image data storing unit for storing the delivered quantized signals as image data, a marginal distribution creating unit for obtaining histograms by projecting one or more regions of the stored image in several directions and accumulating the density, and a character direction judging unit for judging the direction of the character string on the basis of the created histograms. The character reader obtains histograms of black pixels by projecting the character string in several directions and finds the direction in which the histogram is sharpest, which is recognized as the tilt of the character string.
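By way of illustration only, the projection-based tilt detection described for Article 1 might be sketched in Python as follows. The function names, the grid of candidate angles, the synthetic test image, and in particular the sharpness measure (the sum of squared bin counts) are assumptions introduced here; the cited publication is not stated to use them.

```python
import numpy as np


def estimate_tilt_by_projection(image, candidate_angles_deg):
    """Return the candidate angle (degrees) whose projection histogram is sharpest.

    image: 2-D array in which nonzero entries are black pixels (assumed convention).
    """
    rows, cols = np.nonzero(image)
    if rows.size == 0:
        raise ValueError("image contains no black pixels")
    best_angle, best_score = None, -1.0
    for angle in candidate_angles_deg:
        theta = np.deg2rad(angle)
        # Signed distance of every black pixel from a line through the origin
        # tilted by theta. Pixels of one text line share nearly the same value
        # when theta matches the tilt of that line.
        offsets = rows * np.cos(theta) - cols * np.sin(theta)
        bins = np.round(offsets).astype(np.int64)
        histogram = np.bincount(bins - bins.min())
        # Sharpness measure (an assumption): sum of squared bin counts.
        score = float(np.sum(histogram.astype(np.float64) ** 2))
        if score > best_score:
            best_angle, best_score = float(angle), score
    return best_angle


def make_tilted_lines(tilt_deg, height=200, width=300, spacing=30, thickness=3):
    """Draw synthetic 'text lines' as tilted bars (hypothetical test data only)."""
    image = np.zeros((height, width), dtype=np.uint8)
    cols = np.arange(width)
    slope = np.tan(np.deg2rad(tilt_deg))
    for top in range(20, height - 40, spacing):
        for t in range(thickness):
            rows = np.round(top + t + slope * cols).astype(np.int64)
            inside = (rows >= 0) & (rows < height)
            image[rows[inside], cols[inside]] = 1
    return image


if __name__ == "__main__":
    page = make_tilted_lines(tilt_deg=3.0)
    print(estimate_tilt_by_projection(page, np.arange(-5.0, 5.5, 0.5)))
```

In this sketch every black pixel is visited once for every candidate angle, so the work grows with the number of black pixels multiplied by the number of directions tried.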
As another conventional technique, "A Character Reader" (Article 2) is disclosed in Japanese Unexamined Patent Publication (Kokai) No. Heisei 2-116987. The character reader described in that publication comprises an input means for entering image data, an extracting means for extracting a character string from the entered image data, a character discerning means for discerning each character from the extracted character string, and a reference line extracting means. The reference line extracting means assumes a straight line passing through a specified position, such as the lower end, of the circumscribed rectangle of each discerned character, obtains a histogram in the parameter space of the set of parameters defining this straight line, and regards the tilt of the straight line defined by the set of parameters having the maximum frequency in the histogram as the tilt of the character string. The character reader first scans the whole of the image data to be entered and roughly discerns the characters, then obtains the histogram in the parameter space by assuming the above straight line, and recognizes the tilt of the character string on the basis of the obtained histogram.
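By way of illustration only, the reference line extraction described for Article 2 might be sketched in Python as a vote in a parameter space, in the manner of a Hough transform. The function names, the choice of the middle of the lower edge as the specified position, the (angle, rho) parameterization, the cell size `rho_step`, and the synthetic rectangles are assumptions introduced here and are not taken from the cited publication.

```python
import numpy as np


def estimate_tilt_from_boxes(boxes, candidate_angles_deg, rho_step=1.0):
    """Return the candidate angle (degrees) whose best cell collects the most votes.

    boxes: iterable of (left, top, right, bottom) circumscribed rectangles,
    with y increasing downward (assumed image convention).
    """
    # Reference point of each character: the middle of its lower edge.
    points = np.array([((left + right) / 2.0, bottom)
                       for left, top, right, bottom in boxes], dtype=np.float64)
    if points.size == 0:
        raise ValueError("no character rectangles given")
    xs, ys = points[:, 0], points[:, 1]
    best_angle, best_votes = None, -1
    for angle in candidate_angles_deg:
        theta = np.deg2rad(angle)
        # A straight line tilted by theta satisfies rho = y*cos(theta) - x*sin(theta).
        rho = ys * np.cos(theta) - xs * np.sin(theta)
        cells = np.round(rho / rho_step).astype(np.int64)
        # Frequency of the fullest (angle, rho) cell for this angle.
        votes = int(np.bincount(cells - cells.min()).max())
        if votes > best_votes:
            best_angle, best_votes = float(angle), votes
    return best_angle


def make_character_boxes(tilt_deg, count=15, pitch=20.0, size=12.0, baseline=100.0):
    """Build rectangles whose lower edges sit on a tilted baseline (hypothetical data)."""
    slope = np.tan(np.deg2rad(tilt_deg))
    boxes = []
    for i in range(count):
        left = 10.0 + i * pitch
        center = left + size / 2.0
        bottom = baseline + slope * center
        boxes.append((left, bottom - size, left + size, bottom))
    return boxes


if __name__ == "__main__":
    boxes = make_character_boxes(tilt_deg=2.0)
    print(estimate_tilt_from_boxes(boxes, np.arange(-5.0, 5.5, 0.5)))
```

In this sketch each character rectangle casts one vote per candidate angle, so the work grows in proportion to the number of characters. When several angles tie, the first one encountered is kept, which is a simplification.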
As described above, the conventional character readers have a drawback in that processing takes a long time, because recognizing the tilt of the character string in order to improve the accuracy of character recognition requires a huge amount of processing.
More specifically, the conventional character reader described in Article 1 obtains histograms of black pixels by projection from several directions over all the black pixels in order to improve accuracy, with the result that the amount of processing becomes huge.
Additionally, in the conventional character reader described in Article 2, the amount of processing for obtaining the sets of parameters defining the straight lines which pass through a specified position of the circumscribed rectangle of each character is proportional to the number of characters, so that the amount of processing becomes enormous when there are many characters.