The present invention relates to an optical character recognition apparatus which optically scans characters, such as letters and marks, written on a mail article and performs character recognition based on the scanned signal.
In an optical character recognition apparatus, a character is recognized by optically scanning it and producing a binary signal by slicing the scanned signal at a threshold level. Distinctive features of the character are then extracted from the binary signal and compared with reference features. In the binary signal processing, the setting of the threshold level is optimized to improve the recognition rate. To the same end, filtering processes such as thinning, gradating and emphasizing are also applied to produce an effective binary signal.
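The slicing step described above can be illustrated with a minimal sketch. It is not taken from any particular apparatus: the function name `binarize`, the 0 to 255 darkness scale (higher meaning darker ink) and the sample values are all assumptions made for illustration.

```python
def binarize(scanned, threshold):
    """Slice a scanned signal at a threshold level.

    `scanned` is a list of rows of darkness values (assumed scale:
    0 = white paper, 255 = solid ink). Each value at or above
    `threshold` becomes 1 (character), anything below becomes 0.
    """
    return [[1 if value >= threshold else 0 for value in row]
            for row in scanned]


# Hypothetical scan: one strongly printed stroke (200) and one
# weakly printed stroke (90) on a light background (30).
scan = [
    [30, 200, 30],
    [30, 90, 30],
]

binary_high = binarize(scan, 128)  # keeps only the strong stroke
binary_low = binarize(scan, 80)    # keeps the weak stroke as well
```

In this sketch the binary signal depends entirely on where the threshold level is placed, which is why the setting of that level is the subject of optimization.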
In the case of a mail article as the object to be read, however, characters are written or printed under widely varying conditions, for example, on backgrounds of various brightness and color and with print density ranging from strong to weak. In addition, when the paper constituting the mail article is thin, other characters and marks printed in the interior are sometimes picked up by the scanner through the thin paper. It is therefore difficult to set an optimum threshold level and to perform an optimum filtering process regardless of the object being read, and thus the recognition rate cannot be sufficiently improved. Namely, when the threshold level and the filtering process are set so as to read a weakly printed or written character reliably, for instance, the recognition rate falls because patterns other than the character are picked up and become noise in the binary signal. Conversely, when the threshold level and the filtering process are set so as not to extract patterns other than the character, a weakly printed or written character cannot be recognized. As stated above, it has heretofore been difficult to increase the recognition rate sufficiently when mail articles having various printing conditions are scanned.
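The trade-off described above can be made concrete with a small sketch. All darkness values below are hypothetical and chosen only so that the show-through pattern is slightly darker than the weakly printed character; the names `slice_signal` and `evaluate` are likewise illustrative assumptions.

```python
# Hypothetical darkness values (assumed scale: 0 = white, 255 = solid ink).
STRONG_CHAR = 200   # strongly printed character
WEAK_CHAR = 90      # weakly printed or written character
SHOW_THROUGH = 100  # interior pattern seen through thin paper
BACKGROUND = 30     # envelope background


def slice_signal(values, threshold):
    """Return 1 for each value at or above the threshold level, else 0."""
    return [1 if v >= threshold else 0 for v in values]


def evaluate(threshold):
    """Report what a single fixed threshold level picks up."""
    strong, weak, noise, background = slice_signal(
        [STRONG_CHAR, WEAK_CHAR, SHOW_THROUGH, BACKGROUND], threshold)
    return {
        "strong_read": bool(strong),
        "weak_read": bool(weak),
        "noise_picked_up": bool(noise or background),
    }


low = evaluate(80)    # weak character is read, but so is the show-through
high = evaluate(150)  # show-through is rejected, but so is the weak character

# Search every possible level for one that reads the weak character
# without picking up any non-character pattern.
workable = [t for t in range(256)
            if evaluate(t)["weak_read"] and not evaluate(t)["noise_picked_up"]]
```

With these assumed values, `workable` is empty: no single fixed threshold level both reads the weak character and rejects the show-through pattern, which is the difficulty the passage describes.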