1. Technical Field
The invention relates to optical character recognition (OCR) systems and, in particular, to image pre-processing with template matching to enhance or strengthen weak or poor-quality images of printed or handwritten characters.
2. Background Art
In optical character recognition systems, scanning a document at a scanner resolution of about 400 dots per inch typically yields bit maps of character images which have consistently strong character strokes at least two pixels wide. Such bit map images are readily processed by the optical character recognition system. The document may be scanned more quickly if the scanning resolution is reduced to 200 dots per inch. However, it has been found that at this reduced resolution the quality of the character image bit map suffers, because many character strokes are only one pixel wide. Such thin character strokes in the presence of thicker character strokes often prevent the optical character recognition system from recognizing the character image.
In the invention, poor quality character image bit maps, such as those obtained by scanning with a resolution of only 200 dots per inch, are pre-processed using template matching to detect and thicken any weak or thin character strokes. The enhanced character image bit maps thus produced by the pre-processor of the invention are readily recognized by the optical character recognition system.
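The idea of detecting a thin stroke with a template and thickening it can be illustrated with a minimal sketch. It is not the template set of the invention, which this section does not specify. It assumes a bit map stored as a list of rows of 0 (white) and 1 (black) pixels, and a single hypothetical 3x3 template: a black center column flanked by white columns, which marks a one-pixel-wide vertical stroke. The function name and the template are assumptions made purely for illustration.

```python
def thicken_thin_vertical_strokes(bitmap):
    """Return a copy of `bitmap` in which each pixel matching a hypothetical
    one-pixel-wide vertical-stroke template gains a black neighbor on each side.

    `bitmap` is a list of equal-length rows of 0 (white) / 1 (black) values.
    Border pixels are never used as match centers in this simplified sketch.
    """
    rows = len(bitmap)
    cols = len(bitmap[0]) if rows else 0
    out = [row[:] for row in bitmap]  # write to a copy so matches use original pixels only

    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Assumed template: center column black over three rows ...
            column_black = all(bitmap[r + dr][c] == 1 for dr in (-1, 0, 1))
            # ... and both neighboring columns white over the same rows.
            flanks_white = all(
                bitmap[r + dr][c + dc] == 0
                for dr in (-1, 0, 1)
                for dc in (-1, 1)
            )
            if column_black and flanks_white:
                # Add one pixel on each side, so the thickened stroke stays
                # centered on the original one-pixel stroke.
                out[r][c - 1] = 1
                out[r][c + 1] = 1
    return out


thin = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
thick = thicken_thin_vertical_strokes(thin)
```

In this sketch a stroke that is already two pixels wide does not match the template and is left unchanged, so only weak strokes are altered. Horizontal and diagonal strokes would need further templates of the same kind.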
Pre-processing a document image with template matching for optical character recognition is well known in the art. For example, U.S. Pat. No. 3,624,606 (Lefevre) discloses the concept of matching the character bit map with a set of templates, each template containing a stripe corresponding to a horizontal line image, a vertical line image, a major diagonal line image and a minor diagonal line image. However, this reference does not disclose thickening a matching character stroke in the image. Instead, it teaches matching both black and white striped templates with the image for the purpose of lengthening any matching stripe in the bit map image, be it black or white (i.e., whether it is a character stroke or a space between adjacent character strokes). Thus, this reference appears to have nothing to do with the thickening or strengthening of character strokes.
U.S. Pat. No. 4,791,679 (Barski et al.) discloses an optical character recognition system which strengthens the strokes of individual characters as needed by grouping the pixels in a grid of squares and converting various ones of the squares or diagonal halves thereof to all binary ones in accordance with a set of rules. The rules essentially respond to a correlation between each kernel of the image and vertical, horizontal or diagonal lines. The disadvantage of this approach with respect to that of the present invention is that the placement of the grid in the image is arbitrary, so that the strengthened strokes will not necessarily be centered with respect to original character strokes. In contrast, the present invention automatically centers each strengthened character stroke with respect to the original character stroke.
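The centering argument can be made concrete with a toy one-dimensional model. This is not the rule set of the Barski et al. reference. It assumes a grid of two-pixel-wide cells anchored at column 0, and compares filling the cell that happens to contain a thin stroke with adding pixels symmetrically about the stroke. All function names and the cell size are hypothetical.

```python
def grid_fill_columns(stroke_col, cell=2):
    """Columns turned black when the fixed grid cell containing `stroke_col` is filled."""
    start = (stroke_col // cell) * cell  # left edge of the cell; depends only on grid placement
    return list(range(start, start + cell))


def centered_columns(stroke_col, half_width=1):
    """Columns turned black when pixels are added symmetrically about `stroke_col`."""
    return list(range(stroke_col - half_width, stroke_col + half_width + 1))


def center(cols):
    """Mean column position of a thickened stroke."""
    return sum(cols) / len(cols)


# A thin stroke at column 4 and one at column 5 fall in the same grid cell.
grid_center_4 = center(grid_fill_columns(4))   # cell covers columns 4 and 5
grid_center_5 = center(grid_fill_columns(5))   # same cell, same result
sym_center_5 = center(centered_columns(5))     # columns 4, 5, 6
```

Under these assumptions the grid-filled stroke is centered at column 4.5 whether the original stroke lay at column 4 or column 5, so it is displaced by half a pixel in either case, while the symmetric result keeps the original center.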
U.S. Pat. No. 4,124,870 (Schatz et al.) and U.S. Pat. No. 3,573,789 (Sharp) both disclose systems for increasing the resolution of an image by techniques related to interpolation between the pixels of the original coarse resolution image to generate the additional pixels required for the processed fine resolution image.
In summary, there appears to be no technique in the art for strengthening (that is, thickening) a weak character stroke in such a manner that the thickened character stroke is automatically centered with respect to the original weak character stroke. In fact, it would appear that the known techniques are liable to replace a weak stroke with a thickened stroke which, at least in some cases, is slightly off-center (laterally displaced) with respect to other strokes in the image which did not need strengthening and which were therefore unchanged.
It is therefore an object of the invention to strengthen a weak character stroke in a bit-map image of an unknown character so that the resulting thicker character stroke is automatically centered with respect to the original weak character stroke.