1. Field of the Invention
The present invention relates to a method, an apparatus and a storage medium for enhancing a document image, and to a method, an apparatus and a storage medium for character recognition using the same.
2. Related Art
Optical character recognition (OCR) is a well-known technique for recognizing hand-written or scanned characters.
As shown in FIG. 1, to implement character recognition of a document image 102, a step of block segmentation 104 is carried out to separate the smallest region containing all the characters from the whole binary document image 102. In other words, the step of block segmentation 104 removes the margin of the document image 102. The resultant block image, which generally is a rectangular region, is further processed by a step of line segmentation 108, so that each character line is extracted in the form of a line image. Then each line image is subjected to a step of character segmentation 112, and a character image is extracted for each character to be recognized. The last step is a step of single-character recognition 116 based on each character image, and the recognition result 118 is output to, for example, a text processing application or the like.
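The segmentation steps of FIG. 1 can be illustrated with a minimal sketch. It assumes projection profiles (counting black pixels per row or per column), which is one common way to find lines and characters; the text above does not state which method a given OCR engine uses. The names `block_segment`, `line_segment`, `char_segment` and `runs` are hypothetical, and the image is assumed to be a list of rows in which 1 is a black pixel and 0 is a white pixel.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed encoding: 1 = black (ink), 0 = white


def block_segment(img: Image) -> Image:
    """Crop to the smallest rectangle containing all black pixels (margin removal)."""
    rows = [r for r, row in enumerate(img) if any(row)]
    if not rows:
        return []  # blank page: nothing to segment
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]
    return [row[left:right + 1] for row in img[top:bottom + 1]]


def runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index ranges where the profile is nonzero."""
    out, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            out.append((start, i))
            start = None
    if start is not None:
        out.append((start, len(profile)))
    return out


def line_segment(block: Image) -> List[Image]:
    """Split a block into line images at rows that contain no black pixel."""
    profile = [sum(row) for row in block]
    return [block[a:b] for a, b in runs(profile)]


def char_segment(line: Image) -> List[Image]:
    """Split a line image into character images at columns with no black pixel."""
    profile = [sum(row[c] for row in line) for c in range(len(line[0]))]
    return [[row[a:b] for row in line] for a, b in runs(profile)]


# Tiny example page: two text lines, each holding two "characters".
page = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
block = block_segment(page)
lines = line_segment(block)
chars = [char_segment(line) for line in lines]
```

Because every step relies on finding rows or columns that are completely white, a single stray black pixel in a gap is enough to merge two lines or two characters in this sketch, which is why background noise is harmful to segmentation.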
When a scanned document image is recognized, the recognition results of current OCR products are satisfactory if the image quality is high. However, if the quality of the document image is imperfect or even very poor, the recognition ratio decreases sharply.
For example, a conventional OCR engine cannot recognize a color or gray-scale image very well. This is because OCR is based on binary image recognition. For the purpose of scanning, storing and recognizing a color or gray-scale original in the format of a binary image, the half-tone image technique has been developed. In a half-tone image, one “pixel” consists of a small binary image, so that different colors or gray levels can be simulated. The so-called “pixel” actually includes an array of binary pixels and corresponds to a small area having a certain color or gray level in the original. For that reason, the quality of a half-tone document image is much lower than that of a normal binary document image or of the original.
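How one half-tone “pixel” simulates a gray level with an array of binary pixels can be sketched as follows. The text above does not specify a half-toning method, so this sketch assumes ordered dithering with a 4x4 Bayer threshold matrix, which is only one of several possible techniques. The names `BAYER_4` and `halftone_cell`, and the convention that gray level 0 is black and 255 is white, are assumptions made for illustration.

```python
from typing import List

# Standard 4x4 Bayer threshold matrix (values 0..15); an assumed choice.
BAYER_4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]


def halftone_cell(gray: int) -> List[List[int]]:
    """Render one gray level (0 = black, 255 = white) as a 4x4 binary cell.

    In the returned cell, 1 is a black pixel and 0 is a white pixel.
    The darker the gray level, the more thresholds fall below `darkness`,
    and the more pixels in the cell are set to black.
    """
    darkness = (255 - gray) * 16 / 255  # ranges from 0 (white) to 16 (black)
    return [[1 if threshold < darkness else 0 for threshold in row]
            for row in BAYER_4]


def black_count(cell: List[List[int]]) -> int:
    """Count the black pixels in a half-tone cell."""
    return sum(sum(row) for row in cell)


mid_gray = halftone_cell(128)
```

Under these assumptions a pure black area yields a cell with all 16 pixels black and a pure white area yields none, while a mid-gray area (level 128) yields 8 black and 8 white pixels. Any area that is not pure black therefore contains white pixels after half-toning, and any area that is not pure white contains black ones.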
That is to say, a conventional OCR engine cannot recognize a color or gray-scale image very well because it cannot recognize a half-tone image very well. The specific reasons are as follows.
When the original has a background of a certain color or gray level, the half-tone document image to be recognized contains much background noise caused by said color or gray level, as shown in FIG. 2.
As to the characters, if they are not black in the original, then the pixels of the character strokes in the half-tone document image will not all be black; some white pixels will appear. The strokes will then look broken (as shown in FIG. 3), hollowed (as shown in FIG. 4) or zigzag-contoured (as shown in FIG. 5), depending on the conditions.
Obviously, broken, hollowed and zigzag strokes will strongly distort the features extracted from character images. A conventional OCR algorithm cannot distinguish the different defects of half-tone document images described above and cannot perform the corresponding restoration; consequently, the recognition ratio is very low.
The noise will also greatly affect block segmentation, line segmentation, character segmentation and single-character recognition. If noise reduction is carried out, the broken, hollowed and zigzag strokes become much more severe. Under such conditions, a conventional OCR algorithm cannot even perform line segmentation correctly. This is because a conventional OCR algorithm is directed to normal document images, which have much less noise, so the noise reduction carried out in a conventional OCR algorithm is very mild. Even for a normal document image, if strong noise reduction is carried out, the strokes will be affected and the recognition ratio will decrease.
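The trade-off between noise removal and stroke damage can be illustrated with a small sketch. It assumes a simple neighbor-count filter that deletes a black pixel when fewer than `min_neighbors` of its eight neighbors are black; this filter, the function name `denoise` and the parameter `min_neighbors` are assumptions chosen for illustration, not the noise reduction used by any particular OCR algorithm.

```python
from typing import List

Image = List[List[int]]  # assumed encoding: 1 = black (ink), 0 = white


def denoise(img: Image, min_neighbors: int) -> Image:
    """Remove black pixels that have fewer than `min_neighbors` black 8-neighbors.

    A larger `min_neighbors` means stronger noise reduction.
    Neighbor counts are taken from the input image, not the partial output.
    """
    height, width = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(height):
        for x in range(width):
            if not img[y][x]:
                continue
            neighbors = sum(
                img[yy][xx]
                for yy in range(max(0, y - 1), min(height, y + 2))
                for xx in range(max(0, x - 1), min(width, x + 2))
                if (yy, xx) != (y, x)
            )
            if neighbors < min_neighbors:
                out[y][x] = 0
    return out


def ink(img: Image) -> int:
    """Count the black pixels in an image."""
    return sum(sum(row) for row in img)


# A thin vertical stroke in column 1, already broken at row 2 by half-toning,
# plus one isolated noise pixel at row 3, column 4.
sample = [
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 1, 0, 0, 0],
]
mild = denoise(sample, min_neighbors=1)
strong = denoise(sample, min_neighbors=2)
```

In this example the mild setting removes only the isolated noise pixel and keeps the four stroke pixels, whereas the strong setting also deletes every stroke pixel, because each one has only a single black neighbor once the stroke is broken.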
In addition, there are other applications, such as copying apparatuses, that need to enhance document images, for example those obtained from originals that are not black-and-white.