Text extraction is an important step in many applications, such as Optical Character Recognition (OCR), text-based video retrieval, and document image compression. Most current techniques aim to extract text from images with a simple background. In recent years, the technique of extracting text from complex images has been required in more and more fields, such as complex document analysis and engineering drawing analysis. However, extracting text from a document image with a complex background is a very difficult problem. Many methods have been proposed by prior researchers, but most of them are effective only for simple or moderately complex images.
Current text extraction methods can be classified into two groups: color-clustering-based methods and edge-analysis-based methods.
Color-clustering-based methods assume that text has a homogeneous foreground color. However, this is not always the case, especially for small characters. For example, text characters may be printed in different colors, or images may be captured under uneven illumination. For small text, the foreground colors are rarely uniform, because the transition region between stroke and background is wide in comparison with the stroke width. Accordingly, it is hard to determine a proper global binarization threshold for the whole image, and it is therefore impossible to eliminate all of a light-colored background with non-uniform colors.
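The difficulty of relying on one global threshold can be illustrated with a minimal Otsu-style threshold computation (a NumPy sketch; the function name and the sample image values are illustrative assumptions, not from the prior art discussed here):

```python
import numpy as np

def otsu_threshold(gray):
    """Return the single global threshold that maximizes the
    between-class variance of an 8-bit grayscale image (Otsu's method)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                    # class-0 probability up to each level
    mu = np.cumsum(prob * np.arange(256))      # cumulative mean
    mu_t = mu[-1]                              # global mean
    with np.errstate(divide='ignore', invalid='ignore'):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    sigma_b = np.nan_to_num(sigma_b)           # undefined at the histogram ends
    return int(np.argmax(sigma_b))

# Illustrative bimodal image: dark strokes (40) on a light background (200).
img = np.zeros((10, 10), dtype=np.uint8)
img[:, :5] = 40
img[:, 5:] = 200
t = otsu_threshold(img)
```

The single value `t` falls between the two modes, so any anti-aliased stroke pixel or light background patch whose gray level lies near `t` is classified arbitrarily; with a non-uniform background there is no one value that separates all text from all background.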
On the other hand, edge-analysis-based methods assume that text has strong contrast with the background. In images with a complex background, however, non-text objects may also have strong contrast with the background, which causes text edges and non-text edges to touch each other after edge detection. This often makes the subsequent edge analysis difficult or unstable.
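As a sketch of the edge-detection step such methods rely on, the following NumPy code produces a binary edge map by thresholding the Sobel gradient magnitude (the function name and the threshold value are illustrative assumptions):

```python
import numpy as np

def sobel_edges(gray, thresh=100):
    """Binary edge map: True where the Sobel gradient magnitude
    exceeds `thresh`. Border pixels are left unset for simplicity."""
    g = gray.astype(float)
    # Horizontal gradient, Sobel kernel [[-1,0,1],[-2,0,2],[-1,0,1]]
    gx = ((g[:-2, 2:] - g[:-2, :-2])
          + 2 * (g[1:-1, 2:] - g[1:-1, :-2])
          + (g[2:, 2:] - g[2:, :-2]))
    # Vertical gradient, Sobel kernel [[-1,-2,-1],[0,0,0],[1,2,1]]
    gy = ((g[2:, :-2] - g[:-2, :-2])
          + 2 * (g[2:, 1:-1] - g[:-2, 1:-1])
          + (g[2:, 2:] - g[:-2, 2:]))
    mag = np.hypot(gx, gy)
    edges = np.zeros(g.shape, dtype=bool)
    edges[1:-1, 1:-1] = mag > thresh
    return edges
```

Any high-contrast boundary, whether it belongs to a character stroke or to a background object, exceeds the threshold equally, which is why text edges and non-text edges can end up connected in the resulting map.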
For example, Japanese Patent Application Laid-open No. JP-A-2000-20714 discloses an image processing method, an image processing device, and a recording medium storing an image processing function.
FIG. 10 shows the flow chart of the image processing method disclosed by the above Japanese Patent Application Laid-open No. JP-A-2000-20714.
To obtain a binary image having no noise that interrupts recognition even on a background image, the density image of an original image to be threshold-processed is inputted in step S101 and stored in step S102. Then, in step S103, attention is paid to a certain pixel, and it is judged whether or not the pixel is an edge of a character or a ruled line. Thereafter, in step S104, the pixel value on the binary image of each pixel judged to be an edge is determined and stored. These operations are repeated for all pixels of the original image in step S105, and all connected components of pixels other than edges are found in step S106. Then, in step S107, attention is paid to the pixels that are in contact with the periphery of a certain connected component and whose pixel values have already been determined, and the numbers of black pixels and white pixels among them are counted. The numbers of black and white pixels are compared in step S108; when the number of black pixels is larger, the whole connected component is registered as black pixels in step S110, and otherwise the whole connected component is registered as white pixels in step S109. This operation is repeated for all connected components in step S111, and finally a binary image is generated in step S112 and outputted in step S113.
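Under simplifying assumptions (the edge pixels and their binary values from steps S103-S104 are already given as input arrays, 4-connectivity is used, and a tie in the vote falls to white), steps S106-S112 can be sketched in Python as a majority vote over each non-edge connected component; the function name is hypothetical:

```python
import numpy as np
from collections import deque

def vote_components(edge_mask, edge_is_black):
    """Sketch of steps S106-S112: find each connected component of
    non-edge pixels (4-connectivity) and register it wholly as black or
    white by a majority vote of the already-determined edge pixels in
    contact with its periphery. Returns 1 = black, 0 = white."""
    h, w = edge_mask.shape
    out = np.where(edge_mask & edge_is_black, 1, 0)   # S104: edge values fixed
    visited = edge_mask.copy()
    for sy in range(h):
        for sx in range(w):
            if visited[sy, sx]:
                continue
            comp, peri = [], set()                    # S106: one component
            q = deque([(sy, sx)])
            visited[sy, sx] = True
            while q:
                y, x = q.popleft()
                comp.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        if edge_mask[ny, nx]:
                            peri.add((ny, nx))        # S107: periphery pixels
                        elif not visited[ny, nx]:
                            visited[ny, nx] = True
                            q.append((ny, nx))
            black = sum(1 for p in peri if out[p] == 1)
            val = 1 if black > len(peri) - black else 0   # S108-S110: vote
            for y, x in comp:
                out[y, x] = val
    return out                                        # S112: binary image
```

Because the vote assigns one value to an entire component, everything enclosed by a single long connected region is decided as a unit, regardless of what that region actually contains.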
According to the above-described method, long lines formed by connected components appearing in the background can be recognized and removed from the binarized edge map. However, in the binarized edge map, a tightly spaced text row may also form a long connected component. In this case, it is not easy to separate the text from such a text row, and the whole text row may be deemed background and discarded by the above-disclosed method, even though the text row is exactly what is desired and should not simply be removed. Therefore, if a scanned document image with a complex background is binarized and processed according to the above-mentioned prior art, useful text may be lost.