1. Field of the Invention
This invention relates to a method and apparatus for binarizing scanned document images, and in particular, it relates to a method and apparatus for binarizing scanned document images that contain gray or light colored text printed with halftone patterns.
2. Description of Related Art
With the development of computer technology and the Internet, electronic documents are becoming more and more popular because of their advantages over paper based documents, such as easy storage, easy search and retrieval, fast transmission, and environmental friendliness. Paper based documents dominated for a long time, and a large number of them have been generated over the years. A paper based document can be converted to an electronic document using a scanner. For documents that contain text, it is further desirable to convert the scanned document images into text for text searching and other purposes.
Automatic document analysis systems have been developed to convert scanned document images into searchable electronic documents. Such a system typically includes three major components, namely a binarization component, a segmentation component, and an optical character recognition (OCR) component. The first component, binarization, separates the foreground (text, picture, line drawing, etc.) from the background. It converts a color or gray-scale image into a binary image where each pixel has a value of zero or one. Binarization is an important step because the subsequent segmentation and recognition components rely on high quality binarized images. Good binarization results can not only decrease the computational load and simplify the subsequent analysis, but also improve the overall performance of the automatic document analysis system.
In conventional methods, binarization is typically performed either globally or locally. Global binarization methods use one calculated threshold value for the entire scanned image to convert multi-bit pixel values into binary pixel values. Pixel values above the threshold value are converted to 1 (or 0) and pixel values below the threshold value are converted to 0 (or 1). Local binarization methods use adapted statistical values calculated from local areas as threshold values for binarization of the local areas.
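The difference between the two conventional approaches can be illustrated with a minimal Python sketch. It is not taken from any of the cited methods: the function names, the use of the local mean as the "adapted statistical value", the window size, and the offset parameter are all assumptions chosen for illustration, as is the convention that pixels above the threshold become 1.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def binarize_global(gray, threshold):
    """Global method: one threshold for the entire image.

    Pixels above the threshold become 1, all others become 0.
    """
    return (gray > threshold).astype(np.uint8)


def binarize_local(gray, window=5, offset=10.0):
    """Local method: each pixel is compared with a threshold derived
    from its own neighborhood.

    The threshold here is the mean of a window x window area minus a
    hypothetical offset; window is assumed to be odd.
    """
    pad = window // 2
    # Replicate border pixels so every pixel has a full neighborhood.
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    # One window per original pixel: shape (H, W, window, window).
    windows = sliding_window_view(padded, (window, window))
    local_mean = windows.mean(axis=(-2, -1))
    return (gray > local_mean - offset).astype(np.uint8)


# Synthetic page: background brightens from left to right (uneven
# illumination), with a single darker "text" pixel in the middle.
page = np.tile(np.linspace(100.0, 200.0, 20), (20, 1))
page[10, 10] -= 50.0

global_result = binarize_global(page, threshold=150.0)
local_result = binarize_local(page, window=5, offset=10.0)
```

In this synthetic example the single global threshold marks the darker left part of the background as 0 along with the text pixel, while the local threshold, which follows the background gradient, marks only the text pixel as 0.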
Examples of global binarization methods can be found in N. Otsu, “A Threshold Selection Method from Gray-Level Histograms,” IEEE Transactions on Systems, Man, and Cybernetics, Vol. 9, No. 1, 1979, pp. 62-66 (hereinafter “Otsu”); A. Rosenfeld, R. C. Smith, “Thresholding using Relaxation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 3, No. 5, 1981, pp. 598-606; and V. A. Shapiro, P. K. Veleva, V. S. Sgurev, “An Adaptive Method for Image Thresholding,” Proceedings of the 11th IAPR International Conference on Pattern Recognition, 1992, pp. 696-699. Examples of local binarization methods can be found in W. Niblack, “An introduction to Image Processing,” Prentice-Hall, Englewood Cliffs, 1986, pp. 115-116; J. Sauvola, M. Pietikainen, “Adaptive document image binarization,” Pattern Recognition, Vol. 33, 2000, pp. 225-236 (hereinafter “Sauvola et al.”); and I. Kim, D. Jung, R. Park, “Document image binarization based on topographic analysis using a water flow model,” Pattern Recognition, Vol. 35, 2002, pp. 265-277.