Techniques are known for removing noise from digital representations of data images obtained by digitally scanning documents and the like. The scanned documents are processed to identify objects within the scanned images that are, in turn, used to mask out the noise. For example, U.S. Pat. No. 7,016,536 discloses a method and apparatus for removing noise by building objects from reduced resolution representations of the scanned image and including the identified objects in a mask that is logically ANDed with the de-skewed representation of the scanned document. Objects identified as picture objects are included in a mask and logically ANDed with the de-skewed representation to eliminate all other objects, while objects marked as data objects are added to the representation to provide a de-skewed, de-speckled representation of the scanned document.
Binarization of an image involves translating grayscale values, typically 0 to 255, into binary values, 0 or 1. A common way to accomplish this mapping is to pick a threshold such that all values below the threshold are mapped to 0 and all values above the threshold are mapped to 1. In images with little noise, the quality of the binarization is largely insensitive to the threshold value, in that a wide range of thresholds can binarize the images with satisfactory results. On the other hand, images with a lot of noise are very sensitive to the threshold value, and improved techniques are required for setting the binarization threshold for noisy images.
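By way of illustration only, the fixed-threshold mapping described above may be sketched as follows. The function name `binarize`, the sample pixel values, and the choice to map a value exactly equal to the threshold to 0 are assumptions made for this sketch; the text above does not specify the tie case.

```python
def binarize(pixels, threshold):
    """Map grayscale values (0-255) to 0 or 1 using one fixed threshold."""
    # Assumption: a value exactly equal to the threshold is mapped to 0.
    return [[1 if value > threshold else 0 for value in row] for row in pixels]


# Hypothetical low-noise image: dark values near 30, bright values near 210.
image = [
    [12, 40, 200],
    [35, 220, 210],
    [20, 30, 240],
]

binary = binarize(image, 128)

# In a low-noise image, widely different thresholds give the same result.
same_result = binarize(image, 60) == binarize(image, 180)
```

Because the two populations in the sample image are well separated, any threshold between 40 and 199 yields the same binary image, which is the insensitivity noted above for images with little noise.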
For example, gray scale documents may differ significantly in contrast, intensity, noise levels, and uniformity. As a result, different methods have been proposed for selecting a threshold that is appropriate for binarization of such input gray scale images. The histogram of the image may be examined to determine a suitable threshold. For example, the threshold may be set between the two largest peaks in the histogram.
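A minimal sketch of a histogram-based selection of this kind is given below. The text above does not say how the two peaks are located or where between them the threshold is placed, so both choices here are assumptions: the second peak is taken as the bin that maximizes its count multiplied by the squared distance from the tallest bin (a common heuristic for avoiding two "peaks" in the same mode), and the threshold is placed at the midpoint between the two peaks. The names `histogram` and `two_peak_threshold` are hypothetical.

```python
def histogram(values, levels=256):
    """Count how many pixels take each gray level."""
    counts = [0] * levels
    for v in values:
        counts[v] += 1
    return counts


def two_peak_threshold(values, levels=256):
    """Place a threshold midway between the two dominant histogram peaks."""
    counts = histogram(values, levels)
    # Tallest bin overall.
    first = max(range(levels), key=lambda i: counts[i])
    # Assumption: weight each bin by its squared distance from the first
    # peak so that the second peak is not a neighbor within the same mode.
    second = max(range(levels), key=lambda i: counts[i] * (i - first) ** 2)
    if second == first or counts[second] == 0:
        raise ValueError("histogram has only one populated gray level")
    # Assumption: the midpoint is used as the point "between" the peaks.
    return (first + second) // 2


# Hypothetical flattened image: a dark mode near 30 and a bright mode near 200.
pixels = [30] * 50 + [32] * 30 + [200] * 40 + [205] * 20
threshold = two_peak_threshold(pixels)
```

For the sample data the peaks are found at gray levels 30 and 200, so the threshold is 115. Other placements, such as the lowest histogram bin between the peaks, are equally consistent with the description above.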
As noted by Drayer in U.S. Pat. No. 6,941,013, one way to determine a suitable threshold is to determine a single global threshold for the entire image as taught, for example, by Otsu in “A Threshold Selection Method from Gray-Level Histograms,” IEEE Trans. Systems, Man, and Cybernetics, Vol. 9, No. 1 (1979). However, as noted by Drayer, such global thresholding methods frequently result in loss or confusion of the information contained in a gray scale image due to variations in background intensity across the image. Depending on the choice of threshold, a meaningful edge in the gray-level image will disappear in the binary image if the pixels on both sides of the edge are binarized to the same value. On the other hand, artifacts with the appearance of an edge may occur in the binary image in an area of continuous transition in the gray-level image when pixels with similar gray-level values fall on opposite sides of the selected threshold.
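For illustration, a single global threshold of the kind attributed to Otsu above may be computed by choosing the gray level that maximizes the between-class variance of the histogram. The sketch below is a plain restatement of that widely known criterion and is not taken from the cited references; the function name `otsu_threshold` and the convention that pixels with values at or below the threshold form the first class are assumptions.

```python
def otsu_threshold(values, levels=256):
    """Return the gray level that maximizes the between-class variance."""
    counts = [0] * levels
    for v in values:
        counts[v] += 1
    total = len(values)
    total_sum = sum(level * count for level, count in enumerate(counts))

    best_level, best_score = 0, -1.0
    weight0 = 0  # number of pixels with value <= candidate level
    sum0 = 0     # sum of those pixel values
    for level in range(levels - 1):
        weight0 += counts[level]
        sum0 += level * counts[level]
        weight1 = total - weight0
        if weight0 == 0 or weight1 == 0:
            continue  # one class is empty, so no split exists at this level
        mean0 = sum0 / weight0
        mean1 = (total_sum - sum0) / weight1
        # Between-class variance, up to a constant factor of 1 / total**2.
        score = weight0 * weight1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_level, best_score = level, score
    return best_level


# Hypothetical flattened image with a dark mode and a bright mode.
pixels = [30] * 50 + [32] * 30 + [200] * 40 + [205] * 20
global_threshold = otsu_threshold(pixels)
```

Because the result is one value for the whole image, it cannot follow a background whose intensity drifts across the page, which is the limitation noted by Drayer.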
A variation of this technique is to allow the threshold to vary as the image changes. For example, a new threshold may be computed for different sub-regions of the image. Bernsen describes such a method in “Dynamic Thresholding of Grey-Level Images,” Proc. Eighth Int'l Conf. Pattern Recognition (1986), in which a value calculated from the pixels of a sub-region is compared against a maximum tolerance on the variation in pixel values, and variation exceeding that tolerance indicates the presence of foreground. Otherwise, the threshold is set to the minimum so as to assign all input pixels to the value for the background.
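A sketch of one common formulation of such a locally adaptive method is given below. It may differ in detail from the cited reference: the local threshold is taken as the midpoint of the minimum and maximum pixel values in a window around each pixel, and a window whose contrast falls below a limit is treated as containing background only. The window radius, the contrast limit of 15, the background value of 1, and the name `bernsen_binarize` are all assumptions made for illustration.

```python
def bernsen_binarize(image, radius=1, contrast_limit=15, background=1):
    """Binarize with a per-pixel threshold from the local minimum and maximum."""
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Assumption: the window is clipped at the image borders.
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            low, high = min(window), max(window)
            if high - low < contrast_limit:
                # Low local variation: treat the whole window as background.
                row.append(background)
            else:
                threshold = (low + high) / 2
                row.append(1 if image[y][x] > threshold else 0)
        result.append(row)
    return result


# Hypothetical bright page with a slow intensity drift and one dark pixel.
page = [
    [200, 202, 204, 206],
    [201, 40, 205, 207],
    [202, 204, 206, 208],
    [203, 205, 207, 209],
]
local_result = bernsen_binarize(page)
```

In the sample, only the dark pixel is mapped to 0; the gradual drift in the bright background stays below the contrast limit wherever the dark pixel is outside the window and is therefore assigned to the background.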
Niblack describes another method in “An Introduction to Digital Image Processing” (1986) in which the mean, μ, and standard deviation, σ, of the pixel values within a subregion of the image are calculated. A threshold value is computed as T=μ+kσ, with k=−0.2 and a subregion size of 15×15 pixels.
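The threshold T=μ+kσ may be sketched as follows, using the stated value k=−0.2 and a 15×15 subregion as defaults. The handling of image borders (the window is clipped), the use of the population standard deviation, and the function names are assumptions not found in the text above.

```python
import math


def niblack_threshold_map(image, window=15, k=-0.2):
    """Compute T = mean + k * stddev over a window centered on each pixel."""
    radius = window // 2
    height, width = len(image), len(image[0])
    thresholds = []
    for y in range(height):
        row = []
        for x in range(width):
            # Assumption: the window is clipped at the image borders.
            values = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            mean = sum(values) / len(values)
            # Assumption: population (not sample) standard deviation.
            variance = sum((v - mean) ** 2 for v in values) / len(values)
            row.append(mean + k * math.sqrt(variance))
        thresholds.append(row)
    return thresholds


def niblack_binarize(image, window=15, k=-0.2):
    """Map each pixel to 1 if it exceeds its local threshold, else 0."""
    thresholds = niblack_threshold_map(image, window, k)
    return [
        [1 if pixel > t else 0 for pixel, t in zip(image_row, threshold_row)]
        for image_row, threshold_row in zip(image, thresholds)
    ]


# Hypothetical two-pixel image, small enough to check by hand with window=3:
# mean = 50, stddev = 50, so T = 50 + (-0.2)(50) = 40 for both pixels.
tiny = [[0, 100]]
tiny_thresholds = niblack_threshold_map(tiny, window=3)
tiny_binary = niblack_binarize(tiny, window=3)
```

This direct form recomputes the statistics for every pixel and is intended only to make the formula concrete; it is not optimized for large images.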
Other methods, such as those disclosed by Chow et al. in “Automatic Detection of the Left Ventricle from Cineangiograms,” Computers and Biomedical Research, Vol. 5 (1972), use statistical measures to determine a local or global threshold to be used for a narrowly defined two-class classification method. However, images with complicated backgrounds or images with a different relative proportion of background and foreground than expected present a challenge for such techniques. Also, pixels at the borders of characters correspond to regions of both foreground and background and present a particular challenge.
A number of standard binarization methods are described by Trier et al. in “Goal-Directed Evaluation of Binarization Methods,” IEEE 1995, each with different strengths and weaknesses. The present invention is designed to improve upon these techniques by enabling the user to select the threshold that appears to yield the best results and then determining the binarization threshold based on statistics derived from the histogram of the threshold-adjusted images.