Digital scanners (e.g., flatbed scanners) generally illuminate a document with a uniform intensity of light having a known illumination spectrum from a controlled light source and capture a digital image of the document using a digital sensor having a plurality of sensor pixels. The reflectance over a small area of the document can be determined from the sensed pixel value for a corresponding sensor pixel that captures the light reflected by the small document area. The collection of the sensed pixel values comprises a digitized document.
Many digital scanners provide various image correction and enhancement algorithms for processing the digitized document. One class of algorithms that is commonly applied is background correction wherein the background reflectance of the digitized document is automatically determined and corrected to render it to a specified color (e.g., white).
When utilizing other means of digitizing a document, such as by capturing an image of the document using a digital camera or camera phone, the illumination of the document is often uncontrolled and unknown. Therefore, the reflectance across the document background cannot be easily determined and used to automatically correct the background of the digitized document. Digital cameras and camera phones typically perform automatic white balancing, but often the white balancing is designed to balance indoor or outdoor scenes rather than images of documents, and therefore the automatic white balancing often performs poorly when applied to a digitized document.
Binarization methods that set the background to white and the text or graphics to black are sometimes used to process digitized documents in order to improve the legibility of the document. A digitized document can also be binarized to save digital memory space but this aspect of binarization is not discussed here.
The simplest binarization method, one that is well known to those skilled in the art, is to employ a global thresholding operation. Any pixel values in the digitized document that are above a specified global threshold are set to white and any pixel values that are below the global threshold are set to black. A drawback of this simple method is that it produces visually unpleasant binarized documents in the presence of illumination non-uniformity and imaging noise. The artifacts that this method can produce include ragged edges and clouds of black dots in areas where the illumination during digitization was darker than in other areas of the document. Many methods have been disclosed that increase the complexity of binarization in order to produce visually-pleasing binarized documents.
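By way of illustration only, the global thresholding operation described above can be sketched as follows. The function name, the threshold of 128, the 8-bit value range, and the sample pixel values are assumptions chosen for the example, and pixel values equal to the threshold are arbitrarily treated as black.

```python
def global_threshold(image, threshold=128, white=255, black=0):
    """Binarize a grayscale image (a list of rows) with one global threshold.

    Values above the threshold become white; all others become black.
    Treating a value equal to the threshold as black is an assumption.
    """
    return [[white if value > threshold else black for value in row]
            for row in image]


# Hypothetical page: dark strokes (40) on a background that is well lit on
# the left (220) and poorly lit on the right (110).
page = [
    [220, 40, 220, 110, 40, 110],
    [220, 220, 220, 110, 110, 110],
]

binary = global_threshold(page)
for row in binary:
    print(row)
```

In this example the poorly lit background on the right falls below the threshold and is rendered black together with the strokes, which is the kind of artifact attributed above to illumination non-uniformity.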
A binarization method as described by Burian et al. in U.S. Pat. No. 7,636,467, entitled “Binarization of an image,” employs locally-adapted thresholds derived from moving pixel sums followed by corrections using binary median and binary morphological operations. The binary correction steps significantly increase the complexity of this method.
Another binarization method is described in U.S. Pat. No. 6,941,013 to Drayer, entitled “Method of image binarization using histogram modeling,” wherein pixel value histograms are modeled. Each pixel value is classified as being either foreground or background based on the pixel value histogram, and the classified pixel values are quantized accordingly. The histogram modeling step significantly increases the complexity of this method.
U.S. Pat. No. 6,351,566 to Zlotnick, entitled “Method for image binarization,” describes a binarization method that includes optimizing a merit function to find a middle threshold and a pixel value difference parameter responsive to the statistics of the pixel values. A trinarizing operation is applied to the image using the middle threshold and the pixel value difference parameter, and the middle-valued pixels are then binarized in the trinary image to form a binary image. The optimization step significantly increases the complexity of this method.
U.S. Pat. No. 7,057,595 to Benyoub et al., entitled “Image binarization method,” discloses an approach that combines several binarization methods to produce a binarized image. This method is computationally complex because it requires performing a plurality of separate binarization methods in order to produce a binarized image.
One way to keep computational complexity low while producing a visually-pleasing document image is to produce a grayscale document, rather than a binary document, where the background of the grayscale document is essentially white and the foreground is essentially black. The simplest way to produce a grayscale image with an essentially white background and an essentially black foreground is to apply a global tonescaling function to the digitized document that increases the image contrast. A drawback of this simple method is that it produces visually unpleasant documents in the presence of illumination non-uniformity and imaging noise because the method treats foreground and background pixel values the same. The artifacts that the method produces include clouds of dark dots in areas where the illumination during digitization was darker than in other areas of the document. Many methods have been disclosed that selectively change the contrast of an image by altering pixel values in a way that is responsive to the image pixel values.
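As a minimal sketch of such a global tonescaling function, the example below applies a clipped linear contrast stretch to every pixel. The function names, the breakpoints of 64 and 192, the 8-bit value range, and the sample pixel values are assumptions for illustration; many other contrast-increasing curves could be used instead.

```python
def contrast_tonescale(value, low=64, high=192):
    """Map `low` to 0 and `high` to 255 linearly, clipping values outside."""
    scaled = (value - low) / (high - low) * 255
    return int(round(min(255.0, max(0.0, scaled))))


def apply_global_tonescale(image):
    """Apply the same tonescale to every pixel, foreground or background."""
    return [[contrast_tonescale(value) for value in row] for row in image]


# Hypothetical page: dark strokes (40) on a background that is well lit on
# the left (220) and poorly lit on the right (110).
page = [
    [220, 40, 220, 110, 40, 110],
    [220, 220, 220, 110, 110, 110],
]

enhanced = apply_global_tonescale(page)
for row in enhanced:
    print(row)
```

Because the same curve is applied to every pixel, the well-lit background is driven to white while the poorly lit background is made darker rather than lighter, consistent with the drawback noted above.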
A method described by Lee in commonly-assigned U.S. Pat. No. 5,012,333, entitled “Interactive dynamic range adjustment system for printing digital images,” includes separating an image into a high-frequency image and a low-frequency image by using FIR filters. A tonescale function is applied to only the low-frequency image, and the high-frequency image is added to the tonescaled low-frequency image. Another method to adaptively change the contrast of an image is described in U.S. Pat. No. 5,454,044 to Nakajima, entitled “Apparatus for enhancing image data using a monotonously decreasing function.” According to this approach, the contrast of pixel values within regions that have a high mean pixel value is decreased. Both of these methods alter the pixel values in a manner that is adaptive to the image content in order to preserve the high frequencies in the image, but both methods produce undershoot and overshoot artifacts near high-contrast edges. Moreover, neither of these methods teaches how to produce a document image with a background that is essentially white and a foreground that is essentially black.
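The frequency-separation approach, and the overshoot and undershoot it can introduce near a high-contrast edge, can be illustrated with a simplified one-dimensional sketch. This is not the implementation of either cited patent: the three-tap box filter (standing in for the FIR filters), the compressive tonescale, and the sample signal are all assumptions made for the example.

```python
def box_filter(signal, radius=1):
    """Low-pass a 1-D signal with a box filter, replicating the end samples."""
    n = len(signal)
    out = []
    for i in range(n):
        window = [signal[min(n - 1, max(0, j))]
                  for j in range(i - radius, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def compress_tonescale(value, pivot=128.0, gain=0.5):
    """Hypothetical tonescale that halves contrast about a mid-gray pivot."""
    return pivot + gain * (value - pivot)


# Hypothetical scan line crossing a high-contrast edge.
signal = [200, 200, 200, 200, 200, 50, 50, 50, 50, 50]

low = box_filter(signal)                       # low-frequency image
high = [s - l for s, l in zip(signal, low)]    # high-frequency image

# Tonescale only the low-frequency image, then add the high frequencies back.
result = [compress_tonescale(l) + h for l, h in zip(low, high)]

flat_bright = compress_tonescale(200)  # level far from the edge, bright side
flat_dark = compress_tonescale(50)     # level far from the edge, dark side
print(result)
print(max(result) > flat_bright, min(result) < flat_dark)
```

In this sketch the samples adjacent to the edge rise above the bright plateau and fall below the dark plateau, because the unscaled high-frequency detail is added back onto a low-frequency signal whose contrast has been reduced.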
Commonly-assigned U.S. Pat. No. 6,317,521 to Gallagher et al., entitled “Method for preserving image detail when adjusting the contrast of a digital image,” describes a method that also includes separating an input image into a high-frequency image and a low-frequency image, but in addition includes an artifact avoidance function to avoid artifacts such as those produced by the aforementioned U.S. Pat. No. 5,012,333 and U.S. Pat. No. 5,454,044. However, a drawback of this method, when utilized to process a document image with a background that is essentially white and a foreground that is essentially black, is that it can produce artifacts. The method utilizes an avoidance function which is active only near edges. Within thin lines or letters the avoidance function may switch from being active (i.e., it is significantly less than one) near an edge of a thin line or letter to being inactive (i.e., it is essentially one) near the middle of the thin line or letter to being active again near the opposite edge of the thin line or letter. When the avoidance function is active, essentially only the low-frequency content image is tonescaled and the high-frequency content image is added without tonescaling. When the avoidance function is inactive, essentially the sum of the low-frequency content (comprising positive pixel values) and the high-frequency content (comprising negative pixel values within the aforementioned regions) is tonescaled. The positive pixel values from the low-frequency content will always be larger than the sum of the positive pixel values from the low-frequency content image plus the negative pixel values from the high-frequency content image in the aforementioned regions. Therefore, for a monotonically increasing tonescale, the regions of the tonescaled image within thin lines or letters where the avoidance function is active will be lighter than similar regions where the avoidance function is inactive. This results in artifacts where regions inside of thin lines or letters are reproduced as gray rather than black.
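The two limiting cases described above can be compared numerically for a single pixel inside a thin dark line. The sketch below is an assumption-laden illustration rather than the method of U.S. Pat. No. 6,317,521: the clipped linear tonescale, its breakpoints, and the sample low-frequency and high-frequency values are hypothetical, and only the fully active and fully inactive cases are modeled.

```python
def contrast_tonescale(value, low=64.0, high=192.0):
    """Hypothetical monotonically increasing, contrast-increasing tonescale."""
    scaled = (value - low) / (high - low) * 255.0
    return min(255.0, max(0.0, scaled))


# Hypothetical pixel inside a thin dark line: the low-frequency value is
# positive and the high-frequency value is negative, as described above.
low_freq = 100.0
high_freq = -40.0

# Avoidance function active: only the low-frequency value is tonescaled and
# the high-frequency value is added back without tonescaling.
active = contrast_tonescale(low_freq) + high_freq

# Avoidance function inactive: the sum of the two values is tonescaled.
inactive = contrast_tonescale(low_freq + high_freq)

print(active, inactive)
```

With these assumed values the inactive case reaches black while the active case remains a dark gray, which corresponds to the gray regions inside thin lines or letters noted above.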
Commonly-assigned U.S. Pat. No. 7,158,686 to Gindele et al., entitled “Enhancing the tonal characteristics of digital images using inflection points in a tone scale function,” describes a method to improve the tonal characteristics of a digital image which includes adaptively producing a tonescale function having a highlight tonescale segment and a shadow tonescale segment. However, the method teaches how to improve the tonal characteristics of natural scenes and does not teach how to produce a document image with a background that is essentially white and a foreground that is essentially black.
There remains a need for a computationally efficient method to process a digitized document image captured with non-uniform illumination to provide an enhanced image where the background is essentially white and the text and graphics are essentially black.