This patent application is a continuation of U.S. patent application Ser. No. 10/224,660, filed Aug. 20, 2002, entitled “Systems and Methods for Content-Based Document Image Enhancement” and assigned to the assignee of the present application.
The present invention is directed to systems and methods for enhancing images of documents. More specifically, without limitation, the present invention relates to systems and methods for enhancing images of documents based upon the content of such documents.
Digital copying, in which a digital image is obtained from a scanning device and then printed, involves a variety of inherent factors that compromise image quality. Ordered halftone patterns in the original document interact with the periodic sampling of the scanner, producing objectionable moiré patterns. These are exacerbated when the copy is reprinted with an ordered halftone pattern. In addition, limited scan resolution blurs edges, degrading the appearance of detail such as text. Fine detail also suffers from flare, caused by the reflection and scattering of light from the scanner's illumination source. Flare blends together nearby colors, blurring the high-frequency content of the document.
To suppress moiré, a filter may be constructed that is customized to the frequencies of interest. However, both the detection of the input halftone frequencies and the frequency-domain filtering itself can require significant computational effort. Although crude, a simple, small low-pass filter can correct the majority of moiré artifacts. Unfortunately, low-pass filtering affects detail as well, blurring it even further. Sharpening improves the appearance of text and fine detail, countering the effects of limited scan resolution and flare. Edges become clear and distinct. Of course, other artifacts such as noise and moiré become sharper as well.
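By way of illustration only, the trade-off described above can be sketched in Python. The sketch assumes a grayscale image held as a list of rows of values in the range 0 to 255; the function names `box_blur` and `unsharp_mask`, the 3x3 window, and the `amount` parameter are hypothetical choices made for this example and are not taken from the present application. A box (mean) filter stands in for the "simple, small low-pass filter", and unsharp masking stands in for sharpening.

```python
def box_blur(img, radius=1):
    """Mean filter over a (2*radius+1)^2 window; border pixels are replicated."""
    h, w = len(img), len(img[0])
    size = (2 * radius + 1) ** 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp row index
                    xx = min(max(x + dx, 0), w - 1)  # clamp column index
                    total += img[yy][xx]
            out[y][x] = total / size
    return out


def unsharp_mask(img, amount=1.0, radius=1):
    """Sharpen by adding back the difference between the image and its blur."""
    blurred = box_blur(img, radius)
    return [
        [min(255.0, max(0.0, p + amount * (p - b))) for p, b in zip(row, brow)]
        for row, brow in zip(img, blurred)
    ]


# A period-2 checkerboard is a crude stand-in for an ordered halftone pattern.
halftone = [[255 * ((x + y) % 2) for x in range(4)] for y in range(4)]
smoothed = box_blur(halftone)

# A soft step edge is a crude stand-in for the boundary of a text stroke.
edge = [[100, 100, 150, 150] for _ in range(4)]
sharpened = unsharp_mask(edge)
```

In this sketch the blur pulls the checkerboard toward mid-gray, which is the desired effect on halftone patterns, while the same blur applied to the step edge would reduce its contrast. Unsharp masking widens the step instead, but it would equally amplify the checkerboard, which mirrors the conflict noted above.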
The solution is simple in concept: determine the content of regions within the scanned image and then apply the appropriate filter to each region. Sharpening should be performed on fine detail, while moiré suppression should be applied to certain periodic artifacts. From the above discussion, therefore, for an image enhancement system to work properly, a preprocessing step should include the segmentation of the document into text and halftoned images, as well as identification of background. If this step is successfully completed, selection and/or application of appropriate additional processing such as filtering, interpolation, optical character recognition or transformation can occur.
Several techniques have been used to segment documents into text, images and background. These techniques have been primarily designed for optical character recognition (OCR). In these techniques, generally, the document is divided into columns. The columns are then separated into rectangular connected regions. Regions that are small are considered to be text, while large regions are treated as images. These techniques, however, require large portions of the document to be saved in memory and also require intensive computations, which render them impractical for real-time processing.
For enhancement purposes, a simpler and faster way to differentiate between text and image regions in scanned documents is to extract edge information. In general, a higher magnitude of edges would suggest high contrast between a pixel and its neighbors. This is usually an indication of the presence of a text element. Using a predefined threshold, a simple classifier can be constructed:
1. If edge values are higher than a certain threshold, then pixels are classified as text; otherwise they are classified as images.
2. Text pixels are sharpened while image pixels are smoothed.
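The two-step classifier above might be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of any particular known system: the edge value is assumed to be the largest absolute difference between a pixel and its 3x3 neighbors, smoothing is assumed to be a 3x3 mean, sharpening is assumed to be unsharp masking, and the names `window`, `classify_and_enhance`, `threshold`, and `amount` are hypothetical.

```python
def window(img, y, x):
    """Return the 3x3 neighborhood of (y, x), replicating border pixels."""
    h, w = len(img), len(img[0])
    return [
        img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
    ]


def classify_and_enhance(img, threshold=40, amount=1.0):
    """Label each pixel 'text' or 'image' by edge strength, then filter it."""
    h, w = len(img), len(img[0])
    labels = [[""] * w for _ in range(h)]
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            p = img[y][x]
            nb = window(img, y, x)
            mean = sum(nb) / 9.0
            # Assumed edge measure: strongest contrast with any neighbor.
            edge = max(abs(p - v) for v in nb)
            if edge > threshold:
                labels[y][x] = "text"
                # Step 2, text branch: sharpen with unsharp masking.
                out[y][x] = min(255.0, max(0.0, p + amount * (p - mean)))
            else:
                labels[y][x] = "image"
                # Step 2, image branch: smooth with the local mean.
                out[y][x] = mean
    return out, labels


# High-contrast step, as at the boundary of a text stroke.
stroke = [[50, 50, 200, 200] for _ in range(3)]
stroke_out, stroke_labels = classify_and_enhance(stroke)

# Low-contrast ripple, as in a halftoned image region.
ripple = [[100, 104, 100, 104] for _ in range(3)]
ripple_out, ripple_labels = classify_and_enhance(ripple)
```

Because the decision is made independently at every pixel from a single threshold, the result depends heavily on the value chosen for `threshold`.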
This technique, however, has several disadvantages. First, the algorithm, although simple, does not meet real-time computational constraints. Next, selecting an edge threshold low enough to sharpen all text will sharpen other features as well (resulting from misclassifying images as text). Finally, increasing the value of the threshold will cause parts of the text (especially fine strokes) to be misclassified and potentially blurred.
These and other disadvantages of known techniques are addressed in one embodiment of the present invention by including spatial constraints with the edge information. Edge thresholds are set high enough to ensure smooth images, and spatial information ensures that fine text is sharpened. The output of this operation significantly improves the quality of the scanned document.