A. Technical Field
The present invention relates generally to the enhancement of a scanned document, and more particularly to the use of content classification labels to enhance, sharpen, or blur content within such a document, and/or to provide color compensation. The content labels can be exploited for subsequent high-quality rendering on a printer or other output device and/or for compressed storage.
B. Background of the Invention
A system for the digital reproduction of a document has to contend with a number of sources of error that are introduced as a result of scanning and printing operations. These errors may degrade copy quality through varying degrees of blurred edges, Moire patterns, color shifts, sensor noise, etc. The scanning errors may be caused by a number of different factors including the quality of the scanning device, the color and quality of the document, and the content and complexity of the document. Other image artifacts found in a copied document can be attributed to the printing process that was used to generate the original page being copied (e.g., halftone and screening noise).
FIG. 1 illustrates an exemplary document, having different content regions, which may be copied. As shown, the document 100 has a first region 110 containing a first image, a second region 130 having text, a third region 135 containing an image 120 and text 140 integrated within the image 120, and a fourth region 160 representing the background of the document 100. Image artifacts within these regions may be generated during the scanning of the document and may be handled differently depending on the content region in which the artifact is located. One example of such an image artifact is a Moire pattern, which is low-frequency noise that is generated during the scanning of pages with halftones. The Moire patterns are generated by frequency aliasing between halftones within the document and the pixel grid of the scanning device. In addition to Moire patterns, other types of artifacts such as blurring may be introduced into the scanned document.
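The frequency aliasing described above can be illustrated with a minimal sketch. It assumes an idealized one-dimensional halftone screen of a single frequency sampled by a uniform pixel grid; the function name `alias_frequency` and the example values (150 lines per inch, 160 and 600 samples per inch) are hypothetical and are not taken from any particular scanner. Real Moire patterns also involve screen angles and harmonics, which this sketch ignores.

```python
import math


def alias_frequency(halftone_lpi: float, scan_dpi: float) -> float:
    """Apparent frequency (cycles per inch) of a periodic halftone screen
    after it is sampled at scan_dpi samples per inch (idealized 1-D model)."""
    # Fold the halftone frequency into the baseband [0, scan_dpi / 2].
    folded = halftone_lpi % scan_dpi
    return min(folded, scan_dpi - folded)


def sample_screen(frequency: float, scan_dpi: float, count: int) -> list[float]:
    """Sample a cosine of the given frequency on the scanner pixel grid."""
    return [math.cos(2 * math.pi * frequency * n / scan_dpi) for n in range(count)]


# A 150 lpi screen sampled at 160 dpi is indistinguishable, on the pixel
# grid, from a 10 cycles-per-inch pattern: a visible low-frequency beat.
low = alias_frequency(150, 160)
original = sample_screen(150, 160, 32)
aliased = sample_screen(low, 160, 32)
same = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(original, aliased))
```

Under this model, sampling the same screen at 600 samples per inch leaves the 150 cycles-per-inch pattern unaliased, which is consistent with the low-frequency noise arising from the interaction between the halftone and the pixel grid rather than from the halftone alone.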
The identification of these artifacts, including image features such as halftone edges and Moire patterns, and their subsequent processing depends on the locations of different types of image regions within the document. In particular, an appropriate method for identifying and removing unwanted image artifacts may depend on whether the artifact resides in (1) an image, such as the image in the first region 110, (2) text, such as the text in the second region 130, (3) mixed text and image, such as the third region 135, or (4) the background 160.
Sharpening and masking tools currently sold on the market enhance document copies with varying degrees of effectiveness. However, these tools may be unable to adapt effectively to the different types of image content in the document copies to compensate for the different types of artifacts found within different image regions. For example, text within halftone (such as the text 140 shown in the third region 135) presents a difficult problem because edges arising out of the halftone pattern need to be suppressed while edges corresponding to text boundaries may need to be emphasized. Appropriate handling of image enhancement in this case depends on identifying the type of edge and applying region-appropriate processing.
The reproduction of an original document that has been created on colored paper may also present certain issues. Even in the ideal case, when a copier accurately reproduces all the colors in the original document, the result may not be acceptable to the end user. For example, if the original consists of a document printed on slightly yellow paper, the user may prefer to see the copy with a white background (i.e., in this case, the user prefers that the color of the paper on which the original is printed is not accurately reproduced). Similarly, if the original is printed on strongly colored paper stock, the user may prefer not to reproduce the page background color. A typical example in which a different background is wanted is a situation in which a user has a document printed on red paper (say, a red flier), but would like the output printed on yellow paper (to make a yellow flier).
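By way of illustration only, one simple form of background color compensation is a per-channel scaling that maps an estimated paper color to a desired target color. The sketch below assumes 8-bit RGB pixels and a paper color that has already been estimated by some other means; the function name `compensate_background` and the sample colors are hypothetical, and this is not presented as the compensation method of the invention.

```python
def compensate_background(pixel, paper, target=(255, 255, 255)):
    """Scale each 8-bit RGB channel so that the paper color maps to the target."""
    out = []
    for value, p, t in zip(pixel, paper, target):
        # Guard against a zero paper channel, where the scale is undefined.
        scaled = value * t / p if p > 0 else t
        # Clamp to the valid 8-bit range after rounding.
        out.append(max(0, min(255, round(scaled))))
    return tuple(out)


yellowish_paper = (250, 245, 200)

# The paper itself becomes white, and black ink stays black.
paper_out = compensate_background(yellowish_paper, yellowish_paper)
ink_out = compensate_background((0, 0, 0), yellowish_paper)

# The red-flier case: map a red stock color to a yellow target instead.
flier_out = compensate_background((200, 30, 30), (200, 30, 30), target=(255, 255, 0))
```

Applying such a mapping uniformly also shifts image content, which is one reason the compensation may need to depend on the content region in which a pixel lies.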
Scanned digital documents may be subsequently processed, parsed, displayed, or printed. The tools and methods required to perform these operations often depend on the type of content that is being processed. For example, images may be encoded and stored using different encoding devices and algorithms than those employed for the encoding of text. Furthermore, the requirements on a display device for showing a color image may differ considerably from those for showing text. When printed, text may be rendered with a higher frequency screen for improved rendering of spatial features, while images are better rendered with a low frequency screen to provide smoother color gradation.
Accordingly, what is needed are systems and methods for labeling content within a scanned document and enhancing the document based on the labeled content for subsequent high quality rendering.