The present invention relates generally to systems and methods for scanning and processing documents. More specifically, the present invention relates to a method for scanning documents that identifies documents scanned with improper image characteristics.
In a conventional digital reproduction device, a document or image is scanned by a digital scanner which converts the light reflected from the document into electrical charges representing the light intensity from predetermined areas (pixels) of the document. The pixels of image data are processed by an image processing system which converts the pixels of image data into signals which can be utilized by the digital reproduction machine to recreate the scanned image. In other words, the image processing system provides the transfer function between the light reflected from the document and the mark on the recording medium.
One measure of the performance of a reproduction machine is how well the copy matches the original. Copy quality can be measured in many different ways. One way is to look at the characteristics of the reproduced image. An example of such a characteristic for determining the quality of the reproduced image is the contrast of the image. The contrast of an imaged (copied) document is one of the most commonly used characteristics for measuring quality since contrast provides a good overall assessment of the image's quality.
To assure high quality at the output printing device, it is desirable to know the contrast of the image being scanned prior to the image processing stage because, with this knowledge, the image processing system can process the image data so that the reproduced image has the proper contrast. Background detection processes provide one way of obtaining this contrast information prior to digital image processing.
Conventional automatic background detection processes generate a histogram of the document using standard methods, identify a background peak from the histogram and then calculate the mean and standard deviation. The standard deviation is then used to determine the gain factor for the document. The gain factor is used to estimate the background gray level of the image of the scanned document. The detected background can be removed by adjusting the gain of the scanned image and clipping the values that exceed the system processing range.
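The histogram-based process described above can be illustrated with a minimal sketch. It assumes 8-bit gray levels with 255 as white, takes the most populated bin in the upper half of the histogram as the background peak, and places the background level `k` standard deviations below the peak mean. The function names, the `window` and `k` parameters, and that particular rule for deriving the gain are assumptions made for illustration; the text does not specify them.

```python
from statistics import mean, pstdev

WHITE = 255  # assumed 8-bit gray scale, 255 = white


def build_histogram(pixels):
    """Count how many pixels fall on each gray level."""
    hist = [0] * (WHITE + 1)
    for p in pixels:
        hist[p] += 1
    return hist


def estimate_background(pixels, window=10, k=2.0):
    """Estimate the background gray level from the histogram peak.

    Assumption: the background peak is the most populated bin in the
    upper (lighter) half of the histogram, and the background level
    sits k standard deviations below the mean of that peak.
    """
    hist = build_histogram(pixels)
    half = (WHITE + 1) // 2
    peak = max(range(half, WHITE + 1), key=lambda g: hist[g])
    near_peak = [p for p in pixels if abs(p - peak) <= window]
    if not near_peak:
        return WHITE  # no light background found; leave the image as is
    return mean(near_peak) - k * pstdev(near_peak)


def suppress_background(pixels, background):
    """Apply the gain and clip values that exceed the processing range."""
    gain = WHITE / background
    return [min(WHITE, round(p * gain)) for p in pixels]


# Hypothetical sample: light gray paper (200) with some dark text (20).
sample = [200] * 90 + [20] * 10
level = estimate_background(sample)
cleaned = suppress_background(sample, level)
```

In this sample the paper pixels are driven to white while the text pixels stay dark, which is the contrast improvement that background suppression aims for.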
Conventionally, background detection is performed by sampling pixel values either within a sub-region of the document (typically the leading edge) or across the whole document (page). Background detection based on the leading edge generally provides superior throughput and system productivity because the background detection and background suppression can take place in a single pass. However, the image quality can suffer if the leading edge does not accurately reflect the average background for the entire document.
On the other hand, background detection based on data accumulated from pixels across the entire page provides a more accurate and robust determination of the background level. However, this process for background detection generally suffers from a lower throughput rate as it requires two passes through the image data or an electronic memory to store the full image. That is, the process requires two scans, a first to collect data to determine the background level and a second to acquire image data taking into account background suppression. Alternatively, background detection and data acquisition can be accomplished in a single pass of the scanning system with a second pass through the data to perform background suppression.
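The trade-off between the two sampling strategies can be seen in a small sketch. It uses the most frequent gray level as a stand-in background estimator and a hypothetical page whose leading edge is a dark banner; both the estimator and the page data are assumptions chosen only to show how a leading-edge estimate can differ from a full-page estimate.

```python
from collections import Counter


def background_level(rows):
    """Most frequent gray level in the given scanlines (stand-in estimator)."""
    counts = Counter(p for row in rows for p in row)
    return counts.most_common(1)[0][0]


def leading_edge_estimate(page, edge_rows):
    """Single-pass style: look only at the first few scanlines."""
    return background_level(page[:edge_rows])


def full_page_estimate(page):
    """Two-pass style: look at every scanline before processing any."""
    return background_level(page)


# Hypothetical page: a dark banner on the first 2 scanlines,
# light paper on the remaining 10.
page = [[90] * 8 for _ in range(2)] + [[230] * 8 for _ in range(10)]

print(leading_edge_estimate(page, edge_rows=2))  # 90
print(full_page_estimate(page))                  # 230
```

The leading-edge estimate is available after two scanlines but reflects the banner, while the full-page estimate reflects the paper but is available only after the whole page has been read.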
Therefore, it is desirable to utilize a system and method for scanning documents that maintains the productivity and throughput performance of "single pass" systems and the robust image quality of "two-pass" systems.
In accordance with one aspect of the present invention, there is provided a method for scanning a document. The method includes (a) acquiring scanned image data from a first region of the document; (b) determining an initial estimate of a document attribute using selected pixels from within the first region of the document; (c) acquiring scanned image data from a second region of the document; (d) processing pixels within the second region of the image in accordance with the initial estimate of the document attribute; (e) determining a second estimate of the document attribute using selected pixels from within the second region of the document; and (f) determining if the initial estimate is valid, and, if not, processing pixels within the image in accordance with the second estimate of the image characteristic.
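Steps (a) through (f) can be sketched as follows, taking background gray level as the document attribute. The estimator (most frequent gray level), the validity test (the two estimates agree within a `tolerance`), the choice to reprocess only the second region, and all names are assumptions made for illustration; the summary does not define how validity is determined.

```python
from collections import Counter

WHITE = 255  # assumed 8-bit gray scale, 255 = white


def estimate_background(rows):
    """Most frequent gray level in the given scanlines (stand-in estimator)."""
    counts = Counter(p for row in rows for p in row)
    return counts.most_common(1)[0][0]


def suppress(rows, background):
    """Scale pixels so the background maps to white, clipping at WHITE."""
    background = max(1, background)  # guard against division by zero
    return [[min(WHITE, p * WHITE // background) for p in row] for row in rows]


def scan_document(page, first_rows, tolerance=16):
    """Return (processed second region, whether the initial estimate held)."""
    first = page[:first_rows]                # (a) acquire the first region
    initial = estimate_background(first)     # (b) initial estimate
    second = page[first_rows:]               # (c) acquire the second region
    output = suppress(second, initial)       # (d) process with initial estimate
    revised = estimate_background(second)    # (e) second estimate
    # (f) assumed validity test: the two estimates agree within tolerance
    valid = abs(initial - revised) <= tolerance
    if not valid:
        # Reprocess with the second estimate; this assumes the raw
        # second-region data is still available.
        output = suppress(second, revised)
    return output, valid


# Hypothetical page whose leading edge (dark banner) misrepresents the paper.
page = [[90] * 8 for _ in range(2)] + [[230] * 8 for _ in range(10)]
processed, initial_was_valid = scan_document(page, first_rows=2)
```

When the initial estimate holds, the output produced in step (d) is kept and no second pass over the data is needed; the reprocessing cost is paid only for documents where the first region proved unrepresentative.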