The present invention relates generally to automatic background detection of a scanned document. More particularly, this invention relates to a process for identifying the background value of a scanned image that separates gray levels of non-document areas from those of the document.
In a conventional digital reproduction device, a document or image is scanned by a digital scanner which converts the light reflected from the document into electrical charges representing the light intensity from predetermined areas (pixels) of the document. The pixels of image data are processed by an image processing system which converts the pixels of image data into signals which can be utilized by the digital reproduction machine to recreate the scanned image. In other words, the image processing system provides the transfer function between the light reflected from the document and the mark on the recording medium.
One measure of the performance of a reproduction machine is how well the copy matches the original. Copy quality can be measured in a variety of different ways. One way is to look at the characteristics of the reproduced image. An example of such a characteristic for determining the quality of the reproduced image is the contrast of the image. The contrast of an imaged (copied) document is one of the most commonly used characteristics for measuring quality since contrast provides a good overall assessment of the image's quality. To assure high quality at the output printing device, it is desirable to know the contrast of the image being scanned prior to the image processing stage because, with this knowledge, the image processing system can process the image data so that the reproduced image has the proper contrast. Background detection processes provide one way of obtaining this contrast information prior to further digital image processing.
Conventional automatic background detection processes collect intensity information to create a histogram of the scanned image. The process then identifies a background peak from the histogram, estimates a curve including the peak and calculates the mean and standard deviation. The standard deviation is then used to determine the gain factor for the document. The gain factor is used to compensate for the background gray level of the image of the scanned document. In this manner, the gray level histogram provides an easy to read measure of the image contrast from which a background value can be easily generated. However, it should be noted that the background value is only as accurate as the histogram from which it is generated. Therefore, when generating a histogram to determine the background level of a scanned image, one must be certain to sample only those pixels which are from within the document area.
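The conventional histogram-based approach described above can be illustrated with a minimal Python sketch. The text does not give the curve-fitting method or the gain formula, so several choices here are assumptions made only for illustration: the background peak is taken to be the most populated gray level, the "curve including the peak" is approximated by the histogram bins within a fixed window around the peak, and the hypothetical `gain_factor` rule scales the level `mean - k * std` to white.

```python
import math

def background_stats(pixels, levels=256, window=10):
    """Return (peak, mean, std) for the background peak of a gray level histogram."""
    # Build the gray level histogram of the sampled pixels.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Assumption: the background peak is the most populated gray level.
    peak = max(range(levels), key=lambda g: hist[g])
    # Assumption: approximate the curve around the peak by the bins
    # within +/- window gray levels of the peak.
    lo, hi = max(0, peak - window), min(levels - 1, peak + window)
    count = sum(hist[lo:hi + 1])
    mean = sum(g * hist[g] for g in range(lo, hi + 1)) / count
    var = sum(hist[g] * (g - mean) ** 2 for g in range(lo, hi + 1)) / count
    return peak, mean, math.sqrt(var)

def gain_factor(mean, std, white=255, k=2.0):
    """Hypothetical gain rule: map the gray level mean - k*std to white."""
    background = max(1.0, mean - k * std)
    return white / background

# Mostly light paper (around gray level 200) with some dark text (30).
sample = [200] * 50 + [199] * 25 + [201] * 25 + [30] * 20
peak, mean, std = background_stats(sample)
gain = gain_factor(mean, std)
```

Because the statistics are computed only from bins near the peak, the dark text pixels do not affect the result; pixels from a bright non-document area, however, could move the peak itself, which is the weakness the following paragraphs discuss.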
In conventional systems, background detection is performed by sampling pixel values either within a sub-region of the document (typically the leading edge) or across the whole document (page). These approaches typically rely on a predefined measure of scanned image size and shape which may not reflect the actual size and shape of the scanned document. Thus, while these approaches produce reasonable results when the predefined measure accurately reflects the size and shape of the scanned document, the approaches may fail to accurately measure the background if the scanned document is not the same size as the predefined measure or if the scanned document is positioned such that the predefined measure includes background areas other than that of the document (e.g., platen cover).
For example, consider scanning a document from a platen with a white or light gray platen cover. When the document to be scanned is smaller than the predefined measure, the histogram generated would contain gray level values corresponding to the white platen cover in addition to the gray level values of the document. If enough of the platen cover is included in the histogram, the background value detected would be incorrect. Therefore, it is desirable to utilize a background detection process that can differentiate gray level information obtained from non-document areas from the gray level information corresponding to the document's background. When utilizing such a process, the background value will reflect the value of the document and not the gray level of non-document areas, and thus, the output copy from the printing device will not realize a loss of image quality.
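The platen cover problem can be made concrete with a small simulation. This is only a sketch: the gray levels `PLATEN`, `PAPER`, and `INK`, the one-dimensional `scan_row` model, and the 10% ink coverage are all hypothetical values chosen to show how non-document pixels can take over the histogram peak.

```python
# Hypothetical 8-bit gray levels for the platen cover, paper, and ink.
PLATEN, PAPER, INK = 250, 210, 40

def histogram_peak(pixels, levels=256):
    """Return the most populated gray level."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    return max(range(levels), key=hist.__getitem__)

def scan_row(doc_width, window_width):
    """Simulate one scan line: a document followed by exposed platen cover."""
    ink = doc_width // 10  # assume 10% of the document is dark content
    document = [PAPER] * (doc_width - ink) + [INK] * ink
    return document + [PLATEN] * (window_width - doc_width)

# Document fills the predefined window: the peak is the paper background.
full_size_peak = histogram_peak(scan_row(100, 100))

# Document covers only 40% of the window: the platen cover dominates.
undersized_peak = histogram_peak(scan_row(40, 100))
```

In the second case the detected peak is the platen cover level rather than the paper level, so any gain derived from it would be matched to the wrong background.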
In accordance with one aspect of the present invention, there is provided a method for generating background statistics for a scanned document. The method includes the steps of (a) determining a full page background statistic from selected pixels within a document area; (b) determining a sub-region background statistic from selected pixels within a sub-region of the document area; (c) determining if the sub-region background statistic corresponds to image data from a non-document area; (d) determining if the full page background statistic is corrupted; and (e) generating a validated full page background statistic if the full page background statistic is corrupted.
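Steps (a) through (e) can be sketched as follows. The summary above does not state how each determination is made, so every test in this sketch is an assumption: the background statistic is taken to be the histogram peak, a sub-region is treated as non-document when its peak lies within `tol` of a hypothetical known `platen_level`, the full page statistic is treated as corrupted when it matches that sub-region peak, and the validated statistic is recomputed after discarding pixels near the platen level.

```python
def peak_of(pixels, levels=256):
    """Background statistic used in this sketch: the histogram peak."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    return max(range(levels), key=hist.__getitem__)

def validated_background(page, sub_region, platen_level=250, tol=5):
    """Return (background statistic, whether the full page value was replaced)."""
    full_stat = peak_of(page)        # step (a): full page statistic
    sub_stat = peak_of(sub_region)   # step (b): sub-region statistic

    # Step (c), hypothetical test: the sub-region looks like non-document
    # area if its peak is close to the assumed platen cover gray level.
    sub_is_non_document = abs(sub_stat - platen_level) <= tol

    # Step (d), hypothetical test: the full page statistic is corrupted
    # if it agrees with a sub-region judged to be non-document.
    corrupted = sub_is_non_document and abs(full_stat - sub_stat) <= tol
    if not corrupted:
        return full_stat, False

    # Step (e), hypothetical repair: recompute the statistic from pixels
    # that are not near the platen cover gray level.
    kept = [p for p in page if abs(p - platen_level) > tol]
    return (peak_of(kept) if kept else full_stat), True

# Undersized document: the sampled sub-region falls on the platen cover.
page = [250] * 60 + [210] * 36 + [40] * 4
result = validated_background(page, sub_region=[250] * 20)
```

When the sub-region lies on the document itself, the function returns the full page statistic unchanged, so the validation step costs nothing in the normal case.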
Pursuant to another aspect of the present invention, there is provided a method of generating background statistics that distinguishes between gray level information from document and non-document areas. The method includes generating a full page background statistic from pixels within a document area; generating a first sub-region background statistic from pixels within a first sub-region of the document area; generating a second sub-region background statistic from pixels within a second sub-region of the document area; determining if the first sub-region background statistic corresponds to gray level data from a non-document area; making a first determination of whether the full page background statistic is corrupted and, if so, generating a validated full page background statistic; determining if the second sub-region background statistic corresponds to gray level data from a non-document area; and making a second determination of whether the full page background statistic is corrupted and, if so, generating a validated full page background statistic.
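The two sub-region variant can be sketched as a sequence of two checks against a running full page statistic. As before, the decision rules are assumptions rather than anything stated in the text: the statistic is the histogram peak, `non_doc_levels` is a hypothetical list of gray levels known to belong to non-document areas (for example a white cover and a dark open-cover region), and a corrupted statistic is repaired by discarding pixels near the offending sub-region peak.

```python
def peak_of(pixels, levels=256):
    """Background statistic used in this sketch: the histogram peak."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    return max(range(levels), key=hist.__getitem__)

def background_two_regions(page, first_sub, second_sub,
                           non_doc_levels=(250,), tol=5):
    """Return (background statistic, [first corrupted?, second corrupted?])."""
    stat = peak_of(page)   # full page background statistic
    pixels = list(page)
    determinations = []
    for sub in (first_sub, second_sub):
        sub_stat = peak_of(sub)  # sub-region background statistic
        # Hypothetical test: does the sub-region match a non-document level?
        non_document = any(abs(sub_stat - lvl) <= tol for lvl in non_doc_levels)
        # Hypothetical test: the running statistic is corrupted if it
        # agrees with a sub-region judged to be non-document.
        corrupted = non_document and abs(stat - sub_stat) <= tol
        determinations.append(corrupted)
        if corrupted:
            # Hypothetical repair: drop pixels near the sub-region peak,
            # keeping the old pixel set if nothing would remain.
            pixels = [p for p in pixels if abs(p - sub_stat) > tol] or pixels
            stat = peak_of(pixels)
    return stat, determinations

# White cover (250) along one edge and a dark exposed area (20) along another.
page = [250] * 50 + [20] * 30 + [210] * 18 + [40] * 2
result = background_two_regions(page, [250] * 10, [20] * 10,
                                non_doc_levels=(250, 20))
```

Running the second determination against the already validated statistic lets the second sub-region catch contamination that only becomes dominant once the first source has been removed.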