Digital scanners are commonly used to capture images from a hardcopy medium. In a typical scanning operation, light from the scanner illuminates the surface of an original document, and an image sensor moving past the document detects the intensity of light reflected from each location in the image and stores it as a proportionate electrical charge at a corresponding pixel location. The analog charges are passed to an image processor, where they are quantized to grayscale levels, each of which is represented by a multi-bit digital value. The number of bits assigned to each grayscale level determines the number of intensity levels that can be generated by the scanner. For example, a scanner that represents grayscale levels using 8-bit words will be able to capture 256 (2^8) different intensity levels. The value for the grayscale level that provides the closest match to the intensity of reflected light is assigned to the pixel corresponding to each location in the image. Thus, scanning captures analog input images by generating a stream of multi-bit values, with each location in the image being represented by a multi-bit digital word.
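The quantization step described above can be illustrated with a minimal sketch. It assumes the reflected intensity has already been normalized to the range 0.0 to 1.0; the function name `quantize` and the normalization are illustrative assumptions, not details taken from any particular scanner.

```python
def quantize(intensity, bits=8):
    """Map a normalized intensity (0.0 to 1.0) to the closest grayscale level.

    Assumption: intensity is pre-normalized; values outside the range are clamped.
    """
    levels = 2 ** bits                       # 8 bits -> 256 levels
    clamped = min(max(intensity, 0.0), 1.0)  # guard against out-of-range input
    # Scale to [0, levels - 1] and round to the nearest level.
    return int(clamped * (levels - 1) + 0.5)


# One hypothetical scanline of normalized sensor readings.
scanline = [0.0, 0.25, 0.5, 1.0]
print([quantize(v) for v in scanline])
```

With 8-bit words the result is always an integer from 0 to 255, so each pixel occupies exactly one byte.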
One or more scanners, printers, video displays and/or computer storage devices are often connected via a communications network, thereby providing a digital reproduction system. For example, a digital copier may incorporate a scanner and a digital printer. While scanners capture hundreds of light intensity levels, digital output devices usually generate relatively few levels of output. For example, most digital printers generate binary output, wherein a single bit is assigned to each pixel and marking material is either withheld from or applied to the pixel depending upon the assigned value. The grayscale data is rendered to binary format and stored in memory, where it can be retrieved by the printer for output. While it is possible to print data as it is rendered, storing it first provides several advantages. For one, when the data is stored, it is possible to print multiple copies of the same page without having to repeatedly re-scan the original document. It is also easier to transfer stored data between devices, as it can be compressed and decompressed.
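As a rough illustration of rendering grayscale data to binary format, the sketch below applies a fixed threshold to one row of pixels. The threshold value, the convention that lower grayscale values are darker, and the name `render_binary` are all assumptions made for the example; practical printers commonly use halftoning or similar techniques rather than a plain threshold.

```python
def render_binary(gray_row, threshold=128):
    """Render one row of 8-bit grayscale pixels to single-bit output.

    Assumption: lower values are darker, so pixels below the threshold
    receive marking material (1) and all others are left blank (0).
    """
    return [1 if g < threshold else 0 for g in gray_row]


print(render_binary([0, 127, 128, 255]))
```

Each output pixel needs only one bit, which is why the rendered page is far smaller to store and transfer than the grayscale data it came from.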
Grayscale image data is often processed for improved image quality. Image processing is preferably applied before the image is rendered, to avoid data loss. Well-known image processing techniques are performed to improve image contrast, sharpness and color, and also to eliminate scanning artifacts, hole punches and other undesirable data. For example, skew correction is a well-known imaging process that may be applied to remove skew from image data captured from an original document that became rotated relative to the image sensor before it was scanned. Skewed images are unappealing to the viewer, and they are also difficult to process in optical character recognition processes. The contents of an image are viewed in relation to the edges of the page on which it is printed. Thus, skew can be eliminated by aligning the image with the edge of the document. More specifically, skew can be eliminated by detecting the magnitude and direction of the document rotation and applying a corresponding counter-rotation to the image data.
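The counter-rotation described above can be sketched as follows. The example assumes that two points along one edge of the document have already been located in the scan; how those points are found is outside the scope of the sketch, and the function names are hypothetical.

```python
import math


def skew_angle(edge_start, edge_end):
    """Return the signed angle (radians) of a document edge relative to the
    horizontal scan axis. The sign gives the direction of the rotation."""
    dx = edge_end[0] - edge_start[0]
    dy = edge_end[1] - edge_start[1]
    return math.atan2(dy, dx)


def rotate_point(point, angle, center):
    """Rotate a point about a center by the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    x, y = point[0] - center[0], point[1] - center[1]
    return (center[0] + c * x - s * y, center[1] + s * x + c * y)


def deskew_point(point, angle, center):
    """Apply the counter-rotation: rotate by the negative of the skew angle."""
    return rotate_point(point, -angle, center)


# Hypothetical top edge that drops 10 pixels over a run of 100 pixels.
left, right = (0.0, 0.0), (100.0, 10.0)
angle = skew_angle(left, right)
print(math.degrees(angle))
print(deskew_point(right, angle, center=left))
```

After the counter-rotation, the far end of the edge lies on the same horizontal line as the near end, which is the condition for the image to be aligned with the document edge. A full implementation would apply the same transform to every pixel and resample the result.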
Cropping, another well-known imaging process, is performed to remove image data that represents the document transport, scanner platen or other hardware that is present in the scan when the document is scanned. To remove this extraneous data, the entire scan is processed to determine the size of the original document and to pinpoint its location inside the scan. The data that lies outside of the identified region can then be deleted before the image is printed.
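A minimal sketch of this kind of cropping is shown below. It assumes the whole scan is available in memory as a list of rows, that the platen or transport appears as a roughly uniform background value, and that the document is not skewed; the names `find_document_bounds` and `crop` and the tolerance parameter are illustrative assumptions.

```python
def find_document_bounds(scan, background=0, tolerance=10):
    """Return (top, bottom, left, right) of the region that differs from the
    background, or None if no such pixels exist.

    Assumption: the background is a known, nearly uniform grayscale value.
    """
    rows, cols = [], []
    for r, row in enumerate(scan):
        for c, pixel in enumerate(row):
            if abs(pixel - background) > tolerance:
                rows.append(r)
                cols.append(c)
    if not rows:
        return None
    return min(rows), max(rows), min(cols), max(cols)


def crop(scan, bounds):
    """Keep only the pixels inside the inclusive bounds."""
    top, bottom, left, right = bounds
    return [row[left:right + 1] for row in scan[top:bottom + 1]]


# Hypothetical 4 x 5 scan: a small bright document on a dark platen.
scan = [
    [0, 0, 0, 0, 0],
    [0, 200, 210, 0, 0],
    [0, 190, 205, 0, 0],
    [0, 0, 0, 0, 0],
]
bounds = find_document_bounds(scan)
print(bounds)
print(crop(scan, bounds))
```

Note that this sketch visits every pixel of the complete scan before any data is discarded, which is exactly the requirement that makes full-scan analysis costly for scanners that process data as it is captured.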
An accurate and robust digital image processing technique analyzes the entire scan to select the data that is most relevant for processing. However, to process an entire scan, all of the grayscale data must be available when each scanline is processed. One-pass scanners process image data “on-the-fly,” i.e., the grayscale data is generated, processed and rendered in real time. It would be very expensive to process and store the entire volume of grayscale image data for an entire scan (i.e., multi-bit grayscale values for every pixel in the image) quickly enough to keep pace with the scanning rate. Instead, one-pass scanners almost always select the data on which processing is based by analyzing only a subset of the data in the scan. Unfortunately, it is difficult to isolate the subset of grayscale data that will best represent the entire scan for each specific process, and thus image processing that is based upon partial analyses often produces unreliable results.
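To give a sense of the data volume involved, the following sketch computes the uncompressed size of one scanned page. The page dimensions (8.5 by 11 inches) and the resolution (600 dots per inch) are assumptions chosen purely for illustration and do not come from any particular scanner.

```python
def scan_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes of a scan at the given size and resolution."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


# Assumed example page: 8.5 x 11 inches at 600 dpi.
grayscale = scan_bytes(8.5, 11, 600, bits_per_pixel=8)
binary = scan_bytes(8.5, 11, 600, bits_per_pixel=1)
print(grayscale)  # 33660000 bytes of 8-bit grayscale data
print(binary)     # 4207500 bytes once rendered to 1 bit per pixel
```

Under these assumptions, a single grayscale page occupies about 33.7 million bytes, eight times the size of the rendered binary page, and all of it would have to be buffered and analyzed at the scanning rate.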
Therefore, it is desirable to provide systems and methods for processing a digital image based upon an analysis of the grayscale image data that is generated for an entire scan. More specifically, it is desirable to provide systems and methods for processing grayscale image data for an entire scan to locate the corners of the document without having to store the entire grayscale image in memory. It is also desirable to process grayscale data to measure the skew in a scanned image and to eliminate the skew before the image is rendered for output.