Optical scanners are used to capture and digitize images. For example, an optical scanner can be used to capture the image of printed matter on a sheet of paper. The digitized image can then be electronically stored and/or processed with character recognition software to produce ASCII text. Most optical scanners use illumination and optical systems to illuminate the object and focus a small area of the illuminated object, usually referred to as a "scan line," onto the optical photosensor array. The entire object is then scanned by sweeping the illuminated scan line across the entire object, either by moving the object with respect to the illumination and optical assemblies or by moving the illumination and optical assemblies relative to the object.
A typical scanner optical system will include a lens assembly to focus the image of the illuminated scan line onto the surface of the optical photosensor array. Depending on the particular design, the scanner optical system may also include a plurality of mirrors to "fold" the path of the light beam, thus allowing the optical system to be conveniently mounted within a relatively small enclosure.
While various types of photosensor devices may be used in optical scanners, a commonly used sensor is the charge coupled device or CCD. As is well known, a CCD may comprise a large number of individual cells or "pixels," each of which collects or builds up an electrical charge in response to exposure to light. Since the size of the accumulated electrical charge in any given cell or pixel is related to the intensity and duration of the light exposure, a CCD may be used to detect light and dark spots on an image focused thereon. In a typical scanner application, the charge built up in each of the CCD cells or pixels is measured and then discharged at regular intervals known as exposure times or sampling intervals, which may be about 5 milliseconds for a typical scanner. Since the charges (i.e., image data) are simultaneously collected in the CCD cells during the exposure time, the CCD also includes an analog shift register to convert the simultaneous or parallel data from the CCD cells into a sequential or serial data stream.
A typical analog shift register comprises a plurality of "charge transfer buckets" each of which is connected to an individual cell. At the end of the exposure time, the charges collected by each of the CCD cells are simultaneously transferred to the charge transfer buckets, thus preparing the CCD cells for the next exposure sequence. The charge in each bucket is then transferred from bucket to bucket out of the shift register in a sequential or "bucket brigade" fashion during the time the CCD cells are being exposed to the next scan line. The sequentially arranged charges from the CCD cells may then be converted, one-by-one, into a digital signal by a suitable analog-to-digital converter.
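The parallel-to-serial readout described above can be illustrated with a short simulation. This is only a sketch under stated assumptions: the function names `read_out_line` and `quantize`, the choice of index 0 as the output end of the register, the normalized full-scale charge of 1.0, and the 8-bit converter depth are all hypothetical and are not taken from any particular CCD.

```python
def read_out_line(cell_charges):
    """Simulate one readout of a linear CCD through an analog shift register.

    Returns the serial charge stream and the (now empty) cells.
    Assumption: index 0 is the output end of the register.
    """
    # Parallel transfer: every cell hands its charge to its own bucket at once.
    buckets = list(cell_charges)
    # The cells are emptied and are ready for the next exposure.
    cells = [0.0] * len(cell_charges)

    serial = []
    for _ in range(len(buckets)):
        # The charge in the bucket at the output end leaves the register...
        serial.append(buckets[0])
        # ...and every remaining charge moves one bucket toward the output.
        buckets = buckets[1:] + [0.0]
    return serial, cells


def quantize(charge, full_scale=1.0, bits=8):
    """Convert one analog charge to a digital level (hypothetical ideal ADC)."""
    max_level = 2 ** bits - 1
    level = round(charge / full_scale * max_level)
    # Clamp so that out-of-range charges do not overflow the converter.
    return max(0, min(max_level, level))


if __name__ == "__main__":
    stream, emptied = read_out_line([0.1, 0.5, 0.9])
    print(stream)                          # charges in cell order
    print([quantize(c) for c in stream])   # one-by-one conversion
```

In a real device the shifting happens while the cells are already collecting charge for the next scan line; the sketch models only the ordering of the data, not the timing.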
In most optical scanner applications, the individual pixels in the CCD are arranged end to end, thus forming a linear array. Each pixel in the CCD array thus corresponds to a related pixel portion of the illuminated scan line. The individual pixels in the linear photosensor array are generally aligned in the "cross" direction, i.e., perpendicular to the direction of movement of the illuminated scan line across the object (also known as the "scan direction"). Each pixel of the linear photosensor array thus has a length measured in the cross direction and a width measured in the scan direction. In most CCD arrays the length and width of the pixels are equal, typically being about 8 microns in each dimension.
The sampling rate in the cross direction is a function of the number of individual cells in the CCD. For example, a commonly used CCD photosensor array contains a sufficient number of individual cells or pixels to allow a sampling rate in the cross direction of about 600 pixels, or dots, per inch (600 ppi), which is referred to herein as the native sampling rate in the cross direction.
The sampling rate in the scan direction is inversely related to the product of the scan line sweep rate and the CCD exposure time (i.e., the sampling interval). Therefore, the sampling rate in the scan direction may be increased by decreasing the scan line sweep rate, the CCD exposure time, or both. Conversely, the sampling rate in the scan direction may be decreased by increasing the scan line sweep rate, the CCD exposure time, or both. The "minimum sampling rate in the scan direction" for a given exposure time is that sampling rate achieved when scanning at the maximum scan line sweep rate at that exposure time. For example, a maximum scan line sweep rate of about 3.33 inches per second and a maximum exposure time of about 5 milliseconds will result in a minimum sampling rate in the scan direction of about 60 ppi.
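The relationship above can be checked numerically. The following is a minimal sketch, assuming that the sampling rate in the scan direction is simply the reciprocal of the distance the scan line travels during one exposure; the function name `scan_direction_ppi` is hypothetical.

```python
def scan_direction_ppi(sweep_rate_in_per_s, exposure_time_s):
    """Samples per inch in the scan direction.

    One sample is taken per exposure, and the scan line moves
    sweep_rate * exposure_time inches during that exposure.
    """
    inches_per_sample = sweep_rate_in_per_s * exposure_time_s
    return 1.0 / inches_per_sample


if __name__ == "__main__":
    # Figures from the example: 3.33 inches per second, 5 millisecond exposure.
    print(scan_direction_ppi(3.33, 0.005))       # about 60 ppi
    # Halving the sweep rate doubles the sampling rate.
    print(scan_direction_ppi(3.33 / 2, 0.005))   # about 120 ppi
```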
Currently, optical character recognition (OCR) requires sampling rates of 300 ppi for accurate results. A 300 ppi, 4-bit gray scan of an 8.5 × 11 inch page, which is high resolution but low bit depth, is approximately 4.2 Megabytes. Color fidelity requires a 24-bit color scan. A 150 ppi, 24-bit color scan of an 8.5 × 11 inch page, which is low resolution but high bit depth, is approximately 6.3 Megabytes. To scan a document that has both color pictures or drawings and writing requiring OCR, the scan would have to be approximately 300 ppi at 24 bits (8.5 × 11 inches), which corresponds to 25.24 Megabytes of memory. Accordingly, scanning a document that includes both text and pictures requires a large amount of memory. Yet the software on the computer will downsample the color image to approximately 6.3 Megabytes and discard the color data to obtain the text. This process is extremely slow to perform in software and unnecessarily consumes a great deal of memory. An alternative is to scan the text and the graphics in two separate passes and then regenerate the document in software. However, this method is also very time consuming and also uses a large amount of memory.
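The file sizes quoted above follow from simple arithmetic on page size, sampling rate, and bit depth. The sketch below reproduces them, assuming an uncompressed image, the same sampling rate in both directions, and one Megabyte taken as 1,000,000 bytes (the convention that matches the quoted figures); the function name `scan_size_bytes` is hypothetical.

```python
def scan_size_bytes(ppi, bits_per_pixel, width_in=8.5, height_in=11.0):
    """Uncompressed size of a scan, in bytes.

    Assumes the same sampling rate (ppi) in the cross and scan directions.
    """
    pixels_across = round(width_in * ppi)
    pixels_down = round(height_in * ppi)
    total_bits = pixels_across * pixels_down * bits_per_pixel
    return total_bits // 8


if __name__ == "__main__":
    for ppi, bits in [(300, 4), (150, 24), (300, 24)]:
        size = scan_size_bytes(ppi, bits)
        print(f"{ppi} ppi, {bits}-bit: {size} bytes = {size / 1e6:.2f} Megabytes")
```

The 300 ppi, 24-bit scan is six times the size of the 300 ppi, 4-bit scan and four times the size of the 150 ppi, 24-bit scan, which is why sending a single high-resolution color scan to the host is so costly.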
Accordingly, it would be desirable to provide a scanner that is able to scan a document containing both text and graphics while greatly reducing both the total amount of data sent from the scanner to the host computer (which is currently a speed constraint) and the total amount of data stored and processed by the host computer software.