1. Field of the Invention
The present invention generally relates to encoding and decoding of digital image data to and from a compressed form while applying corrections to enhance image quality and, more particularly, to the encoding and decoding of documents for extreme data compression to allow economically acceptable long-term storage in rapid access memory while adaptively determining an optimal correction to be applied to enhance image quality.
2. Description of the Prior Art
Pictorial and graphics images contain extremely large amounts of data and, if digitized to allow transmission or processing by digital data processors, often require many millions of bytes to represent respective pixels of the image or graphics with good fidelity. The purpose of image compression is to represent images with less data in order to save storage costs or transmission time and costs. The most effective compression is achieved by approximating the original image, rather than reproducing it exactly. The JPEG (Joint Photographic Experts Group) standard, discussed in detail in “JPEG Still Image Data Compression Standard” by Pennebaker and Mitchell, published by Van Nostrand Reinhold, 1993, which is hereby fully incorporated by reference, allows the interchange of images between diverse applications and opens up the capability to provide digital continuous-tone color images in multi-media applications.
JPEG is primarily concerned with images that have two spatial dimensions, contain gray scale or color information, and possess no temporal dependence, as distinguished from the MPEG (Moving Picture Experts Group) standard. JPEG compression can reduce the storage requirements by more than an order of magnitude and improve system response time in the process. A primary goal of the JPEG standard is to provide the maximum image fidelity for a given volume of data and/or available transmission or processing time, and any arbitrary degree of data compression is accommodated. It is often the case that data compression by a factor of twenty or more (and reduction of transmission time and storage size by a comparable factor) will not produce artifacts or image degradation which are noticeable to the average viewer.
Of course, other data compression techniques are possible and may produce greater degrees of image compression for certain classes of images or graphics having certain known characteristics. The JPEG standard has been fully generalized to perform substantially equally regardless of image content and to accommodate a wide variety of data compression demands. Therefore, encoders and decoders employing the JPEG standard in one or more of several versions have come into relatively widespread use and allow wide access to images for a wide variety of purposes. Standardization has also allowed reduction of costs, particularly of decoders, to permit high quality image access to be widely available. Therefore, utilization of the JPEG standard is generally preferable to other data compression techniques even though some marginal increase of efficiency might be obtained thereby, especially for particular and well-defined classes of images.
Even though such large reductions in data volume are possible, particularly using techniques in accordance with the JPEG standard, some applications require severe trade-offs between image quality and costs of data storage or transmission time. For example, there may be a need to store an image for a period of time which is a significant fraction of the useful lifetime of the storage medium or device as well as requiring a significant amount of its storage capacity. Therefore, the cost of storing an image for a given period of time can be considered as a fraction of the cost of the storage medium or device and supporting data processor installation, notwithstanding the fact that the image data could potentially be overwritten an arbitrarily large number of times. The cost of such storage is, of course, multiplied by the number of images which must be stored.
Another way to determine the storage cost versus image quality trade-off is to determine the maximum cost in storage that is acceptable and then determine, for a given amount of quality, how long the desired number of images can be saved in the available storage. This is a function of the compressed size of the images which generally relates directly to the complexity of the images and inversely with the desired reconstruction quality.
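The relationship described above can be illustrated with a short arithmetic sketch. The function name `retention_days` and all figures (storage capacity, daily image volume, compressed image sizes) are hypothetical values chosen only for illustration and are not taken from any actual installation.

```python
def retention_days(capacity_bytes, images_per_day, avg_compressed_bytes):
    """Days of images that fit in a fixed storage budget (illustrative only)."""
    bytes_per_day = images_per_day * avg_compressed_bytes
    return capacity_bytes / bytes_per_day

# Assumed figures: 10 TB of rapid-access storage, 10 million images per day.
CAPACITY = 10 * 10**12
IMAGES_PER_DAY = 10_000_000

# Higher reconstruction quality implies a larger compressed image.
days_at_20kb = retention_days(CAPACITY, IMAGES_PER_DAY, 20_000)
days_at_5kb = retention_days(CAPACITY, IMAGES_PER_DAY, 5_000)

print(days_at_20kb, days_at_5kb)
```

Under these assumed figures, reducing the average compressed size by a factor of four extends the retention period by the same factor for a fixed storage cost.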
An example of such a demanding application is the storage of legal documents which must be stored for an extended period of time, if not archivally, especially negotiable instruments such as personal checks which are generated in large numbers amounting to tens of millions daily. While the initial clearing of personal checks and transfer of funds is currently performed using automated equipment and is facilitated by the use of machine readable indicia printed on the check, errors remain possible and it may be necessary to document a particular transaction for correction of an error long after the transaction of which the check formed a part.
As a practical matter, the needed quality of the image data also changes over time in such an application. For example, within a few months of the date of the document or its processing, questions of authenticity often arise, requiring image quality sufficient to, for example, authenticate a signature, while at a much later date, it may only be necessary for the image quality to be sufficient to confirm basic information about the content of the document. Therefore, the image data may be additionally compressed for longer term storage when reduced image quality becomes more tolerable, particularly in comparison with the costs of storage. At the present time, personal check images are immediately stored for archival purposes on write-once CD-ROM or other non-modifiable media and saved, for legal reasons, for seven years. The same data is available for only a few months in on-line, rapid-access storage.
Personal checks, in particular, present some image data compression complexities. For example, to guard against fraudulent transactions, a background pattern of greater or lesser complexity and having a range of image values is invariably provided. Some information will be printed in a highly contrasting ink, possibly of multiple colors, while other security information will be included at relatively low contrast. Decorations including a wide range of image values may be included. Additionally, hand-written or printed indicia (e.g. check amounts and signature) will be provided with image values which are not readily predictable.
Even much simpler documents may include a variety of image values such as color and shadings in letterhead, high contrast print, a watermark on the paper and a plurality of signatures. This range of image values that may be included in a document may limit the degree to which image data may be compressed when accurate image reconstruction is necessary. Therefore, the cost of storage in a form from which image reconstruction is possible with high fidelity to the original document is relatively large, and such costs limit the period for which such storage is economically feasible, regardless of the desirability of maintaining such storage and the possibility of rapid electronic access for longer periods.
Since such image values must be accurately reproducible and utilization of the JPEG standard is desirable in order to accommodate widespread access and system intercompatibility, substantially the only technique for further reduction of data volume consistent with reproduction with good image fidelity is to reduce the spatial frequency of sampling of the original image. However, reduced sampling inevitably reduces legibility of small indicia, especially at low contrast. Currently, sampling at 100 dots per inch (dpi) or pixels per inch (a reduction to about one-third or one-sixth of the 300 dpi or 600 dpi resolutions, respectively, of printers currently in common use) is considered to be the limit for adequate legibility of low-contrast indicia on personal checks. The American National Standards Institute (ANSI) standards committee for image interchange recommends 100 dpi as a minimum resolution. Most check applications use either 100 dpi or 120 dpi grayscale images.
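The effect of sampling resolution on data volume before compression can be sketched as follows. The document dimensions of 6 inches by 2.75 inches are an assumed size for a personal check used only for illustration, and `pixel_count` is a hypothetical helper; one byte per grayscale pixel is likewise assumed.

```python
def pixel_count(width_in, height_in, dpi):
    """Number of pixels produced by sampling a document at a given resolution."""
    return round(width_in * dpi) * round(height_in * dpi)

# Assumed document size in inches (illustrative only).
WIDTH_IN, HEIGHT_IN = 6.0, 2.75

counts = {dpi: pixel_count(WIDTH_IN, HEIGHT_IN, dpi) for dpi in (100, 120, 300, 600)}

for dpi, n in counts.items():
    # With one byte per grayscale pixel, n is also the uncompressed byte count.
    print(f"{dpi} dpi: {n} pixels")
```

Because the pixel count scales with the square of the linear resolution, sampling at 100 dpi rather than 300 dpi reduces the uncompressed data by a factor of nine, not three, which is why reduced sampling is so effective at lowering data volume and also so costly in legibility.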
Another complicating factor in this process is the accuracy of data capture when the document is originally scanned. The invention disclosed in the above-incorporated U.S. Patent application reduces the dynamic range of the data prior to encoding and compression using a first quantization table (referred to hereinafter as Q-table1) and restores the dynamic range of the image by use of a different, second quantization table (referred to hereinafter as Q-table2) for decompression. This technique allows extreme compression since the image values in a document, while arbitrary, will be relatively reduced in number (as compared with, for example, a photograph) and quantization can be performed so as to distinguish between such image values even when they are compressed in dynamic range. Entropy encoding uses fewer bits to encode relatively more common image values, and relatively greater numbers of each of relatively fewer image values can thus yield extreme compression of the image data. Since the original dynamic range is known (and, for example, brightness range is specified by ANSI for personal checks, although not always followed in practice), the second quantization table used to restore the dynamic range can be derived analytically and refined empirically to yield exceptional performance as long as the original dynamic range is accurately captured.
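The general idea of quantizing with one table and dequantizing with a different table can be illustrated by the minimal sketch below. It is not the method of the incorporated application: it assumes, purely for illustration, that the dynamic range was reduced by a uniform factor `RANGE_SCALE` and that Q-table2 is simply Q-table1 divided by that factor. The coefficient values, the four-entry tables and the helper names are hypothetical, and the DC level shift is ignored.

```python
def quantize(coeffs, qtable):
    """Divide each coefficient by its table entry and round to an integer level."""
    return [round(c / q) for c, q in zip(coeffs, qtable)]

def dequantize(levels, qtable):
    """Multiply each integer level by its table entry."""
    return [level * q for level, q in zip(levels, qtable)]

RANGE_SCALE = 0.5  # assumption: dynamic range halved before encoding

# Made-up transform coefficients for a full-range block (illustrative only).
original = [240.0, -96.0, 48.0, 16.0]
reduced = [c * RANGE_SCALE for c in original]

q_table1 = [8, 8, 8, 8]                          # used by the encoder
q_table2 = [q / RANGE_SCALE for q in q_table1]   # assumed decoder table

levels = quantize(reduced, q_table1)
restored = dequantize(levels, q_table2)

print(levels)
print(restored)
```

In this sketch the decoder needs no extra processing step: substituting the second table into an otherwise standard dequantization expands the reduced-range values back toward the original range. A uniform scale factor is only the simplest possible relationship between the two tables; as noted above, the actual second table would be derived analytically and refined empirically.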
In practice, however, while a freshly calibrated and well-maintained scanner will perform well to capture the full dynamic range of a document, performance will begin to degrade immediately during use. Specifically, with use, both brightness and contrast will be reduced in the captured image data, predominantly due to two causes: 1) the original document is relatively dark or of reduced brightness such as may be due to paper quality, coloring or discoloring due to age or environmental damage, and/or 2) the scanner is not performing properly, having drifted out of calibration or suffering from reduced illumination levels, accumulation of dirt, dust or other contaminants, reduced video gain and the like, some of which can even be spectrally selective. All of these effects will tend to reduce average brightness and, hence, dynamic range and contrast. Therefore, the original brightness, dynamic range and contrast of image values are not, in fact, known and the optimum table of dequantization values cannot be known or developed a priori for a given combination of document and scanner. Moreover, it is desirable to restore or enhance the decoded image to, for example, meet a given established standard for visual values which may not, in fact, be met by the original document, consistent with an extreme level of data compression to allow long-term storage at economically acceptable cost. Further, it is also desirable in some applications, particularly those involving inspection of documents, that the correction and/or enhancement be performed during encoding and compression so that standard decoding and decompression processes and apparatus will result in a suitably corrected and enhanced image, both for convenience and economy and to avoid or prevent image modification during decoding and presentation.