Documents containing both text and pictures, known as compound documents, are becoming more prevalent. Previously, documents often consisted exclusively of text or exclusively of pictures (i.e., halftones). Pictures, as used herein, refer to photographs, naturalistic artwork, and graphical material. Text includes lettering, certain line drawings, and certain patterns. In order to represent compound documents electronically, it is desirable to have the ability to compress the image data corresponding to the document. Compression saves on storage space and allows the data to be more quickly transmitted, whether the purpose is photocopying a document, sending image data to a printer, or saving and sending image data via e-mail or facsimile.
Many different compression algorithms exist, some standard and some proprietary. In general, certain compression algorithms are better suited to text while other compression algorithms are better suited to pictures.
JPEG (Joint Photographic Experts Group) is the name of a committee and the name of the international standard adopted by that committee which applies to the compression of graphic images (pictures). The JPEG standard is one of the most popular and comprehensive continuous tone, still frame compression standards. JPEG defines three different coding systems: (1) a lossy baseline coding system, which is based on a discrete cosine transform (DCT); (2) an extended coding system for greater compression and progressive reconstruction applications; and (3) a lossless independent coding scheme for reversible compression. In order to be JPEG compliant, a product or system must include support for the lossy baseline coding system.
Lossy image compression refers to a technique wherein the compressed data cannot be decompressed into an exact copy of the original image, i.e., there is a loss of quality in the final image. An important goal in lossy image compression is to achieve maximum compression while still obtaining high image quality in the decompressed image. In general, pictures or halftones tolerate a greater amount of compression than text while still yielding acceptable quality in the decompressed image. Excessive compression of text often introduces unacceptable artifacts into the decompressed image.
In the JPEG lossy baseline system, compression is performed in three sequential steps: DCT computation, coefficient quantization, and finally lossless compression.
The image is first divided into non-overlapping blocks of size 8 by 8 pixels, which are processed in an order from left to right, top to bottom. After a normalization step, a two-dimensional DCT is applied to each block. This transform, similar to a Fourier transform, produces a transformed block (matrix) in the frequency domain. The first coefficient (location 0,0) in the transformed block is a constant that represents the average or DC component of the 64 image elements (pixels) included in each image block. The remaining coefficients describe higher frequencies found in the block.
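The block division and transform described above can be sketched as follows. This is a minimal, unoptimized illustration and not a production encoder. It assumes 8-bit samples (so the normalization step is a level shift by 128), image dimensions that are multiples of 8, and the usual JPEG scaling of the DCT; the function names `split_into_blocks` and `dct_2d` are hypothetical.

```python
import math

N = 8  # JPEG baseline block size


def split_into_blocks(image):
    """Yield (row, col, block) for each non-overlapping 8x8 block,
    scanning left to right, then top to bottom.
    Assumes both image dimensions are multiples of 8."""
    height, width = len(image), len(image[0])
    for r in range(0, height, N):
        for c in range(0, width, N):
            yield r, c, [row[c:c + N] for row in image[r:r + N]]


def dct_2d(block):
    """Forward two-dimensional DCT of one 8x8 block of 8-bit samples."""
    # Normalization step: shift samples from [0, 255] to [-128, 127].
    shifted = [[p - 128 for p in row] for row in block]

    def alpha(k):
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (shifted[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * alpha(u) * alpha(v) * total
    return out
```

With this scaling, the coefficient at location (0,0) equals eight times the mean of the level-shifted samples, which is why it is described as the average or DC component of the block. A uniform block therefore produces a single nonzero coefficient at (0,0).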
The DCT coefficients are then quantized using a defined quantization table and reordered using a zigzag pattern to form a one-dimensional sequence of quantized coefficients. Lossless entropy coding, such as Huffman coding, is then applied to the resulting sequence to produce the compressed data.
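The quantization and zigzag reordering steps can be illustrated with the short sketch below. The names `zigzag_order` and `quantize_and_scan` are hypothetical, the quantization table is supplied by the caller, and the entropy coding stage (e.g., Huffman coding) is intentionally omitted.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n-by-n block in zigzag order,
    starting (0,0), (0,1), (1,0), (2,0), (1,1), (0,2), ..."""
    order = []
    for s in range(2 * n - 1):  # s = row + col indexes one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)  # even diagonals run bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order


def quantize_and_scan(coeffs, table):
    """Divide each DCT coefficient by its table entry, round to the nearest
    integer (Python's round resolves exact ties to even), and return the
    64 results as a one-dimensional zigzag sequence."""
    return [round(coeffs[r][c] / table[r][c]) for r, c in zigzag_order(len(coeffs))]
```

Because the table entries for high frequencies are typically large, most high-frequency coefficients quantize to zero. The zigzag scan places those zeros at the end of the sequence, where the subsequent lossless stage can encode them compactly.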
Although a number of settings can be predefined to achieve different compression ratios, a single parameter, called the quality factor, is commonly adjusted in JPEG compression. The quality factor is a single number on an arbitrary, relative scale and is often adjusted on an image-by-image basis. A higher quality factor provides a relatively high quality decompressed image, but requires a relatively large file (less compression). A lower quality factor provides greater compression with a correspondingly smaller file size, but may produce more visible defects or artifacts in the decompressed image. Generally, pictures can be compressed to a greater degree than text while still maintaining acceptable decompressed image quality.
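As an illustration of how a quality factor can control compression, the sketch below scales a base quantization table using the convention of the Independent JPEG Group (IJG) libjpeg software, in which the quality factor runs from 1 to 100. This mapping is one widely used convention and is not mandated by the JPEG standard; other encoders may define the scale differently. The base table is the example luminance table given in Annex K of the standard, and the name `scale_quant_table` is hypothetical.

```python
# Example luminance quantization table from Annex K of the JPEG standard.
BASE_LUMINANCE_TABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def scale_quant_table(base, quality):
    """Scale a base table by a 1..100 quality factor (IJG-style mapping).
    Lower quality gives larger divisors and therefore coarser quantization."""
    quality = max(1, min(100, quality))
    # Convert the quality factor to a percentage scaling of the base table.
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    # Clamp each entry to 1..255 so it fits in an 8-bit baseline table.
    return [[max(1, min(255, (q * scale + 50) // 100)) for q in row]
            for row in base]
```

Under this mapping, a quality factor of 50 leaves the base table unchanged, higher values shrink the divisors (less loss, larger file), and lower values enlarge them (more loss, smaller file).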
U.S. Pat. No. 6,314,208 describes an image compression system that can be used to apply different quantization factors to picture blocks and text blocks to provide significant image compression. The quantization factors are selected by examining the DCT coefficients in the transformed block and estimating metrics that would indicate the presence of text versus pictures.
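The general idea of selecting a quantization factor per block from its DCT coefficients can be sketched as follows. This is a hypothetical illustration only and is not the method of U.S. Pat. No. 6,314,208, whose actual metrics are not reproduced here. The metric (sum of absolute AC coefficients), the threshold, and the two scale values are all assumptions chosen for the example, as are the names `ac_activity` and `choose_quant_scale`.

```python
def ac_activity(coeffs):
    """Hypothetical metric: sum of absolute values of all coefficients
    except the DC term at location (0,0)."""
    total = sum(abs(c) for row in coeffs for c in row)
    return total - abs(coeffs[0][0])


def choose_quant_scale(coeffs, threshold=400.0, text_scale=0.5, picture_scale=2.0):
    """Return a multiplier for the quantization table of one block.
    Blocks with high AC activity (sharp edges, as in text) get a smaller
    multiplier (finer quantization); smoother blocks get a larger one.
    The threshold and scale values are illustrative assumptions."""
    if ac_activity(coeffs) > threshold:
        return text_scale
    return picture_scale
```

A practical classifier would likely combine several metrics, since sharp edges also occur in pictures; the single threshold here serves only to show where such a decision would sit in the pipeline, between the DCT and the quantization step.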