Digital image compression methods are widely used to compress many types of documents. For example, digital image compression is often used in facsimile machines. Future applications of digital image compression may include digital copiers and document image archiving systems. Some documents include both sharp regions containing text and smooth regions containing continuous-tone images. These documents are hereinafter referred to as document images. The image compression methods of the prior art attempt to provide high quality and a high compression ratio for both the sharp text regions and the smooth continuous-tone regions of a document image.
Two well-known types of prior art image compression methods are lossless binary image compression methods and lossy continuous-tone compression methods. An example of a lossless binary image compression method is the JBIG standard. An example of a lossy continuous-tone compression method is the baseline Joint Photographic Experts Group (JPEG) standard. However, neither of these methods, when used alone, provides both high quality and a high compression ratio for both text and continuous-tone regions. Specifically, binary image compression methods alone, such as those used for facsimile, cannot describe continuous-tone regions. On the other hand, transform-based continuous-tone compressors typically use lossy compression methods that have difficulty representing the high frequency edges found in text, line-art and graphics. For instance, the JPEG continuous-tone still image compression standard uses a discrete cosine transform (DCT) to achieve energy compaction. The high contrast edges present in text are not modeled well by cosine-based functions. When JPEG is used with quantization, continuous-tone image regions are compressed at a high compression ratio. At the same time, however, undesirable noise, sometimes called artifacts, is created in text regions. This noise is high frequency ringing and is sometimes referred to as "mosquito noise". Mosquito noise is caused by excessive quantization of the high spatial frequency content of the edges in the text. It is desirable to be able to compress (and decompress) documents having text and continuous-tone images such that the mosquito noise is eliminated or reduced.
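The ringing described above can be illustrated with a minimal one-dimensional sketch. It is not the JPEG algorithm itself: it applies an orthonormal 8-point DCT to a single sharp edge, quantizes the coefficients with a hypothetical step table that grows with frequency (the values are an assumption chosen for illustration, not a JPEG quantization table), and inverts the transform. The names `dct_matrix`, `quantize_roundtrip`, `edge` and `step` are likewise illustrative.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis as an n x n matrix (one basis function per row)."""
    k = np.arange(n).reshape(-1, 1)   # frequency index
    i = np.arange(n).reshape(1, -1)   # sample index
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)        # DC row uses a different scale factor
    return m

def quantize_roundtrip(block, step):
    """Forward DCT, uniform quantization per coefficient, inverse DCT."""
    d = dct_matrix(len(block))
    coeffs = d @ block
    quantized = np.round(coeffs / step) * step   # the lossy step
    return d.T @ quantized                       # inverse of an orthonormal transform

# A sharp black-to-white edge, as found at the boundary of a text stroke.
edge = np.array([0, 0, 0, 0, 255, 255, 255, 255], dtype=float)

# Hypothetical quantizer steps: coarser for higher spatial frequencies.
step = np.array([8, 16, 32, 64, 128, 256, 512, 1024], dtype=float)

recon = quantize_roundtrip(edge, step)
ripple = np.abs(recon - edge).max()

print(np.round(recon, 1))        # the flat regions on either side of the edge now ripple
print("largest error:", ripple)
```

Because the edge spreads its energy across the odd-frequency coefficients, coarse quantization of those coefficients disturbs every sample in the block, including the samples that were perfectly flat in the original.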
A prior art technique for achieving good performance when compressing document images is referred to as post-processing. Post-processing is used to reduce noise created by quantization in a continuous-tone compressor. For instance, JPEG noise in text regions can be reduced by post-processing after decompression. Note that in the prior art, linear, non-linear and/or adaptive filtering, as well as model-based approaches, are used in post-processing. The primary disadvantage of post-processing is that, since the correct image is not available to the post-processing stage, that stage must estimate what is noise and what is the proper image. Poor estimates result in poor image quality. In some prior art applications, such as document archiving, an image is compressed once but decompressed many times, so that the cost of post-processing is incurred on every decompression. Due to these disadvantages, it is desirable to provide for compression of a document without having to perform post-processing on the image data during decompression.
Another prior art approach to achieving good performance when compressing document images is referred to as pre-processing. In pre-processing, the lossy compressor only processes data that it handles well, while other compression methods handle the remaining data. For instance, pre-processing can be used to separate the information in the image so that one type of compressor compresses one type of information and another compressor compresses the remaining information. One prior art method of pre-processing is referred to as segmentation, wherein the image is segmented into text regions and image regions. Explicit segmentation is described in N. Kuwata, Y. Murayama, S. Ilno, "A New Bi-Level Quantizing Method For Document Images," IEEE Transactions on Consumer Electronics, Vol. 38, No. 3, (August 1992). However, segmentation is difficult to do quickly and accurately. For some documents, spatial segmentation is reasonable, because text regions consist of fully saturated background and foreground colors and images are spatially separated from text. However, other documents have images as a background for text, or text that does not use a single color for the foreground. Thus, some documents contain regions that are impossible to segment properly. In cases where spatial segmentation is difficult or impossible, good compression cannot be obtained with methods requiring segmentation. A scheme with segmentation must either tolerate mistakes due to incorrect segmentation or compress the entire image with multiple techniques so that segmentation mistakes are user-correctable. Because of these limitations, it is desirable to perform compression (and decompression) on document images using a scheme that avoids having to segment the document images into text and continuous-tone regions.
The present invention uses a multistage approach that combines two compression schemes without explicit spatial segmentation to provide compression of document images. First, in the present invention, an image that can be compressed well by a first compression scheme is created. For document images, the present invention produces an approximation of the image that describes the binary information in the image. The binary information (i.e., the approximation) can be compressed with a binary or model-based compressor. Next, the image representing the binary information is subtracted from the original image to create a difference image. The resulting difference image contains little of the high frequency edge information that was present in the original image. The difference image is compressed by a continuous-tone image compressor, such as JPEG. Correspondingly, in the present invention, decompression comprises summing the decompressed binary information and the decompressed difference image to form the original document image.
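The multistage flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the binary approximation is produced by a hypothetical global threshold, bit-packing plus `zlib` stands in for a binary compressor such as JBIG, and uniform quantization plus `zlib` stands in for a lossy continuous-tone compressor such as JPEG. All function names, the threshold, the two output levels and the quantizer step `q` are assumptions introduced for the example.

```python
import zlib
import numpy as np

def binary_approximation(image, threshold=128, low=0, high=255):
    """Two-level approximation of the image (hypothetical global threshold)."""
    bits = image >= threshold
    approx = np.where(bits, high, low).astype(np.int16)
    return bits, approx

def compress(image, q=8):
    """Return (binary stream, difference stream, shape) for an 8-bit grayscale image."""
    bits, approx = binary_approximation(image)
    # Stand-in for a binary compressor: pack to 1 bit per pixel, then deflate (lossless).
    binary_stream = zlib.compress(np.packbits(bits).tobytes())
    # Difference image: original minus the binary approximation.
    diff = image.astype(np.int16) - approx
    # Stand-in for a lossy continuous-tone compressor: uniform quantization, then deflate.
    diff_q = np.round(diff / q).astype(np.int8)
    diff_stream = zlib.compress(diff_q.tobytes())
    return binary_stream, diff_stream, image.shape

def decompress(binary_stream, diff_stream, shape, q=8, low=0, high=255):
    """Sum the decompressed binary approximation and the decompressed difference image."""
    n = shape[0] * shape[1]
    packed = np.frombuffer(zlib.decompress(binary_stream), dtype=np.uint8)
    bits = np.unpackbits(packed)[:n].reshape(shape)
    approx = np.where(bits == 1, high, low).astype(np.int16)
    diff_q = np.frombuffer(zlib.decompress(diff_stream), dtype=np.int8)
    diff = diff_q.astype(np.int16).reshape(shape) * q
    return np.clip(approx + diff, 0, 255).astype(np.uint8)

# Synthetic "document image": a smooth light gradient with one dark text-like stroke.
page = np.tile(np.linspace(180, 250, 32), (16, 1)).astype(np.uint8)
page[4:12, 8:12] = 10

streams = compress(page)
recon = decompress(*streams)
max_error = np.abs(recon.astype(int) - page.astype(int)).max()
print("largest reconstruction error:", max_error)
```

In this sketch the sharp stroke is carried entirely by the losslessly coded binary approximation, so the lossy stage only sees the small residual left after subtraction. With the uniform quantizer used here, the reconstruction error is bounded by half the step `q`; a real transform coder would behave differently, but the division of labor between the two stages is the same.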