The present invention relates generally to the field of compression of compound images and, in particular, to a method, structure, and program product that compress a compound image using multiple transforms while feeding the transformed data to only one entropy encoder.
Image compression is widely used in hardcopy devices (printers, scanners) to reduce both the required memory and the internal transmission bandwidth. Typical pages handled by those devices are compound pages, i.e., pages that contain photographic (contone) images, graphics, and text. It is an established fact that the best compression performance for compound documents is obtained when different compression methods are applied to different image types. See A. Said and A. Drukarev, "Simplified segmentation for compound image compression," 1999 IEEE International Conference on Image Processing, Kobe, Japan, October 1999. For example, the best way to handle photographic images is to use some form of lossy compression, such as JPEG. On the other hand, lossless compression is most suitable for text and graphics, where it produces both the best image quality and the best compression compared to a lossy method.
The need to use various compression methods for a compound page leads to a solution based on using multiple codecs in a device. For example, one can use a JPEG codec (which works only with 8×8 blocks) and a JPEG-LS codec (which works only with individual lines) in a device pipeline, and use intelligent segmentation to decide which pages, or which parts of a page, should be handled by which codec. This approach yields good compression performance, but increases the complexity of the system, since multiple codecs must be supported. Moreover, because of the significant differences in the underlying compression algorithms between JPEG and JPEG-LS (or other lossless compression methods), it is very difficult to combine them both in the same digitized page, especially in hardware-based implementations. One important reason for this difficulty is that, because the codecs work with different segments (8×8 blocks versus individual lines), the processing that follows the transforms is incompatible. The most appropriate solution in this case would be to apply either a lossy method or a lossless method to the entire page, or to a page stripe, depending on the page type. However, since the same compression would then be used for groups of page lines, this does not solve the problem when text and photos appear side by side.
The operation and compression performance of the JPEG standard are as follows. The typical image path for JPEG compression includes the following steps. The image is read and a color transform is performed; for example, an input image in the RGB color space is converted to the YCbCr space. After the color transform, each color component is compressed independently. In many cases, two of the three color components (the chrominance components, Cb and Cr) are down-sampled by a factor of two or four. Four distinct operations in JPEG compression, performed for each color component, are as follows:
1. DCT (discrete cosine transformation).
2. Quantization.
3. Entropy coding, including zigzag scanning and Huffman, or arithmetic, coding of the quantized coefficients.
4. Bit stream management, where we write the codewords from the entropy coder into the output bit stream following the syntax rules of the JPEG standard.
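Steps 1 and 2 of the list above can be sketched in a few lines. This is a minimal illustration, not the claimed implementation: the `dct2` and `quantize` helper names are assumptions, and the flat quantization table is a placeholder for the perceptually tuned tables a real JPEG codec uses; the zigzag/entropy/bitstream stages (steps 3 and 4) are omitted.

```python
import numpy as np

def dct2(block):
    """2-D DCT-II of an 8x8 block via the orthonormal DCT basis matrix."""
    n = block.shape[0]
    k = np.arange(n)
    # M[u, i] = cos(pi * (2i + 1) * u / (2n)), scaled to be orthonormal
    M = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    M[0, :] *= 1 / np.sqrt(2)
    M *= np.sqrt(2 / n)
    return M @ block @ M.T

def quantize(coeffs, qtable):
    """Step 2: divide each coefficient by its table entry and round."""
    return np.round(coeffs / qtable).astype(int)

# Illustrative flat quantization table; JPEG uses perceptually tuned ones.
q = np.full((8, 8), 16)
block = np.full((8, 8), 128.0)           # a flat gray 8x8 block
coeffs = quantize(dct2(block - 128), q)  # level-shift by 128, as in JPEG
```

For a flat block every coefficient quantizes to zero, which is exactly why DCT coding is so effective on smooth photographic content.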
JPEG compression performs very well on photographic images, achieving visually lossless compression at relatively high compression ratios. Quantization tables, as well as Huffman tables, can be designed to achieve optimal performance for different types of photographic images and different compression ratios.
For non-photographic images, such as graphics or text, the JPEG standard does not produce very good results. DCT artifacts (distortion, blurring, and other effects not present in the original image) are visible even at relatively low compression ratios. The following simple example illustrates the inefficiency of JPEG for compressing text. Consider a bi-level image with the white level mapped to 255 and the black level mapped to zero. A simple mapping from a 24-bits-per-pixel RGB format to a bi-level format offers a compression ratio of 24:1 with no loss in picture quality. It would be a significant challenge for JPEG to achieve visually lossless performance on text at compression ratios of 24:1. The reason is the DCT: DCT coding does not perform well on a block with a lot of high-frequency content or with only a few levels of intensity in the input values.
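The 24:1 figure follows directly from the per-pixel bit budgets; a trivial check (all names are purely illustrative):

```python
# 24 bits per pixel (8 bits for each of R, G, B) versus 1 bit per pixel
bpp_rgb = 8 * 3
bpp_bilevel = 1
ratio = bpp_rgb / bpp_bilevel  # the 24:1 ratio cited above, with no loss
```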
The present invention comprises, in one embodiment, a method for compressing a compound image, comprising the steps of: forming compound image data into a plurality of blocks; obtaining classifying data that designate one of a plurality of classes for each of a plurality of the blocks, based on predominant compression properties of the block; for each of a plurality of the blocks, obtaining transformed data therefor from either a lossy or a lossless transform selected based on the classifying data for the block; and forwarding that transformed data to one entropy encoder.
In a further aspect of the present invention, the step of forming compound image data comprises forming blocks which are compatible with a JPEG standard.
In a further aspect of the present invention, the obtaining classifying data step comprises obtaining classifying data based on information in a page description language associated with that block.
In a further aspect of the present invention, the lossy transform is a DCT.
In a further aspect of the present invention, the obtaining classifying data step comprises classifying the blocks based on the following parameters: the number C of adjacent pixels, in scan line order, whose values differ by an absolute difference greater than a predetermined number; the difference D between the minimum and maximum values of the pixels in the block being classified; and at least one threshold value T1.
In a further aspect of the present invention, there are at least two threshold values, T1 and T2, and: if C is greater than T1, then data from the DCT is used as the transformed data; if C is less than or equal to T1 and D is less than or equal to T2, then data from the DCT is used as the transformed data; and if C is less than or equal to T1 and D is greater than T2, then data from the lossless transform is used as the transformed data.
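The two-threshold rule above can be sketched as follows. The function and argument names are illustrative assumptions, not the claimed implementation: C counts adjacent-pixel pairs in scan-line order whose absolute difference exceeds a limit, and D is the spread between the block's minimum and maximum pixel values.

```python
import numpy as np

def classify_block(block, diff_limit, t1, t2):
    """Choose the lossy (DCT) or lossless (LT) transform per the C/D rule."""
    flat = block.flatten()  # scan-line order
    # C: adjacent pairs whose absolute difference exceeds diff_limit
    c = int(np.sum(np.abs(np.diff(flat.astype(int))) > diff_limit))
    # D: spread between the block's min and max pixel values
    d = int(flat.max()) - int(flat.min())
    if c > t1:
        return "DCT"   # many sharp transitions: busy, photo-like block
    if d <= t2:
        return "DCT"   # low contrast: lossy artifacts are negligible
    return "LT"        # few sharp edges but high contrast: text-like block
```

For example, a block that is half black and half white has few transitions (small C) but full contrast (large D), so it is routed to the lossless transform, while a flat or smoothly varying block goes to the DCT.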
In a further aspect of the present invention, the obtaining transformed data step comprises sending the block data through only the lossless transform or the lossy transform based on the classifying data.
In a further aspect of the present invention, the obtaining transformed data step comprises sending the block data through both the lossless transform and the lossy transform and then selecting the data from one of these transforms as the transformed data based on the classifying data.
In a further aspect of the present invention, an extra symbol is added to an entropy table for the entropy encoder for signaling a change between the lossy transform and the lossless transform.
In a further aspect of the present invention, the forwarding to an entropy encoder step comprises coding a difference between a previous block and a current block as follows: for a previous block that is a DCT block and a current block that is a DCT block, the difference between the DC value of the current block and the DC value of the previous block is coded; for a previous block that is an LT block and a current block that is a DCT block, the difference between the DC value of the current DCT block and the y(0) value of the previous LT block is coded; for a previous block that is a DCT block and a current block that is an LT block, the difference between the y(0) value of the current block and the DC value of the previous block is coded; and for a previous block that is an LT block and a current block that is an LT block, the difference between the y(0) value of the current block and the y(0) value of the previous block is coded.
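In all four cases above, the value coded is the current block's first coefficient (DC for a DCT block, y(0) for an LT block) minus the previous block's first coefficient, so the prediction loop can be sketched uniformly. The function name and the list-of-pairs representation are illustrative assumptions; the zero initial predictor reflects a conventional starting state, not a claim of the invention.

```python
def code_dc_stream(blocks):
    """blocks: list of (kind, first_coeff) pairs, kind in {"DCT", "LT"}.

    first_coeff is the DC value for a DCT block or y(0) for an LT block.
    Every block codes current-first minus previous-first, regardless of
    which kinds the two neighboring blocks are (first block predicted
    from 0, an assumed starting state).
    """
    diffs = []
    prev = 0
    for kind, first in blocks:
        diffs.append(first - prev)
        prev = first
    return diffs
```

This uniformity is what lets a single entropy encoder handle a mixed stream of DCT and LT blocks without per-transform DC prediction logic.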
In a further aspect of the present invention, a first block in an image is a DCT block.
In a further aspect of the present invention, the obtaining transformed data step comprises obtaining transformed data from one of a lossy transform, a first lossless transform, and a second lossless transform, based on the classifying data for the block.
In a further aspect of the present invention, the obtaining classifying data step comprises classifying the blocks based on the following parameters: the number C of adjacent pixels, in scan line order, whose values differ by an absolute difference greater than a predetermined threshold; the difference D between the minimum and maximum values of the pixels in the block being classified; the number N of colors in the block; and at least two threshold values T1 and T2.
In a further aspect of the present invention, in the obtaining classifying data step: if N is less than or equal to two, then data from the second lossless transform is used as the transformed data; else if D is greater than T2 and C is less than or equal to T1, then data from the first lossless transform is used as the transformed data; else data from the DCT is used as the transformed data.
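The three-way rule extends the earlier C/D classification with the color count N. Again the helper name and thresholds are illustrative assumptions, not the claimed implementation.

```python
import numpy as np

def classify_block3(block, diff_limit, t1, t2):
    """Route a block to the DCT, the first lossless transform (LT1),
    or the second, bi-level lossless transform (LT2)."""
    flat = block.flatten()
    n = len(np.unique(flat))  # N: number of distinct colors in the block
    c = int(np.sum(np.abs(np.diff(flat.astype(int))) > diff_limit))
    d = int(flat.max()) - int(flat.min())
    if n <= 2:
        return "LT2"          # at most two levels: bi-level transform
    if d > t2 and c <= t1:
        return "LT1"          # high contrast, few transitions: text-like
    return "DCT"              # otherwise: lossy transform
```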
In a further aspect of the present invention, for the second lossless transform, the intensity values of the pixels in the block are assigned only two levels, V0 and V1; pixels with intensity value V0 are mapped to zero and pixels with value V1 are mapped to one; a bit-wise XOR operation is then applied to the pixels; and "bit packing" is performed to represent each resultant quantity y(i) with just one bit.
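The second lossless transform can be sketched as follows, under two stated assumptions: the block contains exactly two distinct levels, and the XOR is taken between each pixel and its predecessor in scan order (one natural reading of the aspect above). The function name is illustrative.

```python
import numpy as np

def bilevel_transform(block):
    """Map a two-level block to bits, XOR each pixel with its predecessor
    in scan order, and pack the resulting y(i) one bit each."""
    flat = block.flatten()
    v0, v1 = np.unique(flat)  # assumes exactly two intensity levels V0 < V1
    bits = (flat == v1).astype(np.uint8)       # V0 -> 0, V1 -> 1
    # y(i) = bit(i) XOR bit(i-1); y(0) is XORed with an assumed 0 predecessor
    y = np.bitwise_xor(bits, np.concatenate(([0], bits[:-1])))
    return np.packbits(y)                      # "bit packing": 1 bit per y(i)
```

After the XOR, a run of identical pixels becomes a run of zeros with ones only at the transitions, which packs an 8×8 bi-level block into 8 bytes regardless of content.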
In a further aspect of the present invention, two extra symbols are added to an entropy table for the entropy encoder to signal a change among the lossy transform, the first lossless transform, and the second lossless transform.
In a further aspect, the present invention comprises the entropy encoder selecting, for use in encoding, one of a plurality of different Huffman tables, based on whether the block is a lossy block or a lossless block.
In a further embodiment of the present invention, a system is provided for compressing a compound image, comprising: a block former for forming compound image data into a plurality of blocks; a first component for obtaining classifying data that designate one of a plurality of classes for each of a plurality of the blocks, based on predominant compression properties of the block; a second component for obtaining, for each of a plurality of the blocks, transformed data therefor from either a lossy or a lossless transform selected based on the classifying data for the block; and a single entropy encoder for receiving the transformed data.
In a further embodiment of the present invention, a program product is provided including computer readable program code for compressing a compound image, comprising: first code for forming compound image data into a plurality of blocks; second code for obtaining classifying data that designate one of a plurality of classes for each of a plurality of the blocks, based on predominant compression properties of the block; third code for obtaining, for each of a plurality of the blocks, transformed data therefor from either a lossy or a lossless transform selected based on the classifying data for the block; and fourth code for receiving and entropy encoding the transformed data.