This invention relates generally to compression of digital images and in particular to methods of digital image compression that store or transmit compressed image data in multiple quality scales in a progressive manner.
Digital storage and display of high quality color images have become ubiquitous. In order to overcome massive storage requirements and reduce transmission time and cost of high quality digital images, data compression methods have been developed. In particular, the method known as JPEG and the recent update known as JPEG2000 have become industry standards. Data compression generally involves a tradeoff between data size and reconstructed image quality. When reconstructed images differ from the original image, the data compression method is said to be “lossy.”
As is well known, in the basic JPEG method, an image is transformed into a luminance/chrominance color representation conventionally denoted as YUV or YCbCr, where Y is a primary color or luminance component and U and V or Cb and Cr are secondary color components. The number of secondary components stored is reduced by averaging together groups of pixels. The pixel values for each component are grouped into blocks and each block is transformed by a discrete cosine transform (DCT). In each block, the resulting DCT coefficients are quantized, that is, divided by a predetermined quantization coefficient and rounded to integers. The quantized coefficients are encoded based on conditional probability by Huffman or arithmetic coding algorithms known in the art. A normal interchange JPEG file includes the compression parameters, including the quantization tables and encoding tables, in the file headers so that a decompressor program can reverse the process.
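The quantization step described above can be illustrated with a minimal sketch. The function names `quantize` and `dequantize` and the table values are assumptions chosen for illustration; they are not taken from the JPEG standard or from the present method. Rounding is done half-up here, which is one of several possible conventions.

```python
import math

def quantize(coeffs, qtable):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[math.floor(c / q + 0.5) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, qtable)]

def dequantize(levels, qtable):
    """Approximate inverse: multiply each integer level by its table entry."""
    return [[v * q for v, q in zip(vrow, qrow)]
            for vrow, qrow in zip(levels, qtable)]

# Illustrative 2x2 block of transform coefficients and quantization table.
coeffs = [[100.0, -7.0], [3.0, 0.4]]
qtable = [[16, 11], [12, 10]]
levels = quantize(coeffs, qtable)
restored = dequantize(levels, qtable)
```

Small coefficients collapse to zero, which is where most of the size reduction, and the loss, comes from.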
Optional extensions to the minimum JPEG method include a progressive mode intended to support real time transmission of images. In the progressive mode, the DCT coefficients may be sent piecemeal in multiple scans of the image. With each scan, a decoder can produce a higher quality rendition of the image. However, in most implementations, the same number of pixels is used at each level of quality.
Despite the widespread implementation of the JPEG and JPEG2000 methods, each method has its own drawbacks. The major problems in JPEG compression include a moderate compression ratio, a block effect, and poor progressive image quality. A major step used in JPEG to achieve reasonable data compression is to quantize the DCT coefficients. However, light quantization leads to a low compression ratio while heavy quantization leads to block effects in which block boundaries can be seen in reconstructed images. Using the JPEG method, image quality does not degrade gracefully with compression ratio. Therefore, a progressively decoded JPEG image is not pleasing to the viewer until the last scan of the image is decoded.
JPEG2000 is designed to overcome some of the drawbacks of JPEG. JPEG2000 uses a wavelet transform that degrades more gracefully as the compression ratio increases. However, JPEG2000 comes at the price of increased computational complexity. The progression methods employed in JPEG2000 require excessive computational power for both encoding and decoding. While the wavelet transform in JPEG2000 improves quality degradation with respect to compression ratio, it does not intrinsically improve data compaction, so that the compression ratio is about the same as that of JPEG when high quality is required. Further, the context prediction method used for arithmetic coding in JPEG2000 does not take advantage of the fact that the colors of objects in a picture are highly correlated.
Therefore, there remain opportunities to improve existing technologies for image compression. It would be desirable to provide a better transform that has fast implementations and makes data more compact. A more efficient and better quality progression method is also desired. Finally, there is an opportunity to utilize color correlation in context prediction and to provide a compression method for color spaces other than the YUV space.
A method of compressing digital representations of images provides the ability to store the images in multiple subsampling quality scales in a progressive manner such that a higher quality scale contains only data incremental to the data in an adjacent lower quality scale. The method can be implemented in software, in dedicated hardware, or in a combination of software and hardware.
The method is primarily applied to three-color images represented in terms of a primary color component and secondary color components, associated with pixels forming a two-dimensional array. Multiple color spaces, for example, the RGB space or the YUV luminance/chrominance color space can be treated. According to the method, first an image is represented in a sequence of quality scales of progressively decreasing quality. In the sequence, a lower quality scale is formed from a higher quality scale by decreasing the number of stored color components or by decreasing the number of pixels of some or all of the color components.
In one useful scale sequence, for the first, that is the highest, quality scale, all color components are present for each pixel. At the second quality scale, the primary color component and one secondary color component are present for each pixel. At the third quality scale, a primary color component is present at each pixel and twice as many primary color components as secondary color components are present. The sequence also includes fourth, fifth, and sixth quality scales derived from the first, second, and third quality scales, respectively, by reducing the number of pixels by a downscaling process. Downscaling processes such as decimation scaling, bilinear scaling, or bicubic scaling may be used.
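As an illustration of deriving a reduced-pixel quality scale, the following sketch applies decimation scaling to a single color component. The function name `decimate` and the factor of 2 are assumptions; the text above does not fix a scaling factor, and bilinear or bicubic scaling could be substituted.

```python
def decimate(plane, factor=2):
    """Keep every `factor`-th sample in both directions of a 2-D component."""
    return [row[::factor] for row in plane[::factor]]

# A 4x4 component plane with values 0..15 in row-major order.
plane = [[r * 4 + c for c in range(4)] for r in range(4)]
smaller = decimate(plane)
```

Applying the same function to each component present at the first, second, or third scale would give a corresponding lower scale with one quarter of the pixels, under the assumed factor.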
A second useful scale sequence of quality scales includes the first, second, and third scales described above together with a fourth quality scale in which one color component is present at each pixel location and twice as many primary components as secondary components are present. The latter scale is known as the Bayer pattern.
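A Bayer-pattern scale of this kind might be sketched as follows. The sketch assumes an RGB image with green as the primary component and a layout in which green occupies the checkerboard positions, red the remaining positions on even rows, and blue the remaining positions on odd rows. The layout and the name `bayer_mosaic` are illustrative assumptions, not details given in the text.

```python
def bayer_mosaic(rgb):
    """Reduce a 2-D list of (r, g, b) tuples to one component per pixel."""
    mosaic = []
    for r, row in enumerate(rgb):
        out = []
        for c, (red, green, blue) in enumerate(row):
            if (r + c) % 2 == 0:
                out.append(green)   # primary component on the checkerboard
            elif r % 2 == 0:
                out.append(red)     # first secondary component, even rows
            else:
                out.append(blue)    # second secondary component, odd rows
        mosaic.append(out)
    return mosaic

image = [[(1, 2, 3), (4, 5, 6)],
         [(7, 8, 9), (10, 11, 12)]]
mosaic = bayer_mosaic(image)
```

In every 2x2 cell the result holds two primary samples and one sample of each secondary component.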
Each representation at a higher quality scale is represented in terms of a differential with respect to the image at the adjacent lower quality scale. Each differential image contains only data incremental to the corresponding lower quality scale. The differential images are determined from reconstructed images at the adjacent lower quality scale, that is, from the images a decoder will actually recover, which avoids accumulation of error. The original representation is thus transformed into the representation at the lowest quality scale plus the differential images.
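Why computing differentials from reconstructed data avoids error accumulation can be seen in a one-dimensional sketch. All names here are hypothetical, and `lossy` is a stand-in for the full transform, quantize, and dequantize chain; pair averaging and sample repetition are assumed scaling operations chosen only for simplicity.

```python
import math

def downscale(signal):
    """Halve the sample count by averaging adjacent pairs (even length assumed)."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]

def upscale(signal):
    """Double the sample count by repeating each sample."""
    return [v for v in signal for _ in range(2)]

def lossy(signal, step):
    """Stand-in for lossy coding: round each value to a multiple of `step`."""
    return [step * math.floor(v / step + 0.5) for v in signal]

def encode(signal, base_step, diff_step):
    base = lossy(downscale(signal), base_step)   # what the decoder will see
    predicted = upscale(base)                    # prediction from reconstructed base
    diff = lossy([s - p for s, p in zip(signal, predicted)], diff_step)
    return base, diff

def decode(base, diff):
    return [p + d for p, d in zip(upscale(base), diff)]

signal = [10, 14, 30, 33, 52, 50, 71, 75]
base, diff = encode(signal, base_step=8, diff_step=2)
restored = decode(base, diff)
max_error = max(abs(s - r) for s, r in zip(signal, restored))
```

Because the encoder predicts from the same reconstructed base the decoder uses, the coarse quantization of the base does not leak into the final error; only the quantization of the differential does.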
As part of the process of representing the image as differentials, the base quality scale image and the differential images are transformed into a set of coefficients associated with known functions. In typical implementations, the lowest quality scale representation and the differential images are each divided into blocks before the transform stage. In conventional JPEG methods, a discrete cosine transformation is used. According to an aspect of the present invention, a transformation termed the discrete wavelet cosine transformation (DWCT), which combines the frequency transformation features of a discrete cosine transformation with the spatial, multi-resolution features of the Haar wavelet transformation, may be used. The DWCT is defined recursively from a discrete cosine transform and a permutation function whereby output elements of the transform are separated into a first portion and a second portion, the first portion containing lower scales of representation of input to the transform. The DWCT is both faster than conventional wavelet transformations and provides a better compaction of coefficient values than previously used transformations. The DWCT coefficients are quantized by dividing by values specified in quantization tables and rounding to integers.
Quantized coefficients corresponding to the base quality scale and the differential images are compressed by a lossless ordered statistics encoding process. The ordered statistics encoding process includes the stages of context prediction, ordering the two-dimensional array into a one-dimensional array, and arithmetic encoding. According to another aspect of the invention, the process of context prediction, that is, predicting the value of each coefficient from the values of coefficients at neighboring pixels, predicts each color component separately. For the primary color component, the context for a given pixel comprises a positional index and neighboring coefficients of primary color pixels. For a first secondary color component, the context comprises a positional index, coefficients of neighboring first secondary color components, and the coefficient of the corresponding primary color component of the same positional index. For a second secondary color component, the context comprises a positional index, neighboring second secondary color coefficients, and the coefficients of the corresponding primary and first secondary color components of the same positional index. In the present context prediction method, the coefficients are divided into four groups based on position in the array, and the position of neighboring coefficients used for context prediction differs for each group.
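The way the three contexts build on one another can be sketched as follows. The choice of the left and upper coefficients as the neighbors, the row-major positional index, and the names `neighbors` and `contexts` are assumptions for illustration. The four position-dependent groups mentioned above are not modeled, since their neighbor positions are not specified here.

```python
def neighbors(plane, r, c):
    """Left and upper coefficients, with 0 substituted outside the array."""
    left = plane[r][c - 1] if c > 0 else 0
    up = plane[r - 1][c] if r > 0 else 0
    return (left, up)

def contexts(primary, sec1, sec2, r, c):
    """Return the context tuples for the three components at position (r, c)."""
    pos = r * len(primary[0]) + c                     # assumed positional index
    ctx_primary = (pos,) + neighbors(primary, r, c)
    ctx_sec1 = (pos,) + neighbors(sec1, r, c) + (primary[r][c],)
    ctx_sec2 = (pos,) + neighbors(sec2, r, c) + (primary[r][c], sec1[r][c])
    return ctx_primary, ctx_sec1, ctx_sec2

primary = [[5, 6], [7, 8]]
sec1 = [[1, 2], [3, 4]]
sec2 = [[9, 0], [2, 1]]
ctx_p, ctx_s1, ctx_s2 = contexts(primary, sec1, sec2, 1, 1)
```

Each secondary context carries the already-coded coefficients of the other components at the same position, which is how correlation between colors enters the probability model.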
According to yet another aspect of the present invention, an ordering process defined here as the quad-tree ordering method is used to maximize data correlation. In the quad-tree ordering method, the two-dimensional array of coefficients is partitioned into four equally sized regions ordered as upper left, upper right, lower left, and lower right. Each region is repeatedly partitioned into four equally sized subregions ordered as upper left, upper right, lower left, and lower right until a subregion of one pixel by one pixel in size is obtained. Ordering can be done before quantization or context prediction as long as the mapping is preserved for all relevant data such as coefficients, quantization tables, and contexts. The context-predicted, ordered coefficient values are then encoded using a lossless encoding method, for example an arithmetic encoding method.
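The quad-tree ordering described above lends itself to a short recursive sketch. It assumes a square array whose side is a power of two; the names `quadtree_order` and `flatten` are hypothetical.

```python
def quadtree_order(size, row=0, col=0):
    """List (row, col) positions of a size x size region in quad-tree order."""
    if size == 1:
        return [(row, col)]
    half = size // 2
    return (quadtree_order(half, row, col)                   # upper left
            + quadtree_order(half, row, col + half)          # upper right
            + quadtree_order(half, row + half, col)          # lower left
            + quadtree_order(half, row + half, col + half))  # lower right

def flatten(block):
    """Map a square 2-D array to a 1-D list using quad-tree ordering."""
    return [block[r][c] for r, c in quadtree_order(len(block))]

# A 4x4 block holding 0..15 in row-major order.
block = [[r * 4 + c for c in range(4)] for r in range(4)]
ordered = flatten(block)
```

The same position list can be applied to quantization tables and contexts, so the mapping stays consistent across all data associated with a block.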
The present compression process produces a bitstream that can be efficiently stored or transmitted over a computer network. A decompression process essentially reverses the process and thus enables the image to be reconstructed. An image compressed according to the present process can be progressively viewed or downloaded by transmission over a computer network or the Internet. Further, a browser can display an image at a specified quality scale, ignoring any data corresponding to quality scales higher than the specified scale.