This invention relates generally to the field of image compression and in particular to an improved wavelet coefficient ordering mechanism.
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever. The following notice applies to the software and data as described below and in the drawing hereto: Copyright © 1998, Microsoft Corporation, All Rights Reserved.
Digital pictures are used in many applications, such as Web pages, CD-ROM encyclopedias, digital cameras, and others. In most cases it is necessary to compress the pictures so that they fit into a small amount of storage or can be downloaded in a short amount of time. For example, in a typical digital camera, pictures are taken at a resolution of 1024×768 picture elements (pixels), with a resolution of 12 to 24 bits per pixel. The raw data in each image is therefore around 1.2 to 2.5 megabytes. In order to fit several pictures on a computer diskette, for example, it is necessary to reduce the amount of data used by each picture. The larger the compression ratio that is achieved, the more pictures will fit onto a diskette or memory card and the faster they can be transferred via a bandwidth-limited transmission medium such as telephone lines.
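The raw-size figures in the camera example above can be checked with a few lines of arithmetic. The sketch below is illustrative only; the function name `raw_image_bytes` is an assumption introduced here, and it counts pixel data alone with no header or padding.

```python
def raw_image_bytes(width, height, bits_per_pixel):
    """Uncompressed pixel data size in bytes (no header, no padding)."""
    return width * height * bits_per_pixel // 8


# The 1024x768 camera example at 12 and 24 bits per pixel.
low = raw_image_bytes(1024, 768, 12)
high = raw_image_bytes(1024, 768, 24)

print(low / 1e6, high / 1e6)  # sizes in millions of bytes
```

At 12 bits per pixel this gives about 1.2 million bytes and at 24 bits per pixel about 2.4 million bytes, in line with the approximate range quoted above.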
Image compression has been extensively studied over the past twenty years. The JPEG standard, developed by the JPEG (Joint Photographic Experts Group) committee of ISO (International Organization for Standardization), was defined in 1992 and is the most popular method of compressing digital pictures. In JPEG, small square blocks of pixels (of dimensions 8×8) are mapped into the frequency domain by means of a discrete cosine transform (DCT). The DCT coefficients are quantized (divided by a scale factor and rounded to the nearest integer) and mapped to a one-dimensional vector via a fixed zigzag scan pattern. That vector is encoded via a combination of run-length and Huffman encoding.
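The quantization and zigzag steps described above can be sketched as follows. This is a minimal illustration, not the JPEG reference implementation: it assumes a single scale factor for the whole block (JPEG uses a per-coefficient quantization table), and the function names `quantize`, `zigzag_indices`, and `zigzag_scan` are hypothetical. The DCT and the run-length and Huffman stages are omitted.

```python
import math


def quantize(block, scale):
    """Divide each coefficient by a scale factor and round to the nearest integer.

    Assumption: one scalar scale factor for the whole block; ties round up.
    """
    return [[math.floor(c / scale + 0.5) for c in row] for row in block]


def zigzag_indices(n):
    """Return (row, col) pairs of an n x n block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col indexes one anti-diagonal
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            diag.reverse()  # even diagonals run from bottom-left to top-right
        order.extend(diag)
    return order


def zigzag_scan(block):
    """Map a square block to a one-dimensional vector via the zigzag pattern."""
    return [block[i][j] for i, j in zigzag_indices(len(block))]


# Tiny 2x2 example; JPEG itself operates on 8x8 blocks.
quantized = quantize([[10, -7], [3, 0]], 4)
vector = zigzag_scan(quantized)
```

Because the scan pattern is fixed, the decoder can rebuild the block from the vector without any side information.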
The independent processing of small 8×8 blocks in JPEG is an advantage from an implementation viewpoint, especially in low-cost hardware. However, it also leads to the main problem with JPEG: blocking artifacts. Because the quantization errors from adjacent blocks are uncorrelated among blocks but correlated within the blocks, the boundaries of the 8×8 blocks become visible in the reconstructed image due to the potential difference in encoding between adjacent blocks. Such artifacts are referred to as tiling or blocking artifacts, which can be reduced (but not completely eliminated) by using transforms with overlapping basis functions.
An efficient way to remove the blocking artifacts is to replace the block DCT by a wavelet decomposition, which provides an efficient time-frequency representation. Very good compression performance can be obtained by quantizing and encoding wavelet coefficients.
Many wavelet-based image compression systems have been reported in the technical literature in the past few years. With wavelets it is possible to achieve compression ratios that typically range from 20% to 50% better than JPEG. More importantly, wavelet transforms lead to pictures that do not have the disturbing blocking artifacts of JPEG. Therefore, wavelet-based transforms are becoming increasingly popular. In fact, in the next revision of JPEG, named JPEG2000, all proposals under consideration use wavelets.
Some prior wavelet transforms decompose images into coefficients corresponding to 16 subbands. This results in a four by four matrix of subbands, referred to as a big block format, representing spectral decomposition and ordering of channels. The letters L and H are used to identify low pass filtering and high pass filtering respectively for each subband. The first subband comprises LL and HL coefficients, where the first letter in each set corresponds to horizontal filtering and the second corresponds to vertical filtering. Two stages are used in each subband filtering combination. The ordering corresponds to frequencies increasing from left to right and from bottom to top. This ordering is fixed to allow both encoding and decoding to function in a fixed manner. Quantization of the coefficients is then performed, followed by some form of compressive encoding of the coefficients, including adaptive Huffman encoding or arithmetic encoding to further compress the image. These forms of encoding can be quite complex, including zero tree structures, which depend on the data types. Many of these encoders need to be modified for different images to be compressed, making them difficult to implement in hardware.
While wavelet compression eliminates the blocking and ghost or mosquito effects of JPEG compression, there is a need for alternative ways to transform images to the frequency domain, including methods that are simple to implement, and may be implemented in either hardware or software.
Reordering of quantized wavelet coefficients is performed to cluster large and small wavelet coefficients into separate groups without requiring the use of data-dependent data structures. Entropy encoding of quantized wavelet coefficients clusters large and small wavelet coefficients into separate groups without requiring the use of data-dependent data structures such as zerotrees, or a separate list for set partitions in trees. It thus lends itself to easier hardware or software implementations. Further advantages include the elimination of blocking artifacts, and single pass encoding for any desired compression ratio. The coefficients are reordered into blocks such that a matrix of indices contains the coarsest coefficients in the upper left corner, with the low-high and high-low sub bands filled in as larger and larger blocks in an alternating manner, such that the low-high sub bands comprise the top of the matrix and the high-low sub bands comprise the left side of the matrix. This type of clustering produces coefficients that have probability distributions that are approximately Laplacian (long runs of zeros for example).
A decoder applies the above in reverse order. Unshuffling of the coefficients is performed to obtain the original scan order. The unshuffled coefficients are then subjected to an inverse wavelet transform to recover the transformed and compressed data, such as image pixels. To decode at a lower resolution, one simply drops finer sub bands.
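Because the reordering is fixed and does not depend on the data, unshuffling amounts to inverting a known permutation. The sketch below illustrates only that idea. The function names `reorder` and `unshuffle` and the four-element `order` list are hypothetical, and the example ordering is not the actual block ordering of the invention.

```python
def reorder(coeffs, order):
    """Encoder side: output position p receives the coefficient at index order[p]."""
    return [coeffs[i] for i in order]


def unshuffle(reordered, order):
    """Decoder side: invert the fixed permutation to restore the original scan order."""
    original = [0] * len(order)
    for pos, i in enumerate(order):
        original[i] = reordered[pos]
    return original


# Hypothetical fixed ordering shared by encoder and decoder.
order = [0, 2, 1, 3]
coeffs = [9, 0, -4, 1]

shuffled = reorder(coeffs, order)
restored = unshuffle(shuffled, order)
```

Since both sides derive `order` from the image dimensions alone in this sketch, nothing about the permutation needs to be transmitted.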
By not requiring the use of data-dependent data structures such as zero trees, or a separate list for set partitions in trees, hardware implementations are easier to build and software implementations may run faster.