1. Field of the Invention
This invention relates to the field of image compression. More particularly, this invention relates to methods and systems for efficiently compressing still images and video frames using wavelet transformation and vector quantization.
2. State of the Art
Image compression may be classified into two categories of encoding: lossless and lossy. Lossless encoding techniques guarantee that the image decompressed from the compressed (encoded) data is identical to the original image. Lossy encoding is generally capable of achieving higher compression ratios than lossless encoding, but at the expense of some loss of image fidelity. Exemplary conventional lossless image encoding techniques include run-length encoding, Huffman encoding, and Lempel-Ziv encoding. Exemplary conventional lossy image encoding techniques include transform encoding, vector quantization (VQ), segmentation and approximation methods, spline approximation methods, and fractal encoding.
Image compression is particularly useful for transmitting and displaying graphical images over the Internet, where it takes more time to transmit the original image than a compressed image. Image compression is also useful for compressing digital video frames. The compression of digital video image frames is particularly useful for such applications as video conferencing and video streaming.
The basic idea behind transform coding is to decorrelate the original signal so that the signal energy is redistributed among only a small set of transform coefficients. In this way, many coefficients may be discarded after quantization and before encoding. Generally, transform coding involves four steps: (1) image subdivision, which divides an image into smaller blocks, (2) transformation, such as a Discrete Cosine Transform (DCT) or a wavelet transform, (3) quantization, such as zonal coding or thresholding, and (4) encoding, such as Huffman encoding.
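The four steps above can be illustrated with a minimal one-dimensional sketch. Everything in it is an assumption chosen for brevity rather than a method taken from the cited art: a one-level Haar-style average/difference stands in for the DCT or wavelet transform, a threshold followed by uniform scalar quantization stands in for zonal coding, and run-length encoding stands in for Huffman encoding. The function names, block size, step, and threshold are hypothetical.

```python
def split_blocks(samples, size):
    """Step 1: subdivide the signal into fixed-size blocks."""
    return [samples[i:i + size] for i in range(0, len(samples), size)]


def haar_block(block):
    """Step 2: one-level Haar-style transform (assumed stand-in for DCT/wavelet).

    Returns pairwise averages followed by pairwise half-differences.
    The block length must be even.
    """
    avgs = [(block[i] + block[i + 1]) / 2 for i in range(0, len(block), 2)]
    diffs = [(block[i] - block[i + 1]) / 2 for i in range(0, len(block), 2)]
    return avgs + diffs


def quantize(coeffs, step, threshold):
    """Step 3: zero out small coefficients, then quantize the rest uniformly."""
    return [0 if abs(c) < threshold else round(c / step) for c in coeffs]


def run_length_encode(values):
    """Step 4: simple run-length encoding (assumed stand-in for Huffman)."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(r) for r in runs]


# Hypothetical sample data: two smooth regions of a scan line.
samples = [100, 104, 98, 98, 40, 44, 41, 43]
quantized = []
for block in split_blocks(samples, 4):
    quantized.extend(quantize(haar_block(block), step=3, threshold=3))
encoded = run_length_encode(quantized)
```

In this toy input the difference coefficients all fall below the threshold and are discarded, which is the energy-compaction effect that transform coding relies on.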
With regard to quantization, vector quantization is an alternative to scalar quantization and may lead to better performance according to P.I.R.A. International, “Open information interchange study on image/graphics standards,” Tech. Rep. Appendix, PIRA International, June 1993. A vector quantizer is a system which maps a K-dimensional Euclidean space R^K to a finite subset X of R^K made up of N vectors. This subset X becomes the vector codebook. An image can then be represented by indices into the codebook and thus be compressed.
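The mapping described above can be sketched as a nearest-neighbor search over a codebook. This is a minimal illustration only: the codebook contents, the squared Euclidean distance measure, and the names `vq_encode` and `vq_decode` are assumptions, and no codebook training procedure is shown.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two K-dimensional vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vq_encode(vectors, codebook):
    """Map each input vector to the index of its nearest codebook vector."""
    return [
        min(range(len(codebook)), key=lambda i: squared_distance(v, codebook[i]))
        for v in vectors
    ]


def vq_decode(indices, codebook):
    """Reconstruct an approximation by looking up each index in the codebook."""
    return [list(codebook[i]) for i in indices]


# Hypothetical codebook with N = 4 vectors in K = 2 dimensions.
codebook = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0), (10.0, 0.0)]
# Hypothetical input vectors, e.g. pairs of neighboring coefficients.
blocks = [(1.0, 2.0), (9.0, 8.5), (0.5, 9.0)]

indices = vq_encode(blocks, codebook)          # what would be stored or sent
reconstructed = vq_decode(indices, codebook)   # lossy approximation
```

Only the indices need to be stored or transmitted, provided the decoder holds the same codebook; the reconstruction is lossy because each vector is replaced by its nearest codebook entry.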
Wavelet transforms have been applied to image compression. See e.g., M. Antonini et al., “Image coding using wavelet transform,” IEEE Transactions on Image Processing, vol. 1, no. 2, pp. 205-220; J. M. Shapiro, “Embedded image coding using zerotrees of wavelet coefficients,” IEEE Transactions on Signal Processing, vol. 41, pp. 3445-3462, December 1993; D. Sampson et al., “Wavelet transform image coding using lattice vector quantization,” Electronics Letters, vol. 30, pp. 1477-78, September 1994; and P. C. Cosman et al., “Tree-structured vector quantization with significance map for wavelet image coding,” Proc. 1995 IEEE Data Compression Conf. (DCC), March 1995.
Wavelets are mathematical functions that provide a joint time-frequency representation of a signal. Wavelets decompose data into different frequency components (known as subbands), and each component can be treated with a resolution matched to its scale. Wavelet transforms with the feature of joint locality can generate “sparse” coefficients, which are particularly useful for image compression. Additionally, the pyramid hierarchy of wavelet decomposition enables many compression algorithms based on inter-band and cross-band relationships, such as zerotrees. See e.g., Shapiro supra. Although the wavelet transform reduces the correlation between image samples, high-order statistical dependencies still exist within or across subband coefficients. A vector quantizer may exploit these high-order statistical dependencies by jointly quantizing several coefficients. See e.g., A. N. Akansu et al., Subband and Wavelet Transforms: Design and Applications, Kluwer Academic Publishers, Norwell, Mass., 1996.
A wavelet is a mathematical function that satisfies certain mathematical requirements and is used in representing data (signal) or other functions. A wavelet provides an efficient and informative description of a signal, and is superior to the traditional Fourier transform in many fields, especially for fast transient and non-stationary signals. The wavelet has many useful properties, such as joint time (spatial)-frequency localization and multi-resolution representation.
Basis functions for the Fourier transform are sinusoids. In contrast, with the wavelet transform, various basis functions may be designed based on the features of a specific application. Most wavelets do not have analytical solutions. The wavelet transform may be implemented by iterating the quadrature mirror filters in a tree algorithm, as known to one of ordinary skill in the art. Moreover, the wavelet tree algorithm as disclosed in Y. Sheng, The Transforms and Applications Handbook, Chapter “Wavelet Transform,” CRC Press, in cooperation with IEEE, 1996, permits a fast wavelet transform and requires fewer operations than the conventional fast Fourier transform (FFT).
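The tree algorithm mentioned above can be sketched by repeatedly applying a two-channel filter bank to the low-pass output of the previous level. The sketch below assumes the two-tap orthonormal Haar filter pair purely for simplicity; it is not the filter bank of any cited reference, and the names `analysis_step` and `wavelet_tree` are hypothetical.

```python
import math


def analysis_step(signal):
    """One filter-bank level: low-pass and high-pass filtering, each downsampled by 2.

    Uses the orthonormal Haar pair (an assumed, simplest-case choice).
    """
    s = math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return low, high


def wavelet_tree(signal, levels):
    """Iterate the filter bank on the low-pass branch only (pyramid decomposition)."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    low = list(signal)
    details = []  # high-pass subbands, finest scale first
    for _ in range(levels):
        low, high = analysis_step(low)
        details.append(high)
    return low, details


# Hypothetical 8-sample signal decomposed over 3 levels.
signal = [4, 6, 10, 12, 8, 6, 5, 5]
approx, details = wavelet_tree(signal, 3)

# An orthonormal filter pair preserves signal energy across the subbands.
energy = sum(c * c for c in approx) + sum(c * c for band in details for c in band)
```

Because each level processes only half as many samples as the previous one, the total work in this sketch grows linearly with the signal length.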
In most cases, images are stored and transmitted in integer or binary format. For hardware implementation of image processing, pure integer operation is often preferred. However, most filters, such as wavelets and wavelet packets, have floating point coefficients. Thus, it is desirable to use an efficient integer implementation of the wavelet transform for image coding.
The lifting scheme as disclosed in I. Daubechies et al., “Factoring wavelet transforms into lifting steps,” Tech. Rep. Lucent, Bell Laboratories, 1996, the disclosure of which is incorporated herein by reference in its entirety for all purposes, supports perfect reconstruction and fast computation. In the lifting scheme, the wavelet transform is performed in the spatial domain. The basic idea behind the lifting scheme is a predict-update procedure, where the prediction error is related to the high-pass band and the updated prediction is related to the low-pass band.
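The predict-update procedure can be illustrated with an integer-to-integer Haar transform (sometimes called the S transform), which is assumed here as the simplest lifting example and is not necessarily the filter used in the cited report. The function names are hypothetical. Odd samples are predicted from even samples, the prediction error becomes the high-pass band, and the even samples are then updated with that error to form the low-pass band.

```python
def lifting_forward(signal):
    """Integer Haar transform via lifting: split, predict, update."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    even = signal[0::2]
    odd = signal[1::2]
    # Predict: each odd sample is predicted by its even neighbor;
    # the prediction error is the high-pass (detail) band.
    detail = [o - e for o, e in zip(odd, even)]
    # Update: add floor(detail / 2) to the even samples so the low-pass band
    # holds the floored pairwise mean. '>> 1' floors for negative values too.
    approx = [e + (d >> 1) for e, d in zip(even, detail)]
    return approx, detail


def lifting_inverse(approx, detail):
    """Undo the lifting steps in reverse order with opposite signs."""
    even = [a - (d >> 1) for a, d in zip(approx, detail)]
    odd = [d + e for d, e in zip(detail, even)]
    signal = []
    for e, o in zip(even, odd):
        signal.extend((e, o))
    return signal


# Hypothetical integer samples, e.g. one row of pixel values.
samples = [10, 12, 14, 13, 200, 202, 7, 0]
approx, detail = lifting_forward(samples)
restored = lifting_inverse(approx, detail)
```

Every step uses only integer addition, subtraction, and shifts, and the inverse recovers the input exactly, which shows how lifting can provide perfect reconstruction without floating point filter coefficients.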
With the proliferation of digital imagery in many applications including the Internet, there is a need in the art for methods and systems that perform image compression and decompression (coding and decoding) using a combination of lossless and lossy image encoding and decoding to obtain a high peak signal-to-noise ratio (PSNR) and high rates of speed.