The discrete wavelet transform (DWT) has recently received considerable attention in image processing because of its flexibility in representing nonstationary image signals and its ability to adapt to human visual characteristics. Its relationships to the Gabor transform, the windowed Fourier transform and other intermediate spatial-frequency representations have been studied. The wavelet representation provides a multiresolution, multifrequency description of a signal that is localized in both time and frequency. This property is very desirable in image and video coding applications. First, natural image and video signals are nonstationary. A wavelet transform decomposes a nonstationary signal into a set of multiscale wavelet components, each of which is relatively more stationary and hence easier to code. Second, coding schemes and parameters can be adapted to the statistical properties of each wavelet component, so coding the components separately is more efficient than coding the whole nonstationary signal. In addition, the wavelet representation matches the spatially tuned, frequency-modulated properties of early human vision, as indicated by research results in psychophysics and physiology.
Discrete wavelet theory is closely related to the frameworks of multiresolution analysis and subband decomposition. In multiresolution analysis, an image is represented as the limit of successive approximations, each of which is a smoothed version of the image at a given resolution. The smoothed versions of the image at different resolutions form a pyramid structure. An example is the so-called Gaussian pyramid, in which a Gaussian function is used as the smoothing filter at each step. However, there is some redundancy among the levels of the pyramid. A Laplacian pyramid reduces this redundancy by taking the difference between successive layers of the Gaussian pyramid. The Laplacian representation yields considerable compression even though the total number of samples actually increases after the decomposition. In subband coding, the frequency band of an image signal is decomposed into a number of subbands by a bank of bandpass filters. Each subband is then translated to baseband by down-sampling and encoded separately. For reconstruction, the subband signals are decoded and up-sampled back to their original frequency bands by interpolation. The signals are then summed to give a close replica of the original signal. The subband coding approach provides compression performance comparable to that of transform coding and yields superior subjective quality because it is free of the "block effect". The multiresolution and subband approaches were recently integrated into the framework of wavelet theory. Wavelet theory provides a systematic way to construct perfect-reconstruction filter banks with a regularity condition and compact support. In the wavelet representation, the overall number of image samples is conserved after the decomposition because of the orthogonality of the wavelet basis at different scales. Wavelet theory has been applied to image coding in a way similar to the subband coding approach.
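The contrast between the expanding Laplacian pyramid and the sample-conserving wavelet decomposition can be illustrated with a minimal one-dimensional sketch. It rests on several assumptions that are not taken from the text: the Haar filter pair stands in for a general perfect-reconstruction wavelet filter bank, a two-tap averaging filter stands in for the Gaussian smoothing filter, and the function names and the test signal are hypothetical.

```python
import math

def haar_analysis(x):
    """One level of an orthonormal Haar filter bank: filter, then down-sample by 2."""
    s = 1.0 / math.sqrt(2.0)
    approx = [s * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]  # low-pass branch
    detail = [s * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]  # high-pass branch
    return approx, detail

def haar_synthesis(approx, detail):
    """Inverse of haar_analysis: up-sample both branches and combine them."""
    s = 1.0 / math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.append(s * (a + d))
        x.append(s * (a - d))
    return x

def laplacian_level(x):
    """One level of a Laplacian-style pyramid.

    Assumption: a 2-tap average replaces the Gaussian smoothing filter,
    and up-sampling is done by sample repetition.
    """
    coarse = [0.5 * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    predicted = [c for c in coarse for _ in range(2)]
    residual = [xi - pi for xi, pi in zip(x, predicted)]  # full-resolution difference layer
    return coarse, residual

# Hypothetical test signal of even length.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 3.0]

approx, detail = haar_analysis(signal)
reconstructed = haar_synthesis(approx, detail)
coarse, residual = laplacian_level(signal)

wavelet_samples = len(approx) + len(detail)    # 8: same as the input
pyramid_samples = len(coarse) + len(residual)  # 12: the pyramid expands the data
```

In this sketch the wavelet branch keeps exactly as many coefficients as input samples and reconstructs the signal to within rounding error, whereas the pyramid level stores a full-resolution residual plus a half-resolution approximation.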
Natural video signals are nonstationary. In transform coding approaches such as CCITT H.261 and the ISO/MPEG proposal, the residual video signals are divided into many small blocks. A small block size makes hardware implementation feasible and advantageous, and it also reduces the nonstationarity within each block of the residual frame. However, the block transform coding approach suffers from the "blocking effect" in low bit rate applications. The wavelet decomposition provides an alternative way to represent nonstationary video signals and the residual signals after prediction. Compared with transform coding, the wavelet representation is more flexible and can be easily adapted to the characteristics of the human visual system. It is also free from blocking artifacts because its decomposition is global. After wavelet decomposition, each scaled wavelet component tends to have different statistical properties.
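As a rough illustration of how the statistics can differ from one component to another, the following sketch applies a single level of a separable two-dimensional Haar decomposition to a small synthetic frame and computes the variance of each subband. Everything here is an assumption made for illustration: the unnormalized average/difference form of the Haar filters, the function names, the subband labels (first letter for the horizontal filter, second for the vertical one; conventions vary), and the synthetic image, which is a smooth ramp with one vertical edge rather than real video data.

```python
def haar_rows(img):
    """Split every row into a low-pass (average) and a high-pass (difference) half."""
    low = [[(r[j] + r[j + 1]) / 2.0 for j in range(0, len(r), 2)] for r in img]
    high = [[(r[j] - r[j + 1]) / 2.0 for j in range(0, len(r), 2)] for r in img]
    return low, high

def transpose(m):
    return [list(col) for col in zip(*m)]

def haar2d(img):
    """One level of a separable 2-D Haar decomposition (rows first, then columns)."""
    low, high = haar_rows(img)
    # Filtering the transposed array along its rows filters the original along columns.
    ll, lh = (transpose(b) for b in haar_rows(transpose(low)))
    hl, hh = (transpose(b) for b in haar_rows(transpose(high)))
    return {"LL": ll, "LH": lh, "HL": hl, "HH": hh}

def variance(band):
    vals = [v for row in band for v in row]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

# Hypothetical 8x8 frame: a smooth ramp plus a vertical edge between columns 2 and 3.
image = [[10.0 * i + 3.0 * j + (40.0 if j >= 3 else 0.0) for j in range(8)]
         for i in range(8)]

bands = haar2d(image)
band_variance = {name: variance(band) for name, band in bands.items()}
total_samples = sum(len(band) * len(band[0]) for band in bands.values())
```

For this synthetic frame most of the variance stays in the LL band, the vertical edge shows up only in the HL band, and the LH and HH bands are constant. A coder could therefore, in principle, assign a different quantizer or bit allocation to each band, which is the kind of adaptation the wavelet representation allows.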