A wavelet transform decomposes a still image into a space-frequency domain structure that approximates the variations in the image and closely matches the human visual system that interprets the image. Consequently, wavelet compression techniques have achieved considerable success in the domain of still image compression. For example, the U.S. FBI fingerprint compression standard and the ISO still image compression standard JPEG2000 use wavelet compression techniques.
One type of wavelet transform is an H-transform, which is an extension of the Haar transform. Equation 1 shows a two-dimensional H-transform matrix H, which in Equation 2 transforms a space domain 2×2 matrix A having coefficients a00, a01, a10, and a11 into a wavelet domain matrix having coefficients h0, hx, hy, and hd. For image encoding, coefficients a00, a01, a10, and a11 are pixel values (e.g., color values, gray scale levels, or RGB or YUV color component values) having positions in matrix A according to the positions of corresponding pixels in the image. In the transformed matrix, the wavelet domain value h0 indicates the (low frequency) average spatial energy of matrix A, and wavelet domain values hx, hy, and hd respectively indicate (high frequency) horizontal, vertical, and diagonal variations in matrix A.

Equation 1:
\[ H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \]

Equation 2:
\[ H \cdot \begin{pmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{pmatrix} \cdot H^{-1} = \begin{pmatrix} h_{0} & h_{x} \\ h_{y} & h_{d} \end{pmatrix} \]
Although the H-transform is fundamentally a two-dimensional matrix transform, Equation 3 expresses the H-transformation of pixel values to the wavelet domain using a single matrix multiplication found by rearranging the pixel values a00, a01, a10, and a11 and the wavelet domain values h0, hx, hy, and hd into four-component vectors.

Equation 3:
\[ h = \begin{pmatrix} h_{0} \\ h_{x} \\ h_{y} \\ h_{d} \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 & 1 & 1 & 1 \\ -1 & 1 & -1 & 1 \\ -1 & -1 & 1 & 1 \\ -1 & 1 & 1 & -1 \end{pmatrix} \cdot \begin{pmatrix} a_{00} \\ a_{01} \\ a_{10} \\ a_{11} \end{pmatrix} \]
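The single-multiplication form of Equation 3 can be illustrated with a minimal Python sketch. The function names and the sample pixel values below are hypothetical and are not part of the described method; the signs follow the matrix of Equation 3. Because the rows of that matrix are mutually orthogonal and the matrix is scaled by 1/2, the inverse used here is simply the transpose.

```python
# Rows of the 4x4 matrix in Equation 3, before the 1/2 scale factor.
EQ3_ROWS = [
    [1, 1, 1, 1],     # h0: low frequency component
    [-1, 1, -1, 1],   # hx: horizontal variation
    [-1, -1, 1, 1],   # hy: vertical variation
    [-1, 1, 1, -1],   # hd: diagonal variation
]


def h_transform_2x2(a00, a01, a10, a11):
    """Return (h0, hx, hy, hd) for one 2x2 block of pixel values."""
    a = (a00, a01, a10, a11)
    return tuple(0.5 * sum(c * v for c, v in zip(row, a)) for row in EQ3_ROWS)


def inverse_h_transform_2x2(h0, hx, hy, hd):
    """Recover (a00, a01, a10, a11) by applying the transposed matrix."""
    h = (h0, hx, hy, hd)
    return tuple(
        0.5 * sum(EQ3_ROWS[r][c] * h[r] for r in range(4)) for c in range(4)
    )


# Hypothetical pixel values for one 2x2 block.
block = (10, 12, 14, 20)
coeffs = h_transform_2x2(*block)
restored = inverse_h_transform_2x2(*coeffs)
print(coeffs)    # (28.0, 4.0, 6.0, -2.0)
print(restored)  # (10.0, 12.0, 14.0, 20.0)
```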
One still image encoding method using the H-transform breaks an N×N matrix of pixel values representing an image into an N/2×N/2 grid of non-overlapping 2×2 matrices and then applies the two-dimensional H-transform to each 2×2 matrix of pixel values. The wavelet transform components for the entire N×N matrix can then be rearranged to construct an array 100 such as illustrated in FIG. 1. As shown, array 100 has low frequency components h0 arranged in an N/2×N/2 sub-array H0 according to the spatial positions of the corresponding 2×2 space domain matrices in the original N×N array. N/2×N/2 sub-arrays Hx, Hy, and Hd similarly contain the respective high frequency components hx, hy, and hd, arranged according to the spatial positions of the corresponding 2×2 matrices in the original N×N array.
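The block-wise decomposition described above might be sketched as follows. This is an illustrative sketch only: the function name and the sample image are hypothetical, the coefficient signs follow Equation 3, and the four sub-arrays are returned separately because the placement of the sub-arrays within array 100 depends on FIG. 1, which is not reproduced here.

```python
def h_transform_level(image):
    """Split an N x N image (N even) into sub-arrays H0, Hx, Hy, and Hd.

    Entry [i][j] of each sub-array comes from the 2x2 block whose top-left
    pixel is image[2*i][2*j], so spatial ordering is preserved.
    """
    n = len(image)
    if n % 2 or any(len(row) != n for row in image):
        raise ValueError("expected a square image with even side length")
    half = n // 2
    H0 = [[0.0] * half for _ in range(half)]
    Hx = [[0.0] * half for _ in range(half)]
    Hy = [[0.0] * half for _ in range(half)]
    Hd = [[0.0] * half for _ in range(half)]
    for i in range(half):
        for j in range(half):
            a00, a01 = image[2 * i][2 * j], image[2 * i][2 * j + 1]
            a10, a11 = image[2 * i + 1][2 * j], image[2 * i + 1][2 * j + 1]
            H0[i][j] = 0.5 * (a00 + a01 + a10 + a11)
            Hx[i][j] = 0.5 * (-a00 + a01 - a10 + a11)
            Hy[i][j] = 0.5 * (-a00 - a01 + a10 + a11)
            Hd[i][j] = 0.5 * (-a00 + a01 + a10 - a11)
    return H0, Hx, Hy, Hd


# Hypothetical 4x4 image: one flat block, one block with a horizontal
# step, one block with a vertical step, and another flat block.
image = [
    [1, 1, 2, 4],
    [1, 1, 2, 4],
    [3, 3, 5, 5],
    [7, 7, 5, 5],
]
H0, Hx, Hy, Hd = h_transform_level(image)
print(H0)  # [[2.0, 6.0], [10.0, 10.0]]
print(Hx)  # [[0.0, 2.0], [0.0, 0.0]]
print(Hy)  # [[0.0, 0.0], [4.0, 0.0]]
```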
The H-transform can be applied to low frequency sub-array H0 in the same manner as the H-transform of the original image, and the resulting level-two transform components h0′, hx′, hy′, and hd′ can be arranged in N/4×N/4 sub-arrays H0′, Hx′, Hy′, and Hd′. The process can be repeated one or more times to construct a multi-resolution pyramid data structure such as illustrated in FIG. 2. In encoding, transmitting, or storing a still image, a lossy compression can drop portions of one or more of the lower levels of the pyramid, and the amount of data/compression can be easily changed or adapted according to the available bandwidth or the desired resolution of the image.
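The repeated application of the H-transform to the low frequency sub-array can be sketched as a short loop. The names `h_level` and `build_pyramid` and the return layout are hypothetical choices made for illustration, and the coefficient signs again follow Equation 3; the sketch does not represent the arrangement of FIG. 2.

```python
def h_level(img):
    """One H-transform level on a square array with even side length.

    Returns the four sub-arrays (H0, Hx, Hy, Hd), each half the side length.
    """
    half = len(img) // 2
    h0, hx, hy, hd = ([[0.0] * half for _ in range(half)] for _ in range(4))
    for i in range(half):
        for j in range(half):
            a00, a01 = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            a10, a11 = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            h0[i][j] = 0.5 * (a00 + a01 + a10 + a11)
            hx[i][j] = 0.5 * (-a00 + a01 - a10 + a11)
            hy[i][j] = 0.5 * (-a00 - a01 + a10 + a11)
            hd[i][j] = 0.5 * (-a00 + a01 + a10 - a11)
    return h0, hx, hy, hd


def build_pyramid(image, levels):
    """Apply h_level repeatedly to the low frequency sub-array.

    Returns (low, details): `low` is the final low frequency sub-array and
    `details` lists the (Hx, Hy, Hd) sub-arrays per level, level one first.
    """
    low, details = image, []
    for _ in range(levels):
        if len(low) < 2 or len(low) % 2:
            raise ValueError("array side must be even at every level")
        low, hx, hy, hd = h_level(low)
        details.append((hx, hy, hd))
    return low, details


# Hypothetical flat 4x4 image: all high frequency components vanish.
flat = [[1] * 4 for _ in range(4)]
low, details = build_pyramid(flat, levels=2)
print(low)                  # [[4.0]]
print(len(details[0][0]))   # 2  (level one sub-arrays are 2x2)
print(len(details[1][0]))   # 1  (level two sub-arrays are 1x1)
```

In such a layout, a lossy encoder could omit or coarsely quantize some entries of `details` while retaining `low`, which corresponds to dropping portions of pyramid levels as described above.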
Video (or moving image) encoding commonly uses motion estimation/compensation procedures that remove redundant information in frames having different time indices. In a system using wavelet transformations of frames of a video image, performing motion estimation directly on the image data in the wavelet domain would be desirable. However, relatively small movement of an object in video can cause significant changes in wavelet domain data, particularly the high frequency components.
FIG. 3 illustrates an example of an image including an object having a color “d” on a background color “e”. An H-transform of a 2×2 matrix 310 corresponding to a portion of the object provides a low frequency component h0 having value 2d and high frequency component hx that is zero. If the object moves left one pixel in the next frame, an edge of the object is in a 2×2 matrix 320 that has the same relative position in the second frame as matrix 310 has in the first frame. An H-transform of matrix 320 provides a low frequency component h0 having value d+e and high frequency component hx that is e−d. Accordingly, when color e differs significantly from color d, a small movement of an object causes a large change in the wavelet domain data.
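The values quoted for matrices 310 and 320 can be checked numerically with Equation 3. In the sketch below, the numeric values chosen for colors d and e are hypothetical, and matrix 320 is assumed to contain color d in its left column and color e in its right column, which is the layout consistent with the stated results h0 = d+e and hx = e−d.

```python
def h_transform_2x2(a00, a01, a10, a11):
    """Return (h0, hx, hy, hd) for one 2x2 block, with signs per Equation 3."""
    h0 = 0.5 * (a00 + a01 + a10 + a11)
    hx = 0.5 * (-a00 + a01 - a10 + a11)
    hy = 0.5 * (-a00 - a01 + a10 + a11)
    hd = 0.5 * (-a00 + a01 + a10 - a11)
    return h0, hx, hy, hd


d, e = 200, 50  # hypothetical object and background values

matrix_310 = (d, d, d, d)  # block entirely inside the object
matrix_320 = (d, e, d, e)  # assumed layout: object left, background right

before = h_transform_2x2(*matrix_310)
after = h_transform_2x2(*matrix_320)
print(before)  # (400.0, 0.0, 0.0, 0.0)     h0 = 2d, hx = 0
print(after)   # (250.0, -150.0, 0.0, 0.0)  h0 = d+e, hx = e-d
```

With these values, a one-pixel shift changes h0 by 150 and hx by 150 at the same block position, which illustrates why block matching on wavelet domain data is unreliable near object edges.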
The rapid changes in wavelet domain data complicate motion estimation techniques in the wavelet domain. Motion estimation can be performed in the space domain, but transforming from the space domain to the wavelet domain for compression and back to the space domain for motion estimation requires a series of inverse wavelet transforms, one for each level of the wavelet domain pyramid structure. The inverse wavelet transforms increase encoding complexity, making real-time encoding difficult for systems having low processing power. A video encoding system is thus desired that has low processing power requirements and uses efficient wavelet domain compression.