Subband encoding is a method of dividing an image signal into frequency bands and encoding the signal (subband signal) of each frequency band. Unlike block-based orthogonal transforms such as the discrete cosine transform, subband encoding in principle produces no block distortion, and hierarchical encoding can be realized easily by recursively dividing the low-frequency components. For still pictures, subband encoding using the wavelet transform is adopted in JPEG 2000, an international standard encoding method.
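The recursive division of the low-frequency components can be illustrated with a minimal sketch. It assumes a one-dimensional signal and a two-tap Haar filter pair purely for brevity (the names haar_split, hierarchical_decompose, and reconstruct are illustrative, and JPEG 2000 itself uses different, longer wavelet filters); the signal length is assumed to be divisible by 2 raised to the number of levels.

```python
def haar_split(signal):
    """One subband division: pairwise averages (low band) and half-differences (high band)."""
    half = len(signal) // 2
    low = [(signal[2 * k] + signal[2 * k + 1]) / 2 for k in range(half)]
    high = [(signal[2 * k] - signal[2 * k + 1]) / 2 for k in range(half)]
    return low, high


def hierarchical_decompose(signal, levels):
    """Repeatedly divide the low band; returns the coarsest low band and
    the high bands ordered from finest to coarsest."""
    low = list(signal)
    highs = []
    for _ in range(levels):
        low, high = haar_split(low)
        highs.append(high)
    return low, highs


def reconstruct(low, highs):
    """Subband combination, coarsest level first. Passing only the coarser
    high bands yields a reduced-resolution signal."""
    for high in reversed(highs):
        merged = []
        for l, h in zip(low, high):
            merged.extend([l + h, l - h])
        low = merged
    return low


signal = [1, 3, 5, 7, 2, 2, 4, 8]
coarse, details = hierarchical_decompose(signal, levels=2)
print(coarse)                        # quarter-resolution low band
print(reconstruct(coarse, details))  # full-resolution signal
```

Each pass halves the resolution of the low band, so stopping the combination early gives a coarser version of the signal without any extra encoding work.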
When subband encoding is applied to moving picture encoding, not only a correlation in a spatial direction but also a correlation in a temporal direction of a signal must be taken into consideration. Subband moving picture encoding is roughly classified into two methods: a method in which subband encoding is performed for each frame after a correlation in the temporal direction is removed by performing motion compensation on the original image in a spatial region, and a method in which this correlation in the temporal direction is removed by performing motion compensation for each subband region after the original image is divided into subbands.
FIG. 25 is a flowchart showing the flow of a conventional coding process (non-patent reference 1: J.-R. Ohm, “Three-dimensional subband coding with motion compensation”, IEEE Trans. Image Processing, vol. 3, pp. 559-571, September 1994) which performs motion compensation in a spatial region. A process of encoding a set A(0)[i] (0≦i<n, n is a power of 2) of consecutive frames will be explained below with reference to FIG. 25. First, two consecutive frames A(0)[i] and A(0)[i+1] are subband divided in the temporal direction by setting j=0 and i=0, 2, . . . , n−2 (steps 201 and 202), thereby obtaining A(1)[i] in a low-frequency band and E[i+1] in a high-frequency band (steps 203, 204, and 205). Then, consecutive low-frequency-band signals A(1)[i<<1] and A(1)[(i+1)<<1] are subband divided in the temporal direction by setting j=1 (step 206), thereby obtaining A(2)[i<<1] in a low-frequency band and E[(i+1)<<1] in a high-frequency band (steps 203, 204, and 205). This processing is repeated until all frames except the first frame are encoded as high-frequency-band signals, i.e., until (1<<j) becomes n (step 207). After that, A(j)[0] and E[i] (0<i<n) are subband divided in the spatial direction and encoded (step 208). In the temporal-direction subband division between two frames, the high-frequency-band signal is equivalent to an error signal of motion compensation prediction, and the low-frequency-band signal is equivalent to an average signal of the two motion-compensated frames.
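The temporal-direction loop of FIG. 25 can be sketched as follows. This is only an illustration under strong assumptions: motion compensation is replaced by zero motion (each pixel is predicted from the co-located pixel), frames are flat lists of pixel values, and the spatial subband division and encoding of step 208 are omitted. The function names are hypothetical and do not come from the reference.

```python
def temporal_decompose(frames):
    """Temporal subband division of n frames (n a power of 2).

    Returns (low, high): low is the final low-band frame A(j)[0], and
    high maps each frame index i (0 < i < n) to its high-band signal E[i].
    Assumption: zero motion, so no motion compensation is performed.
    """
    n = len(frames)
    assert n > 0 and n & (n - 1) == 0, "n must be a power of 2"
    a = {i: list(f) for i, f in enumerate(frames)}  # A(0)[i]
    high = {}
    step = 1  # equals 1 << j in the notation of the text
    while step < n:
        for i in range(0, n, 2 * step):
            first, second = a[i], a.pop(i + step)
            # High band: prediction error of the second frame from the first.
            high[i + step] = [s - f for f, s in zip(first, second)]
            # Low band: average of the two frames.
            a[i] = [(f + s) / 2 for f, s in zip(first, second)]
        step *= 2
    return a[0], high


def temporal_reconstruct(low, high, n):
    """Temporal subband combination: traces temporal_decompose in reverse."""
    a = {0: list(low)}
    step = n // 2
    while step >= 1:
        for i in range(0, n, 2 * step):
            avg, diff = a[i], high[i + step]
            a[i] = [m - d / 2 for m, d in zip(avg, diff)]
            a[i + step] = [m + d / 2 for m, d in zip(avg, diff)]
        step //= 2
    return [a[i] for i in range(n)]


frames = [[0], [2], [4], [10]]  # four one-pixel frames
low, high = temporal_decompose(frames)
print(low, sorted(high))  # one low-band frame; E[1], E[2], E[3]
print(temporal_reconstruct(low, high, len(frames)))
```

After the loop, only frame 0 remains as a low-band signal and every other frame index holds a high-band signal, which corresponds to the stopping condition (1<<j)=n.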
In a decoding process, the flow of the above process is traced in the opposite direction, i.e., subband signals are combined in the spatial direction for each frame, and subband combination is performed in the temporal direction in accordance with the frame reference relationship. In the subband signal combination performed frame by frame, a reduced image signal is obtained by stopping the combination without using any high-frequency-component subband. In three-dimensional wavelet coding, a decoded image at a reduced resolution can be obtained by performing temporal-direction subband combination on the signals of each frame obtained by partial subband combination. However, when motion compensation in temporal-direction subband division is performed with fractional-pixel accuracy, an interpolation process is used in predictive image generation, and this interpolation process is not commutative with subband division. That is, a signal which is subband divided in the spatial direction after being subband divided in the temporal direction is not equal to a signal which is subband divided in the temporal direction after being subband divided in the spatial direction, so the decoded image at the reduced resolution deteriorates much more than a signal obtained by reducing the original signal.
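The non-commutativity can be seen in a toy one-dimensional example. The sketch below assumes a Haar low-pass filter for the spatial subband division and linear interpolation for the fractional-pixel shift; both choices, and the function names, are illustrative assumptions rather than the filters of any particular codec. A half-pixel shift at full resolution corresponds to a quarter-pixel shift at half resolution.

```python
def haar_low(signal):
    """Spatial low band: average of each non-overlapping pair (half resolution)."""
    return [(signal[2 * k] + signal[2 * k + 1]) / 2 for k in range(len(signal) // 2)]


def shift_linear(signal, d):
    """Shift right by d samples (0 <= d <= 1) using linear interpolation.
    The left edge is replicated."""
    out = []
    for n in range(len(signal)):
        prev = signal[max(n - 1, 0)]
        out.append((1 - d) * signal[n] + d * prev)
    return out


x = [0, 0, 0, 8, 0, 0, 0, 0]

# Encoder order: interpolate at full resolution, then take the spatial low band.
shift_then_divide = haar_low(shift_linear(x, 0.5))

# Reduced-resolution decoder order: take the low band first, then apply the
# equivalent quarter-pixel shift at half resolution.
divide_then_shift = shift_linear(haar_low(x), 0.25)

print(shift_then_divide)
print(divide_then_shift)
```

The two results differ even though no quantization is involved, which is the kind of mismatch that degrades the reduced-resolution decoded image.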
FIG. 26 is a flowchart showing the flow of a conventional coding process (non-patent reference 2: H. Gharavi, “Subband Coding Algorithm for Video Applications: Videophone to HDTV Conferencing”, IEEE Trans. Circuits and Systems for Video Technology, Vol. 1, No. 2, pp. 174-182, June 1991) which performs motion compensation in a subband region. A process of encoding a set A[k] (0≦k<n) of consecutive frames will be explained below with reference to FIG. 26. First, each frame is subband divided (step 301). After that, motion compensation prediction is performed for each subband of a frame A[i] (1≦i<n) and its reference frame A[i−1] (steps 302, 303, 304, and 305). Quantization and lossless encoding are then performed on the obtained prediction error signal of the frame A[i] (1≦i<n) and on the frame A[0] (step 306). A decoding process is performed by tracing the above process in the opposite direction, i.e., subband coefficients of the prediction error signal of the frame A[i] (1≦i<n) and of the frame A[0] are obtained by inverting the lossless encoding and the quantization, and the subband coefficients of the frame A[i] (1≦i<n) are obtained by performing motion compensation for each subband. After that, a decoded image is obtained by subband combining the individual frames. A reduced decoded image signal is obtained by using no high-frequency-component subbands in this subband combination. Unlike the first conventional coding process, which performs motion compensation in a spatial region, no large deterioration other than quantization and transform errors is found between the decoded image at the reduced resolution and the reduced signal of the original signal. However, in a high-frequency band, which mainly contains edge components, the prediction efficiency of motion compensation is considerably lower than that of motion compensation in a spatial region.
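The structure of FIG. 26 can be sketched in the same toy setting. The sketch assumes one-dimensional frames, a Haar filter pair, and zero-motion prediction within each subband, and it omits the quantization and lossless encoding of step 306; all function names are hypothetical.

```python
def haar_split(frame):
    """Subband division of one frame into (low, high) bands."""
    half = len(frame) // 2
    low = [(frame[2 * k] + frame[2 * k + 1]) / 2 for k in range(half)]
    high = [(frame[2 * k] - frame[2 * k + 1]) / 2 for k in range(half)]
    return low, high


def haar_merge(low, high):
    """Subband combination, the inverse of haar_split."""
    frame = []
    for l, h in zip(low, high):
        frame.extend([l + h, l - h])
    return frame


def encode(frames):
    """Divide every frame into subbands, then predict each subband of A[i]
    from the same subband of A[i-1] (assumption: zero motion)."""
    bands = [haar_split(f) for f in frames]
    coded = [bands[0]]  # A[0] is kept as is
    for i in range(1, len(bands)):
        residual = tuple(
            [c - p for c, p in zip(cur, prev)]
            for cur, prev in zip(bands[i], bands[i - 1])
        )
        coded.append(residual)
    return coded


def decode(coded, use_high=True):
    """Rebuild the subbands frame by frame. With use_high=False only the
    low band is used, giving a half-resolution decoded frame."""
    bands = tuple(list(b) for b in coded[0])
    decoded = []
    for i, item in enumerate(coded):
        if i > 0:
            bands = tuple(
                [p + r for p, r in zip(prev, res)]
                for prev, res in zip(bands, item)
            )
        low, high = bands
        decoded.append(haar_merge(low, high) if use_high else list(low))
    return decoded


frames = [[1, 3, 5, 7], [2, 4, 6, 10]]
coded = encode(frames)
print(decode(coded))                   # full resolution
print(decode(coded, use_high=False))   # reduced resolution
```

Because prediction is closed within each subband, the reduced-resolution output here equals the low band of each original frame exactly; in a real coder only quantization and transform errors would separate them.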
In short, the second conventional coding method, which performs motion compensation in a subband region, has the problem that its coding efficiency is lower than that of the first conventional coding method.
Non-patent Reference 1: J.-R. Ohm, “Three-dimensional subband coding with motion compensation”, IEEE Trans. Image Processing, vol. 3, pp. 559-571, September 1994
Non-patent Reference 2: H. Gharavi, “Subband Coding Algorithm for Video Applications: Videophone to HDTV Conferencing”, IEEE Trans. Circuits and Systems for Video Technology, Vol. 1, No. 2, pp. 174-182, June 1991
Non-patent Reference 3: A. Secker et al., “Motion-compensated highly scalable video compression using an adaptive 3D wavelet transform based on lifting”, Proc. IEEE Int. Conf. Image Proc., pp. 1029-1032, October 2001
Non-patent Reference 4: Luo et al., “Motion Compensated Lifting Wavelet And Its Application in Video Coding”, IEEE Int. Conf. Multimedia & Expo 2001, August 2001
Non-patent Reference 5: J. M. Shapiro, “Embedded image coding using zerotrees of wavelet coefficients”, IEEE Trans. Signal Processing, vol. 41, pp. 3445-3462, December 1993