The present invention relates generally to video compression, and more particularly, to wavelet based coding utilizing multiple reference frames for motion compensated temporal filtering.
A number of current video coding algorithms are based on motion compensated predictive coding and are considered hybrid schemes. In such hybrid schemes, temporal redundancy is reduced using motion compensation, while spatial redundancy is reduced by transform coding the residue of motion compensation. Commonly used transforms include the discrete cosine transform (DCT) and sub-band/wavelet decompositions. Such schemes, however, lack flexibility in terms of providing truly scalable bit streams.
Another type of scheme, known as 3D sub-band/wavelet (hereafter “3D wavelet”) based coding, has gained popularity, especially in the current scenario of video transmission over heterogeneous networks. These schemes are desirable in such applications since they provide very flexible scalable bit streams and higher error resilience. In 3D wavelet coding, the whole frame is transformed at a time instead of block by block as in DCT based coding.
One component of 3D wavelet schemes is motion compensated temporal filtering (MCTF), which is performed to reduce temporal redundancy. An example of MCTF is described in an article entitled “Motion-Compensated 3-D Subband Coding of Video”, IEEE Transactions On Image Processing, Volume 8, No. 2, February 1999, by Seung-Jong Choi and John Woods, hereafter referred to as “Woods”.
In Woods, frames are filtered temporally in the direction of motion before the spatial decomposition is performed. During the temporal filtering, some pixels are either not referenced or are referenced multiple times due to the nature of the motion in the scene and the covering/uncovering of objects. Such pixels are known as unconnected pixels and require special handling, which leads to reduced coding efficiency. An example of unconnected and connected pixels is shown in FIG. 1, which was taken from Woods.
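By way of illustration only, the distinction between connected and unconnected pixels may be sketched in one dimension as follows. The sketch is a simplified assumption and is not the procedure of Woods: the function names `classify_pixels` and `filter_connected`, the per-pixel integer displacements, the rule that the first pixel to reference a position claims it, and the use of a Haar filter pair are all hypothetical choices made for this example. Unconnected positions are left as `None` to mark where special handling would be required.

```python
import math


def classify_pixels(motion, width):
    """Label pixels of the current and reference frames as connected or not.

    motion[i] is a hypothetical integer displacement, so pixel i of the
    current frame references position i + motion[i] of the reference frame.
    """
    ref_owner = {}  # reference position -> current-frame pixel that claimed it
    current_connected = [False] * width
    for i, d in enumerate(motion):
        j = i + d
        # Assumption: the first pixel to reference a position claims it;
        # later references to the same position are left unconnected.
        if 0 <= j < width and j not in ref_owner:
            ref_owner[j] = i
            current_connected[i] = True
    # Reference pixels that were never referenced are unconnected.
    reference_connected = [j in ref_owner for j in range(width)]
    return current_connected, reference_connected


def filter_connected(current, reference, motion):
    """Apply a Haar filter pair along the motion path to connected pixels only."""
    width = len(current)
    current_connected, _ = classify_pixels(motion, width)
    low = [None] * width   # indexed by reference-frame position
    high = [None] * width  # indexed by current-frame position
    for i, d in enumerate(motion):
        if current_connected[i]:
            j = i + d
            low[j] = (reference[j] + current[i]) / math.sqrt(2)
            high[i] = (current[i] - reference[j]) / math.sqrt(2)
    return low, high


# Pixels 2 and 3 of the current frame both reference position 3 of the
# reference frame, and position 2 of the reference frame is never referenced.
motion = [0, 0, 1, 0, 0, 0]
current_connected, reference_connected = classify_pixels(motion, 6)
low, high = filter_connected([10] * 6, [10] * 6, motion)
```

In this example, pixel 3 of the current frame is unconnected because position 3 of the reference frame has already been claimed, and position 2 of the reference frame is unconnected because no pixel references it.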