Motion-compensated temporal filtering (MCTF), a technology for reducing temporal redundancy using a wavelet transform, is addressed in a number of documents proposed for scalable video coding in Part 13 of MPEG-21. Conventional wavelet transforms have mostly been used to decompose a signal into high-frequency and low-frequency components in the vertical and horizontal directions of the spatial domain.
In MCTF, however, a video sequence is decomposed in the temporal domain through motion estimation. MCTF has recently been improved by bi-directional motion estimation and by video decomposition using multiple reference videos.
To decompose a video sequence into high-frequency frames and low-frequency frames more efficiently, a lifting-based wavelet transform may be used in MCTF. The lifting-based wavelet transform includes a prediction step and an update step. In a lifting framework using a 5/3 bi-orthogonal filter, in the prediction step, bi-directional motion compensation is performed on odd-indexed frames based on even-indexed frames. Then, high-frequency frames are generated using differential frames from which as much energy as possible is removed. In the update step, each of the differential frames is added to the even-indexed frames. Thus, the even-indexed frames are updated, and low-frequency frames are generated.
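The prediction and update steps described above can be illustrated with a minimal Python sketch. It assumes zero motion (no motion compensation at all), an even number of frames, mirrored neighbours at the sequence borders, and the usual 5/3 lifting coefficients of 1/2 for prediction and 1/4 for update. The function names are hypothetical and the code is not taken from any reference software.

```python
def lift_53_forward(frames):
    """One temporal 5/3 lifting level with zero motion (illustration only).

    frames: list with an even number of frames, each a list of pixel values.
    Returns (high, low): the high- and low-frequency frames.
    """
    assert len(frames) >= 2 and len(frames) % 2 == 0
    even, odd = frames[0::2], frames[1::2]
    n = len(odd)

    # Prediction step: subtract the average of the two neighbouring even
    # frames from each odd frame (the last even frame is mirrored).
    high = []
    for k in range(n):
        nxt = even[k + 1] if k + 1 < n else even[k]
        high.append([o - (a + b) / 2 for o, a, b in zip(odd[k], even[k], nxt)])

    # Update step: add a quarter of the two neighbouring high-frequency
    # frames to each even frame (the first high frame is mirrored).
    low = []
    for k in range(n):
        prv = high[k - 1] if k > 0 else high[k]
        low.append([e + (p + c) / 4 for e, p, c in zip(even[k], prv, high[k])])
    return high, low


def lift_53_inverse(high, low):
    """Undo lift_53_forward: reverse the update step, then the prediction step."""
    n = len(low)
    even = []
    for k in range(n):
        prv = high[k - 1] if k > 0 else high[k]
        even.append([l - (p + c) / 4 for l, p, c in zip(low[k], prv, high[k])])
    frames = []
    for k in range(n):
        nxt = even[k + 1] if k + 1 < n else even[k]
        odd = [h + (a + b) / 2 for h, a, b in zip(high[k], even[k], nxt)]
        frames += [even[k], odd]
    return frames


frames = [[8, 3], [6, 1], [4, 7], [9, 9]]  # four tiny two-pixel frames
high, low = lift_53_forward(frames)
assert lift_53_inverse(high, low) == frames  # each lifting step is undone exactly
```

For a static sequence the high-frequency frames in this sketch are all zero and the low-frequency frames equal the even-indexed input frames, which is the energy removal the prediction step aims for.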
When $f_k[X]$ is a video signal having a spatial coordinate $X = (x, y)^T$ and a temporal coordinate $k$, the motion-compensated lifting steps may be defined as

$$
\begin{aligned}
h_k[X] &= f_{2k+1}[X] - P(f_{2k+1}[X]) \\
l_k[X] &= f_{2k}[X] + U(f_{2k}[X]) \gg 1,
\end{aligned} \tag{1}
$$

where $h_k[X]$ and $l_k[X]$ denote a high-frequency frame and a low-frequency frame, respectively.
When the 5/3 bi-orthogonal filter is used, the predictive operator $P$ and the update operator $U$ in Equation 1 may be defined as

$$
\begin{aligned}
P(f_{2k+1}[X]) &= w_0 f_{2k}[X + m_{P0}] + w_1 f_{2k+2}[X + m_{P1}] \\
U(f_{2k}[X]) &= w_0 h_{k-1}[X + m_{U0}] + w_1 h_k[X + m_{U1}],
\end{aligned} \tag{2}
$$

where $m_{PX}$ denotes a predictive motion vector for a list X (here the list index X, not the spatial coordinate, has a value of 0 or 1, wherein 0 indicates a previous reference frame and 1 indicates a next reference frame) and $m_{UX}$ denotes an update motion vector for the list X. In addition, $w_0$ and $w_1$ are weights used in bi-directional motion estimation/compensation.
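Equations 1 and 2 can be traced on a toy example. The following Python sketch makes several assumptions that the text does not state: each frame is a single scanline, one integer motion vector applies to the whole frame, coordinates outside the frame are clamped to the border, the weights are w0 = w1 = 1/2, and the ">> 1" in Equation 1 is read as halving the update term. All function names are hypothetical.

```python
def sample(frame, x):
    """Read frame[x], clamping x to the frame borders (assumed boundary rule)."""
    return frame[min(max(x, 0), len(frame) - 1)]


def predict(f_prev, f_next, m_p0, m_p1, w0=0.5, w1=0.5):
    """First line of Equation 2: weighted bi-directional prediction."""
    return [w0 * sample(f_prev, x + m_p0) + w1 * sample(f_next, x + m_p1)
            for x in range(len(f_prev))]


def update(h_prev, h_cur, m_u0, m_u1, w0=0.5, w1=0.5):
    """Second line of Equation 2: weighted sum of two high-frequency frames."""
    return [w0 * sample(h_prev, x + m_u0) + w1 * sample(h_cur, x + m_u1)
            for x in range(len(h_cur))]


def high_frame(f_odd, f_prev, f_next, m_p0, m_p1):
    """First line of Equation 1: odd frame minus its prediction."""
    p = predict(f_prev, f_next, m_p0, m_p1)
    return [a - b for a, b in zip(f_odd, p)]


def low_frame(f_even, h_prev, h_cur, m_u0, m_u1):
    """Second line of Equation 1, with '>> 1' taken as division by 2."""
    u = update(h_prev, h_cur, m_u0, m_u1)
    return [a + b / 2 for a, b in zip(f_even, u)]


# A bright sample moving one position to the right per frame.
f0 = [0, 9, 0, 0, 0, 0]
f1 = [0, 0, 9, 0, 0, 0]
f2 = [0, 0, 0, 9, 0, 0]
no_h = [0] * 6  # assumed: no earlier high-frequency frame is available

# Accurate motion vectors: the prediction matches f1 and the residual vanishes.
h_good = high_frame(f1, f0, f2, m_p0=-1, m_p1=+1)
# Update vectors are set to zero for simplicity; h_good is zero anyway.
low_good = low_frame(f0, no_h, h_good, 0, 0)

# Zero motion vectors (a deliberately wrong estimate): the residual keeps energy.
h_bad = high_frame(f1, f0, f2, m_p0=0, m_p1=0)
low_bad = low_frame(f0, no_h, h_bad, 0, 0)

print(h_good, low_good)
print(h_bad, low_bad)
```

With the accurate vectors the high-frequency frame is zero and the low-frequency frame equals f0. With the wrong vectors, part of the residual is added to f0 at the position the sample occupies in f1, so the low-frequency frame no longer shows a single clean sample.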
In such a lifting-based wavelet transform, noise and aliasing caused by low-pass filtering an input frame sequence along motion trajectories can be reduced during the update step in regions where a motion vector was accurately estimated. However, in regions where a motion vector was not accurately estimated, a low-pass-filtered frame may have serious visual artifacts such as ghosting. In other words, a video sequence reconstructed from selected low-pass-filtered frames, i.e., low-frequency frames, at a reduced temporal resolution, i.e., a low frame rate, has its video quality degraded by these visual artifacts. To reduce visual artifacts, various adaptive update schemes that introduce a weight function in the update step have been proposed, including “Response of CE 1e: Adaptive Update Step in MCTF” by G. Baud, J. Reichel, F. Ziliani, and D. Santa Cruz (ISO/IEC JTC1/SC29/WG11 MPEG M10987, Redmond, July 2004) and “Response of CE 1e in SVC: Content-Adaptive Update Based on Human Vision System” by L. Song, J. Xu, H. Xiong, and F. Wu (ISO/IEC JTC1/SC29/WG11 MPEG M11127, Redmond, July 2004).
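The general idea of weighting the update step can be sketched as follows. This Python sketch is purely illustrative and does not reproduce the weight functions of either cited proposal: the linear fall-off, the threshold value, and the function names are all assumptions. It scales each update sample down as its magnitude grows, on the premise that a large residual suggests an inaccurate motion vector.

```python
def update_weight(u, threshold=8.0):
    """Hypothetical weight: 1 for a zero update sample, 0 at or above threshold."""
    return max(0.0, 1.0 - abs(u) / threshold)


def adaptive_low_frame(f_even, update_term, threshold=8.0):
    """Low-frequency frame with a per-sample weighted, halved update term."""
    return [f + update_weight(u, threshold) * u / 2
            for f, u in zip(f_even, update_term)]


f_even = [0, 9, 0, 0]
update_term = [0.0, -2.0, 8.0, 1.0]  # the large third sample mimics a bad match

plain = [f + u / 2 for f, u in zip(f_even, update_term)]
adaptive = adaptive_low_frame(f_even, update_term)
print(plain)
print(adaptive)
```

In this sketch the large update sample is suppressed entirely while small samples pass almost unchanged, so the update step still filters well-matched regions but contributes nothing where the mismatch is large.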