Applying a linear transform in the temporal direction of a video sequence may not yield high compression efficiency if significant motion is prevalent. A linear transform along motion trajectories seems more suitable but requires a motion-adaptive transform for the input pictures.
For wavelet transforms, this adaptivity can be achieved by constructing the kernel with the so-called lifting scheme: a two-channel decomposition is realized by a sequence of prediction and update steps that form a ladder structure.
Adaptivity is permitted by incorporating motion compensation into the prediction and update steps, as proposed in U.S. Pat. No. 6,381,276 and the corresponding academic publication “Three-dimensional lifting schemes for motion compensated video compression”, in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Salt Lake City, Utah, May 2001, vol. 3, pp. 1793-1796. The fact that the lifting structure is able to map integers to integers without requiring invertible lifting steps makes this approach feasible.
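The ladder structure described above can be illustrated with a minimal sketch. It is not the method of the cited patent or publication: the function names are hypothetical, pictures are reduced to flat lists of integer samples, and motion compensation is omitted (a motion-compensated variant would pass a motion-compensated mapping of its input as `predict` or `update`). The sketch only shows why the steps themselves need not be invertible: the inverse recomputes the same prediction and update values and subtracts them again.

```python
# Illustrative two-channel lifting step on integer samples (hypothetical names).
# `predict` and `update` may be arbitrary, non-invertible mappings, e.g. with rounding.

def lift_forward(even, odd, predict, update):
    # Prediction step: high band = odd samples minus a prediction from the even samples.
    high = [o - p for o, p in zip(odd, predict(even))]
    # Update step: low band = even samples plus an update computed from the high band.
    low = [e + u for e, u in zip(even, update(high))]
    return low, high

def lift_inverse(low, high, predict, update):
    # Undo the update step first, then the prediction step (reverse ladder order).
    even = [l - u for l, u in zip(low, update(high))]
    odd = [h + p for h, p in zip(high, predict(even))]
    return even, odd

def identity_predict(even):
    # Assumption: no motion, so the even picture predicts the odd picture directly.
    return list(even)

def half_floor_update(high):
    # Floor division by two: a rounding operation that is not invertible by itself.
    return [h >> 1 for h in high]

even_picture = [10, 20, 7]
odd_picture = [13, 18, 7]
low_band, high_band = lift_forward(even_picture, odd_picture,
                                   identity_predict, half_floor_update)
restored_even, restored_odd = lift_inverse(low_band, high_band,
                                           identity_predict, half_floor_update)
```

With these Haar-like steps the integer inputs are recovered exactly, although the rounding in the update step discards information.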
The theoretical investigation in M. Flierl and B. Girod, “Investigation of motion-compensated lifted wavelet transforms”, in Proceedings of the Picture Coding Symposium, Saint-Malo, France, April 2003, pp. 59-62, models a motion-compensated subband coding scheme for a group of K pictures with a signal model for K motion-compensated pictures that are decorrelated by a linear transform. The Karhunen-Loeve Transform is utilized to obtain theoretical performance bounds at high bit-rates. A comparison to both optimum intra-frame coding of the input pictures and motion-compensated predictive coding is given.
Further, it is shown that the motion-compensated subband coding scheme can achieve bit-rate savings of up to 1 bit per sample and motion-accuracy step when compared to optimum intra-frame coding. Note that a motion-accuracy step corresponds to an improvement from, e.g., integer-pel to half-pel accuracy or half-pel to quarter-pel accuracy. Moreover, the above-mentioned document “Investigation of motion-compensated lifted wavelet transforms” demonstrates that this scheme can outperform predictive coding with motion compensation by at most 0.5 bits per sample.
Note that predictive coding fails for statistically independent signal components. In the worst case, the prediction error variance is two times the signal variance, which corresponds to a degradation of 0.5 bits per sample when assuming Gaussian signals.
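The 0.5-bit figure can be reproduced with the standard high-rate approximation for Gaussian signals. The following worked step assumes, for illustration, that the prediction error is the difference of two statistically independent components of equal variance; the symbols \sigma_s^2 (signal variance), \sigma_e^2 (prediction error variance) and \Delta R (rate difference) are introduced here and do not appear in the cited document.

```latex
\sigma_e^2 = \sigma_s^2 + \sigma_s^2 = 2\,\sigma_s^2,
\qquad
\Delta R = \frac{1}{2}\log_2\frac{\sigma_e^2}{\sigma_s^2}
         = \frac{1}{2}\log_2 2
         = 0.5 \text{ bits per sample}
```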
It is known that the efficiency of motion-compensated prediction can be improved by utilizing superimposed motion-compensated signals as employed in MPEG's B-pictures. Prediction with linear combinations of motion-compensated signals is also called multihypothesis motion-compensated prediction. B-pictures and overlapped block motion compensation are well-known examples.
The advantage of averaging multiple motion-compensated signals stems from the suppression of statistically independent noise components and, consequently, the improvement in prediction efficiency.
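The noise suppression mentioned above can be checked numerically. The sketch below is an illustration under simplifying assumptions that are not taken from the cited documents: the noise components are zero-mean, Gaussian and statistically independent, and all hypotheses are averaged with equal weights. The function names are hypothetical. Under these assumptions, averaging N signals with equal noise variance reduces the noise variance by a factor of N.

```python
import random
import statistics

def averaged_noise_variance(noise_variances):
    # Variance of the equally weighted average of independent noise components:
    # Var((n_1 + ... + n_N) / N) = (var_1 + ... + var_N) / N^2.
    n = len(noise_variances)
    return sum(noise_variances) / (n * n)

def simulated_noise_variance(n_signals, sigma, n_samples=20000, seed=0):
    # Monte Carlo check with a fixed seed: average n_signals independent
    # Gaussian noise samples and measure the variance of the result.
    rng = random.Random(seed)
    averaged = [
        sum(rng.gauss(0.0, sigma) for _ in range(n_signals)) / n_signals
        for _ in range(n_samples)
    ]
    return statistics.pvariance(averaged)

# Two hypotheses, each with noise variance 4.0 (standard deviation 2.0).
analytic = averaged_noise_variance([4.0, 4.0])
simulated = simulated_noise_variance(2, 2.0)
```

Here the analytic value is half the single-hypothesis noise variance, and the simulated value should lie close to it.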
The document M. Flierl and B. Girod, “Multihypothesis motion estimation for video coding”, in Proceedings of the Data Compression Conference, Snowbird, Utah, March 2001, pp. 341-350, investigates superimposed prediction with complementary motion-compensated signals.
The multiple motion-compensated signals with their associated displacement errors are chosen such that the superposition of the motion-compensated signals minimizes the degradation of the prediction signal due to the displacement errors and, consequently, improves prediction performance. Motion-compensated signals chosen according to this criterion are called complementary.
The investigation shows that two complementary motion-compensated signals already provide a large portion of the theoretically possible gain obtained with a very large number of complementary signals. In addition, the superposition of complementary motion-compensated signals also benefits from the suppression of statistically independent noise components.
It is observed that complementary motion-compensated signals achieve bit-rate savings of up to 2 bits per sample and motion-accuracy step when compared to optimum intra-frame coding. Note that the bit-rate savings for single-hypothesis motion-compensated prediction are limited to 1 bit per sample and motion-accuracy step.