Most video and image compression standards in commercial use, for example MPEG-2, MPEG-4, H.264 and JPEG, are based on block transforms, and the video standards among them are additionally based on block motion estimation. On the encoding side, motion estimation is performed against previously encoded and reconstructed reference pictures; the residual between the pixels of the current picture and the prediction obtained from motion estimation is transformed; the resulting coefficients are quantized, reordered into a one-dimensional sequence through a process called zigzag scanning, and further compressed with variable length coding (VLC); the result is then packed into a bit stream according to the specific syntax of the standard.
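The zigzag step mentioned above can be sketched as follows. This is a minimal illustration, assuming a square block stored as a list of rows and the conventional scan that starts at the top-left coefficient and walks the anti-diagonals in alternating directions; the function names `zigzag_order` and `zigzag_scan` are hypothetical and are not taken from any standard.

```python
def zigzag_order(n):
    """Return the (row, col) visiting order of a zigzag scan over an n x n block."""
    order = []
    for s in range(2 * n - 1):  # s = row + col indexes one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            # Even diagonals are walked from bottom-left to top-right.
            rows = reversed(rows)
        order.extend((r, s - r) for r in rows)
    return order


def zigzag_scan(block):
    """Flatten a square 2-D block into a 1-D list in zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


# A 4x4 block whose entries are their own raster positions 0..15.
raster = [[4 * r + c for c in range(4)] for r in range(4)]
print(zigzag_scan(raster))
# [0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15]
```

Because low-frequency coefficients sit near the top-left corner, this ordering tends to place the non-zero values first and leave a long run of zeros at the end of the sequence, which suits the subsequent variable length coding.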
On the decoding side, the inverse transform is an essential part of the decompression procedure and is usually very time-consuming. Reducing the computation required to decompress video is a major interest of the academic community, and the industry has also invested considerable effort in it in order to enable commercial applications built on these video standards.
Generally, fast algorithms are developed and used to reduce the computation required in both the forward transform and the inverse transform. These fast algorithms are usually derived by exploiting the symmetries among the transform matrix elements and the linearity of the transforms. They do not assume any other properties of the input data, so the fast algorithms for the forward transform and the inverse transform are usually very similar. Typical examples of these kinds of fast algorithms include the FFT (Fast Fourier Transform) and the various fast algorithms developed for the DCT (Discrete Cosine Transform).
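As a small illustration of how matrix symmetry yields a fast algorithm, the sketch below compares a direct matrix-vector product with a butterfly factorization for one 4-point integer transform. The matrix `C` is the one commonly given for the H.264 forward core transform; the function names are hypothetical, and the example is meant only to show the principle, not any particular codec implementation.

```python
# 4-point integer transform matrix (rows are basis vectors).
C = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]


def transform_direct(x):
    """Brute force: 16 multiplications and 12 additions."""
    return [sum(C[k][n] * x[n] for n in range(4)) for k in range(4)]


def transform_butterfly(x):
    """Fast form: 8 additions/subtractions and 2 doublings.

    Rows 0 and 2 are symmetric and rows 1 and 3 antisymmetric about the
    centre, so the sums and differences of mirrored inputs can be shared.
    """
    s0, s1 = x[0] + x[3], x[1] + x[2]
    d0, d1 = x[0] - x[3], x[1] - x[2]
    return [s0 + s1, 2 * d0 + d1, s0 - s1, d0 - 2 * d1]


x = [5, -3, 7, 2]
assert transform_direct(x) == transform_butterfly(x)
```

Note that the butterfly performs the same work regardless of the input values: it makes no use of any zeros that may be present in `x`.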
While these kinds of fast algorithms save many computations compared with brute-force methods, when applied to the inverse transform they fail to take advantage of other important properties of the input data (i.e., the transform coefficients) that can be used to further reduce the number of computations. Among the transform coefficients taken as input by the inverse transform, many become zero after quantization, which is another essential step in achieving video and image data compression on the encoding side. For example, it is quite common in an H.264-based video stream for a 4×4 block to have only one or two non-zero coefficients, while the generic fast algorithm blindly assumes that all 16 coefficients are non-zero.
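The effect of quantization on a coefficient block can be sketched as follows. This is a simplified illustration that assumes a plain uniform quantizer truncating toward zero; the coefficient values and the step size are made up for the example, and real codecs use more elaborate, standard-specific quantization rules.

```python
def quantize(block, step):
    """Uniform quantizer: divide each magnitude by `step`, truncating toward zero."""
    return [[(abs(c) // step) * (1 if c >= 0 else -1) for c in row] for row in block]


def count_nonzero(block):
    """Number of coefficients that survive quantization."""
    return sum(1 for row in block for c in row if c != 0)


# Hypothetical 4x4 block of transform coefficients (illustrative values only).
coeffs = [
    [52, -9,  3, 1],
    [-7,  4, -2, 0],
    [ 3, -1,  1, 0],
    [ 1,  0,  0, 0],
]

levels = quantize(coeffs, step=8)
print(levels)                 # [[6, -1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
print(count_nonzero(levels))  # 2
```

In this made-up block, 11 of the 16 coefficients are non-zero before quantization and only 2 remain afterwards.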
The number zero has two convenient properties: its product with any other number is zero, and its sum with any other number is simply that other number. If these simple properties are exploited, every multiplication and addition that involves a zero coefficient can be skipped, and the computation can be further reduced. Reducing the computation leads to other advantages as well; for example, handheld electronic devices that perform video decompression can run at a lower clock frequency and thereby save power.
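One way to exploit these properties is sketched below. The sketch assumes a linear 2-D reconstruction of the form X = AᵀYA with scaling omitted; the matrix `A`, the function names, and the coefficient values are illustrative assumptions rather than the method of any particular standard or decoder. By linearity, the output is a sum of one weighted basis pattern per coefficient, so zero coefficients contribute nothing and can be skipped entirely.

```python
# Illustrative 4x4 integer transform matrix (an assumption for this sketch).
A = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]
N = len(A)


def reconstruct_full(Y):
    """Compute X = A^T * Y * A by brute force, visiting all 16 coefficients."""
    return [[sum(A[i][r] * Y[i][j] * A[j][c] for i in range(N) for j in range(N))
             for c in range(N)] for r in range(N)]


def reconstruct_sparse(Y):
    """Same result, but accumulate only the contributions of non-zero coefficients.

    Returns the reconstructed block and the number of coefficients processed.
    """
    X = [[0] * N for _ in range(N)]
    processed = 0
    for i in range(N):
        for j in range(N):
            y = Y[i][j]
            if y == 0:
                continue  # zero times anything is zero; adding zero changes nothing
            processed += 1
            for r in range(N):
                w = A[i][r] * y
                for c in range(N):
                    X[r][c] += w * A[j][c]
    return X, processed


# A block with only two non-zero coefficients (illustrative values).
Y = [[6, -1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
X, processed = reconstruct_sparse(Y)
assert X == reconstruct_full(Y)
print(processed)  # 2
```

Here the sparse version accumulates 2 basis patterns instead of 16, so its cost scales with the number of non-zero coefficients rather than with the block size.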