The presence of multimedia capabilities on mobile terminals opens up a spectrum of applications, such as video conferencing, video telephony, security monitoring, information broadcast and other services. Video compression techniques enable the efficient transmission of digital video signals. Video compression algorithms take advantage of spatial correlation among adjacent pixels to derive a more efficient representation of the important information in a video signal.
The most powerful compression systems not only take advantage of spatial correlation, but can also utilize temporal correlations among adjacent frames to further boost the compression ratio. In such systems, differential encoding is used to transmit only the difference between an actual frame and a prediction of the actual frame. The prediction is based on information derived from a previous frame of the same video sequence.
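As a minimal sketch of the differential encoding described above, the following Python fragment transmits only the per-pixel difference between a frame and its prediction. It assumes the simplest possible prediction (the previous frame itself, with no motion compensation); the function names and the tiny 2x3 frames are hypothetical and chosen only for illustration.

```python
def encode_residual(actual, prediction):
    """Return the per-pixel difference that would be transmitted."""
    return [[a - p for a, p in zip(row_a, row_p)]
            for row_a, row_p in zip(actual, prediction)]


def decode_residual(residual, prediction):
    """Rebuild the actual frame from the residual and the same prediction."""
    return [[r + p for r, p in zip(row_r, row_p)]
            for row_r, row_p in zip(residual, prediction)]


# Hypothetical frames: only two pixels change between them.
previous = [[10, 10, 12], [10, 11, 12]]
current = [[10, 11, 12], [10, 11, 13]]

# Assumption: the prediction is simply the previous frame.
residual = encode_residual(current, previous)
reconstructed = decode_residual(residual, previous)
```

When adjacent frames are strongly correlated, most residual values are zero or near zero, which is what makes the difference cheaper to code than the frame itself.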
In motion compensation systems, motion vectors are derived by comparing a portion (i.e., a macroblock) of pixel data from a current frame to similar portions (i.e., a search area) of the previous frame. A motion estimator determines the closest match for the reference macroblock of the present image among the pixels of the previous image. The criterion used to evaluate similarity is usually the mean absolute difference between the reference macroblock and the pixels in the search area corresponding to a given search position. The use of motion vectors is very effective in reducing the amount of data to be transmitted.
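The matching step can be sketched as an exhaustive search that scores every candidate position with the mean absolute difference. The helper names `mad` and `full_search`, the block size, the search range, and the synthetic ramp frames are all assumptions made for illustration. The motion vector is expressed here as the displacement from the block's position in the current frame to its best match in the previous frame.

```python
def mad(cur, ref, bx, by, dx, dy, n):
    """Mean absolute difference between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total / (n * n)


def full_search(cur, ref, bx, by, n, p):
    """Score every displacement in [-p, p] x [-p, p] that stays inside `ref`
    and return (best_cost, dx, dy)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            inside = (0 <= bx + dx and bx + dx + n <= width and
                      0 <= by + dy and by + dy + n <= height)
            if not inside:
                continue
            cost = mad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Synthetic 8x8 frames (illustrative only): the current frame shows the
# previous frame's content displaced by (2, 1).
ref = [[7 * x + 13 * y for x in range(8)] for y in range(8)]
cur = [[7 * (x + 2) + 13 * (y + 1) for x in range(8)] for y in range(8)]

best_cost, mv_x, mv_y = full_search(cur, ref, bx=2, by=2, n=4, p=2)
```

With these frames the search recovers the displacement (2, 1) at a cost of zero, after evaluating all 25 candidate positions.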
The MPEG-4 simple profile, which is intended for wireless video applications, is representative of the current level of technology in low-bit-rate, error-resilient video coding. From the viewpoint of system design, all the proposed techniques have to be implemented in a highly power-constrained, battery-operated environment. Hence, to prolong battery life, system and algorithm parameters are preferably modified based upon the data being processed.
The source coding model of the MPEG-4 simple profile (which is based on the H.263 standard) employs block-based motion compensation to exploit temporal redundancy and the discrete cosine transform to exploit spatial redundancy. The motion estimation process is computationally intensive and accounts for a large percentage of the total encoding computations. Hence there is a need for methods that compute the motion vectors accurately and in a computationally efficient manner.
The fixed size block-matching (FSBM) technique for determining the motion vectors is the most computationally intensive of the known techniques, but it gives the best results because it evaluates every possible search position in the given search region. Techniques based on the unimodal error surface assumption, such as the N-step search and the logarithmic search, achieve a large, fixed reduction in computation irrespective of the content being processed. However, the drop in peak signal-to-noise ratio (PSNR) caused by convergence to local minima leads to a perceptible difference in visual quality, especially for high-activity sequences.
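The trade-off can be illustrated with a sketch of a three-step variant of the N-step search. The search is written against an abstract `cost(dx, dy)` function so that the shape of the error surface can be controlled directly; both error surfaces below are hypothetical and constructed only to show the behaviour, not taken from real video data.

```python
def n_step_search(cost, first_step=4):
    """Greedy N-step search: test the 8 neighbours of the current centre at
    the current step size, recentre on the best point, then halve the step.
    Returns ((dx, dy), number_of_cost_evaluations)."""
    cx, cy = 0, 0
    best_cost = cost(cx, cy)
    evaluations = 1
    step = first_step
    while step >= 1:
        centre_x, centre_y = cx, cy
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                if dx == 0 and dy == 0:
                    continue  # centre was already evaluated
                c = cost(centre_x + dx, centre_y + dy)
                evaluations += 1
                if c < best_cost:
                    best_cost, cx, cy = c, centre_x + dx, centre_y + dy
        step //= 2
    return (cx, cy), evaluations


def unimodal(dx, dy):
    """Hypothetical error surface with a single minimum at (5, -3)."""
    return abs(dx - 5) + abs(dy + 3)


def bimodal(dx, dy):
    """Hypothetical error surface: a broad local minimum at (-2, -1) and a
    narrow global minimum at (6, 6)."""
    return min(abs(dx + 2) + abs(dy + 1) + 2,
               4 * (abs(dx - 6) + abs(dy - 6)))


unimodal_result = n_step_search(unimodal)
bimodal_result = n_step_search(bimodal)

# Exhaustive search over the same +/-7 range, for comparison.
exhaustive = min((bimodal(dx, dy), dx, dy)
                 for dy in range(-7, 8) for dx in range(-7, 8))
```

In this sketch the three-step search always uses 25 evaluations, against 225 for an exhaustive search over the same range. On the unimodal surface it reaches the true minimum; on the bimodal surface it settles in the broad local minimum while the exhaustive search finds the global one, which is the kind of mismatch that lowers PSNR.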
The multi-resolution motion estimation technique for finding the motion vectors is computationally efficient compared to the FSBM algorithm. In this technique, coarse values of the motion vectors are obtained by performing the motion vector search on a low-resolution representation of the reference macroblock and the search area. This estimate is progressively refined at higher resolutions by searching within a small area around the coarse motion vectors (also referred to as candidate motion vectors, or CMVs) obtained from the coarser, lower-resolution level.
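A two-level version of this scheme might look like the following sketch. Several choices here are assumptions rather than anything stated in the text: the low-resolution image is formed by 2x2 averaging, the matching cost is the sum of absolute differences, the refinement window is one pixel wide, and the test frames are synthetic intensity ramps. All function names are hypothetical.

```python
def downsample(img):
    """Halve the resolution by averaging each 2x2 neighbourhood."""
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4
             for x in range(0, len(img[0]) - 1, 2)]
            for y in range(0, len(img) - 1, 2)]


def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences, or None if the candidate leaves `ref`."""
    height, width = len(ref), len(ref[0])
    if bx + dx < 0 or by + dy < 0 or bx + dx + n > width or by + dy + n > height:
        return None
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))


def search(cur, ref, bx, by, n, centres, radius):
    """Score every displacement within `radius` of each centre and return
    [((dx, dy), cost), ...] sorted by increasing cost."""
    scored = {}
    for cx, cy in centres:
        for dy in range(cy - radius, cy + radius + 1):
            for dx in range(cx - radius, cx + radius + 1):
                cost = sad(cur, ref, bx, by, dx, dy, n)
                if cost is not None:
                    scored[(dx, dy)] = cost
    return sorted(scored.items(), key=lambda item: item[1])


def multires_search(cur, ref, bx, by, n, coarse_radius, num_cmvs):
    """Coarse search at half resolution, then refine around the best CMVs."""
    coarse = search(downsample(cur), downsample(ref),
                    bx // 2, by // 2, n // 2, [(0, 0)], coarse_radius)
    # Scale the best coarse vectors back up to full resolution.
    cmvs = [(2 * dx, 2 * dy) for (dx, dy), _ in coarse[:num_cmvs]]
    fine = search(cur, ref, bx, by, n, cmvs, 1)
    return fine[0], cmvs


# Synthetic 16x16 ramps (illustrative only); true displacement is (3, -2).
ref = [[x + 16 * y for x in range(16)] for y in range(16)]
cur = [[(x + 3) + 16 * (y - 2) for x in range(16)] for y in range(16)]

best, cmvs = multires_search(cur, ref, bx=4, by=4, n=8,
                             coarse_radius=2, num_cmvs=1)
```

Here the coarse level cannot represent the odd horizontal displacement exactly, so it propagates the CMV (2, -2), and the one-pixel refinement at full resolution recovers (3, -2). The `num_cmvs` parameter controls how many coarse candidates are refined.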
The number of candidate motion vectors propagated to the higher-resolution images is usually fixed by the algorithm designer at a single value, irrespective of the sequence or the macroblock characteristics. Each CMV contributes a fixed number of computations. Hence, with a prefixed number of CMVs, either the PSNR obtained may be low if only a few CMVs are propagated, or the computational complexity becomes large if many CMVs are propagated. In a power-constrained environment, propagating many CMVs would reduce battery life. Fixed solutions for multi-resolution motion estimation therefore either have a high power requirement if PSNR is to be maintained, or may result in poor image quality when a fixed, low-complexity technique is used.
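A rough operation count makes the per-CMV cost concrete. The sketch below counts one absolute-difference operation per pixel per search position and assumes a two-level scheme with a half-resolution coarse search and a one-pixel refinement window around each CMV; overlapping refinement windows are counted separately. The macroblock size, the search range, and the function names are hypothetical.

```python
def full_search_ops(n, p):
    """Pixel comparisons for an exhaustive search of an n x n block
    over displacements in [-p, p] x [-p, p]."""
    return (2 * p + 1) ** 2 * n * n


def two_level_ops(n, p, num_cmvs, refine_radius=1):
    """Pixel comparisons for a half-resolution coarse search plus a small
    full-resolution refinement around each propagated CMV."""
    coarse = (2 * (p // 2) + 1) ** 2 * (n // 2) ** 2
    per_cmv = (2 * refine_radius + 1) ** 2 * n * n
    return coarse + num_cmvs * per_cmv


# Illustrative parameters: 16x16 macroblock, +/-16 pixel search range.
exhaustive_cost = full_search_ops(16, 16)
one_cmv_cost = two_level_ops(16, 16, num_cmvs=1)
four_cmv_cost = two_level_ops(16, 16, num_cmvs=4)
```

Under these assumptions each additional CMV adds 2,304 comparisons per macroblock (20,800 with one CMV, 27,712 with four, against 278,784 for the exhaustive search), so a count fixed in advance is paid on every macroblock whether or not the content needs it.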