Several digital video encoding standards have been developed during the past years, but the most important for the present and foreseeable future are:
MPEG-2, for television-like resolutions and high bit rates (greater than 1.5 Mbit/s), as used in digital video cameras and recordable-DVD applications;
MPEG-4 or H.263, for video telephony (especially on wireless mobile terminals), at lower resolutions (e.g., QCIF, 176 by 144 pixels) and lower bit rates (less than 1 Mbit/s).
While the following explanation refers primarily to MPEG-2, the same points apply in principle to the other standards listed, as can be gathered, e.g., from the ISO/IEC 13818-2 (MPEG-2) and ISO/IEC 14496-2 (MPEG-4) video coding standards.
The encoding process is based on several tasks in cascade, of which motion estimation is by far the most computationally expensive. The standard defines the output of the estimation block (a motion vector and the prediction error), but leaves open how the estimation is done, so that encoder providers can use a preferred estimation technique and implementation to add value to their box (lower cost, higher picture quality). After motion estimation, a set of decisions has to be made on how to encode each macroblock (MB, the “quantum” or basic building block into which every picture is decomposed for motion estimation). The predictor itself (i.e., the macroblock that the estimation process has found to best match the one currently being processed) must also be provided to the rest of the encoder chain.
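As an illustration of the three outputs named above (motion vector, predictor and prediction error), the following is a minimal sketch of one possible estimation technique: an exhaustive search that minimizes the sum of absolute differences (SAD). Since the standard leaves the search method open, this is only an example and not the estimation technique of any particular encoder; the function names, the list-of-rows frame layout and the default search range are assumptions made for the sketch.

```python
MB = 16  # macroblock edge in luma pixels


def sad(cur, ref, cx, cy, rx, ry, n=MB):
    """Sum of absolute differences between the n-by-n block of `cur`
    at (cx, cy) and the n-by-n block of `ref` at (rx, ry)."""
    total = 0
    for j in range(n):
        cur_row = cur[cy + j]
        ref_row = ref[ry + j]
        for i in range(n):
            total += abs(cur_row[cx + i] - ref_row[rx + i])
    return total


def full_search(cur, ref, cx, cy, search_range=4, n=MB):
    """Exhaustively test every displacement within +/- search_range.

    Frames are lists of rows of integer luma samples (hypothetical layout).
    Returns (motion_vector, cost, predictor, prediction_error).
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference picture.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    cost, dx, dy = best
    # The predictor is the best-matching block of the reference picture.
    predictor = [row[cx + dx:cx + dx + n] for row in ref[cy + dy:cy + dy + n]]
    # The prediction error is the sample-wise difference to the current block.
    error = [[cur[cy + j][cx + i] - predictor[j][i] for i in range(n)]
             for j in range(n)]
    return (dx, dy), cost, predictor, error
```

Each candidate displacement costs 256 absolute differences for a 16 by 16 macroblock, and the number of candidates grows with the square of the search range, which is why the search dominates the computational load of the encoder.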
All these operations require so much computational power that it is impractical to implement them even on very high-performance CPUs/DSPs without heavily compromising the overall picture quality of the encoded bitstream. On the other hand, supporting different standards and tuning the motion estimation algorithm requires means that can be programmed, or even re-programmed in the field, for example by downloading a new version of the algorithm to the terminal over the air. Because the motion estimation algorithm is not fixed by the standards, it is crucial in giving the overall encoder a competitive performance advantage: a better version of the algorithm can increase the perceived performance of the overall encoder.
Another key aspect of the motion estimation task is its memory bandwidth requirement. As an extensive search for the best match must be performed within very large search windows, all the algorithms tend to consume a large amount of system memory bandwidth. Typical bandwidth (B/W) figures for this task are in excess of 100 MB/s. This has two main drawbacks: expensive high-speed and/or wide-wordlength memory devices are required, and power consumption is increased, as higher external I/O activity means more power wasted on the device's heavily (capacitively) loaded external pins.
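A rough, back-of-the-envelope estimate shows how a figure of this order can arise. The sketch below assumes a naive engine that, for every macroblock, fetches the whole search window and the current macroblock from external memory, with no reuse of the overlap between neighbouring windows and with 8-bit luma samples only. The picture size, frame rate and search range used in the example are illustrative assumptions, not figures taken from the text.

```python
def me_bandwidth_bytes_per_s(width, height, fps, search_range,
                             mb=16, bytes_per_pixel=1):
    """Estimate external memory traffic of a naive motion estimator.

    Assumes (hypothetically) that each macroblock triggers a fresh fetch of
    its full search window plus the current macroblock, with no caching.
    """
    mbs_per_frame = (width // mb) * (height // mb)
    window_edge = mb + 2 * search_range  # window covers +/- search_range
    bytes_per_mb = (window_edge * window_edge + mb * mb) * bytes_per_pixel
    return mbs_per_frame * fps * bytes_per_mb


# 720x576 at 25 frames/s with a +/-16 pixel search range (assumed values)
print(me_bandwidth_bytes_per_s(720, 576, 25, 16))  # 103680000
```

Under these assumptions the traffic is about 104 MB/s, and it grows with the square of the search window edge, so any reuse of already-fetched reference data between adjacent macroblocks reduces it substantially.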
These reasons lead to the need for a motion estimation algorithm that has a low cost (low computational complexity) yet high performance in terms of subjective picture quality, and for a motion estimation engine that is equally cost-effective (low area), flexible (software programmable), low in bandwidth and low in power, as most of the applications target battery-powered mobile terminals (cameras, cellular phones).
Prior attempts by others are described, e.g., in EP-A-0 895 423, EP-A-0 895 426, EP-A-0 893 924, EP-A-0 831 642, U.S. Pat. No. 5,936,672 and U.S. Pat. No. 5,987,178.
Once the key characteristics of a motion estimator engine are identified, architectural solutions that can achieve those goals must be found. The required features are low cost (i.e., low area), low bandwidth, low power and high flexibility.