Motion estimation is the problem of identifying and describing the motion in a video sequence from one frame to the next. It is an important component of video codecs, as it greatly reduces the inherent temporal redundancy within video sequences. However, it also accounts for a large proportion of the computational effort. To estimate the motion of pixels between pairs of images, block matching algorithms (BMA) are regularly used, a typical example being the Exhaustive Search Algorithm (ESA) often employed in MPEG-2 encoders. Many researchers have proposed and developed algorithms to achieve better accuracy, efficiency and robustness. A common approach is to search in a coarse-to-fine pattern or to employ decimation techniques. However, the saving in computation is often at the expense of accuracy. This problem has been largely overcome by the successive elimination algorithm (SEA) (Lee X., and Zhang Y. Q., “A fast hierarchical motion-compensation scheme for video coding using block feature matching”, IEEE Trans. Circuits Systems Video Technol., vol. 6, no. 6, pp. 627-635, 1996). This produces identical results to the ESA with greatly reduced computation. However, block-based motion estimation remains a significant computational expense and is sensitive to noise. A further disadvantage of a block-based approach is that the motion vectors constitute a significant proportion of the bandwidth, particularly at low bit rates. This is one reason why standard systems such as MPEG-2 or H.263 use larger block sizes.
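By way of illustration only, the following minimal sketch shows an exhaustive block search and an elimination test of the kind described above. It assumes a sum-of-absolute-differences (SAD) matching cost and the commonly used block-sum lower bound, under which the absolute difference between two block sums can never exceed their SAD. The function names, the cost measure and the list-of-lists frame layout are assumptions made for the example and are not taken from the cited paper.

```python
def block_sum(img, y, x, n):
    """Sum of the n x n block whose top-left corner is (y, x)."""
    return sum(sum(row[x:x + n]) for row in img[y:y + n])


def sad(cur, ref, cy, cx, ry, rx, n):
    """Sum of absolute differences between two n x n blocks."""
    return sum(
        abs(cur[cy + i][cx + j] - ref[ry + i][rx + j])
        for i in range(n)
        for j in range(n)
    )


def block_search(cur, ref, cy, cx, n, radius, eliminate=False):
    """Find the displacement (dy, dx) minimising the SAD for one block.

    With eliminate=False every in-bounds candidate is evaluated
    (exhaustive search). With eliminate=True a candidate is skipped when
    the block-sum bound shows it cannot beat the best cost found so far.
    Returns (best_vector, best_cost, number_of_SADs_evaluated).
    """
    h, w = len(ref), len(ref[0])
    target_sum = block_sum(cur, cy, cx, n)
    best_cost, best_mv, evaluated = None, None, 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ry, rx = cy + dy, cx + dx
            if ry < 0 or rx < 0 or ry + n > h or rx + n > w:
                continue  # candidate block falls outside the reference frame
            if eliminate and best_cost is not None:
                # |sum(cur block) - sum(candidate)| <= SAD, so a bound that
                # already reaches best_cost rules the candidate out.
                bound = abs(target_sum - block_sum(ref, ry, rx, n))
                if bound >= best_cost:
                    continue
            cost = sad(cur, ref, cy, cx, ry, rx, n)
            evaluated += 1
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dy, dx)
    return best_mv, best_cost, evaluated
```

Because a skipped candidate can never have a cost strictly below the current best, both modes return the same vector and cost; only the number of SAD evaluations differs. For brevity this sketch recomputes each block sum from scratch, whereas a practical implementation would be expected to obtain the sums incrementally.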
In typical multimedia video sequences, many image blocks share a common motion, as scenes are often of low complexity. If more than half the pixels in a frame can be regarded as belonging to one object, we define the motion of this object as the dominant motion. This definition places no further restrictions on the dominant object type; it can be a large foreground object, the image background, or even fragmented. A model of the dominant motion represents an efficient motion coding scheme for low complexity applications such as those found in multimedia and has become a focus for research during recent years. For internet video broadcast, a limited motion compensation scheme of this type offers a fidelity enhancement without the overhead of full motion estimation.
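The "more than half" criterion in the above definition may be illustrated by the following minimal sketch. It assumes that a motion vector is already available for each block and that all blocks are of equal size, so that a majority of blocks stands in for a majority of pixels. The function name and the input format are hypothetical and serve only to illustrate the definition; they do not represent any particular estimation method.

```python
from collections import Counter


def dominant_motion(block_vectors):
    """Return the vector shared by more than half the blocks, else None.

    block_vectors is a list of (dy, dx) tuples, one per equal-sized block.
    """
    if not block_vectors:
        return None
    vector, count = Counter(block_vectors).most_common(1)[0]
    # A strict majority is required, mirroring "more than half the pixels".
    return vector if 2 * count > len(block_vectors) else None
```

Under these assumptions, a frame in which five of eight blocks move by (1, 0) has a dominant motion of (1, 0), whereas a frame split evenly between two motions has none.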
The use of a motion model can lead to more accurate computation of motion fields and reduces the problem of motion estimation to that of determining the model parameters. One of the attractions of this approach for video codec applications is that the model parameters use a very small bandwidth compared with that of a full block-based motion field.
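The scale of the bandwidth difference can be indicated with a short calculation. The frame size (352 x 288 pixels), the block sizes and the six-parameter model used for comparison below are illustrative assumptions only and are not figures taken from this description.

```python
def block_field_values(width, height, block_size):
    """Number of scalar values in a block-based motion field.

    Each block carries one two-component vector; partial blocks at the
    frame edges are counted as whole blocks.
    """
    blocks_x = -(-width // block_size)   # ceiling division
    blocks_y = -(-height // block_size)
    return 2 * blocks_x * blocks_y
```

With these assumed figures, a field of 16 x 16 blocks requires 792 values per frame and a field of 8 x 8 blocks requires 3168, against six values per frame for the hypothetical parametric model.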
Conventional approaches to estimating motion are typically complex and computationally expensive. In one standard approach, for example, least squares techniques are used to estimate parameter values which define average block motion vectors across the image. While such an approach frequently gives good results, it requires more computational effort than can always be justified, particularly when applied to low-complexity, low-bit-rate multimedia applications. The approach is also rather sensitive to outliers.
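The sensitivity to outliers can be seen in the simplest possible case. The sketch below assumes a purely translational model, for which the least squares estimate reduces to the mean of the block motion vectors. The component-wise median is included only as a point of comparison and is not put forward as the approach of any particular system; the function names and the sample data are hypothetical.

```python
from statistics import fmean, median


def least_squares_translation(vectors):
    """Least squares fit of a single translation: the mean vector."""
    return (fmean(v[0] for v in vectors), fmean(v[1] for v in vectors))


def median_translation(vectors):
    """Component-wise median, shown only as a robust point of comparison."""
    return (median(v[0] for v in vectors), median(v[1] for v in vectors))


# Nine blocks agree on (2, 1); one mismatched block reports (40, -30).
sample_vectors = [(2, 1)] * 9 + [(40, -30)]
ls_estimate = least_squares_translation(sample_vectors)
robust_estimate = median_translation(sample_vectors)
```

In this example the single erroneous vector moves the least squares estimate to approximately (5.8, -2.1), well away from the motion shared by the other nine blocks, while the median remains at (2, 1).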