Motion estimation is the process of determining motion vectors that describe the transformation from a first frame to a second frame in a video sequence. Motion compensation is the process of applying the motion vectors to the first frame to synthesize the transformation to the second frame. The combination of motion estimation and motion compensation forms a critical component of video compression as used by MPEG as well as many other video codecs. Each frame in a typical video sequence can largely be described as regions of another frame that have moved or otherwise changed. By exploiting this strong interframe correlation along the temporal dimension, motion estimation provides a means for reducing temporal redundancy and achieving video compression.
Motion vectors may relate to the whole image or to specific parts of it, such as rectangular blocks, arbitrarily shaped patches, or even individual pixels. Motion vectors may be represented by a translational model or by many other models that approximate the motion of a real video camera, such as rotation, translation, or zoom. There are various methods for finding motion vectors. One popular method is the block-matching algorithm (BMA), which finds, for a block in one frame, a matching block in another frame. Different search strategies, such as cross search, full search, spiral search, or three-step search, may be used in a BMA to evaluate candidate motion vectors over a predetermined neighborhood search window and find the optimum motion vector.
FIG. 1 (Prior Art) illustrates a motion estimation technique using a block-matching algorithm. In the example of FIG. 1, a video sequence comprises a current image frame 12 and a reference image frame 14. Each of the image frames comprises a plurality of image blocks. A block-matching algorithm is used to find, in reference frame 14, a reference block (matching block) for a current image block (search block) 16 from current frame 12. Block-matching algorithms use evaluation metrics such as mean square error (MSE), sum of absolute differences (SAD), or sum of squared differences (SSD) to determine whether a given block in reference frame 14 matches search block 16 in current frame 12. As illustrated in FIG. 1, a reference image block 20 is found to be a matching block by applying motion vector 18 with integer-pixel accuracy or sub-pixel accuracy.
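By way of illustration only, the block matching described above can be sketched as a full search that minimizes SAD over a square search window. The function names `sad` and `full_search`, the representation of a frame as a list of rows of integer samples, and the `radius` parameter are assumptions made for this sketch and do not appear in FIG. 1.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (cx, cy) and the n x n block of `ref` at (rx, ry)."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
    return total


def full_search(cur, ref, cx, cy, n, radius):
    """Evaluate every integer displacement (dx, dy) within +/- radius and
    return ((dx, dy), cost) for the lowest SAD. Candidates whose block
    would fall outside the reference frame are skipped."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

In this sketch the returned displacement plays the role of motion vector 18 at integer-pixel accuracy. The full search visits (2 * radius + 1) squared candidates per block, which is why reduced strategies such as three-step search are often considered.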
Theoretical and experimental analyses have established that sub-pixel accuracy has a significant impact on the performance of motion compensation. Sub-pixel accuracy is achieved mainly through interpolation. Various methods of performing interpolative upsampling in the spatial domain or the frequency domain have been proposed over the years. One major concern in implementing interpolative sub-pixel methods, however, is the computation cost. For example, to achieve one-eighth pixel accuracy, the image is upsampled by a factor of eight in each of its two dimensions, so an image-processing system needs to store and manipulate data arrays that are 64 times larger than those used for integer-pixel motion estimation.
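The growth in data size can be illustrated with a minimal sketch of spatial-domain upsampling. Bilinear interpolation is assumed here purely for simplicity; the text above does not name a particular interpolation filter, and practical systems may use longer filters. The names `bilinear_sample` and `upsample` are hypothetical.

```python
def bilinear_sample(img, x, y):
    """Sample `img` (a list of rows) at fractional, non-negative
    coordinates (x, y) using bilinear interpolation. Neighbors beyond
    the last row or column are clamped to the edge."""
    h, w = len(img), len(img[0])
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * img[y0][x0] + fx * img[y0][x1]
    bottom = (1 - fx) * img[y1][x0] + fx * img[y1][x1]
    return (1 - fy) * top + fy * bottom


def upsample(img, factor):
    """Return a grid with `factor` samples per pixel in each dimension,
    so the sample count grows by factor * factor."""
    h, w = len(img), len(img[0])
    return [[bilinear_sample(img, i / factor, j / factor)
             for i in range(w * factor)]
            for j in range(h * factor)]
```

Under these assumptions, calling `upsample(img, 8)` on a 4 x 4 image yields a 32 x 32 grid, that is, 1024 samples in place of 16, which is the factor of 64 noted above.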
Motion estimation is also commonly used in image registration, which finds a variety of applications in computer vision such as image matching, pattern recognition, and motion analysis. The Lucas-Kanade algorithm has proven to be a highly accurate image registration method and has been used in the computer vision and medical imaging industries for years. One major concern in applying the Lucas-Kanade algorithm to block-based motion estimation, however, is its high computational complexity. In addition, the Lucas-Kanade algorithm matches poorly when the starting point of the search is far from the optimum. It is therefore desirable to have a motion estimation method that reduces computation cost and complexity while maintaining high sub-pixel accuracy.
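For context, a single translation-only Lucas-Kanade update for one block can be sketched as follows. This is a simplified illustration and not a complete implementation of the algorithm: it assumes pure translation, central-difference gradients, and a block that lies at least one pixel inside the frame. The name `lucas_kanade_step` and the sign convention (the current block at position p is matched to the reference frame at p + d) are assumptions of the sketch.

```python
def lucas_kanade_step(cur, ref, x0, y0, n):
    """One least-squares update (dx, dy) for the n x n block at (x0, y0),
    based on the linearization ref(p + d) ~ ref(p) + grad(ref) . d and the
    requirement ref(p + d) ~ cur(p). Returns None when the 2 x 2 normal
    matrix is singular (flat or one-directional gradients)."""
    sxx = sxy = syy = bx = by = 0.0
    for y in range(y0, y0 + n):
        for x in range(x0, x0 + n):
            # Central-difference gradients of the reference frame.
            gx = (ref[y][x + 1] - ref[y][x - 1]) / 2.0
            gy = (ref[y + 1][x] - ref[y - 1][x]) / 2.0
            diff = cur[y][x] - ref[y][x]
            sxx += gx * gx
            sxy += gx * gy
            syy += gy * gy
            bx += gx * diff
            by += gy * diff
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return None
    # Solve [[sxx, sxy], [sxy, syy]] * (dx, dy) = (bx, by) by Cramer's rule.
    dx = (syy * bx - sxy * by) / det
    dy = (sxx * by - sxy * bx) / det
    return dx, dy
```

Because the update relies on a first-order approximation of the reference frame, it is reliable only for small displacements, which is consistent with the sensitivity to the starting point noted above. The per-pixel gradient and accumulation work, repeated over several iterations for every block, also gives a sense of the computational cost.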