An image sequence, such as a video image sequence, typically includes a sequence of image frames or pictures. The reproduction of video containing moving objects typically requires a frame rate of thirty image frames per second, with each frame possibly containing in excess of a megabyte of information, i.e., in excess of thirty megabytes per second of uncompressed data. Consequently, transmitting or storing such image sequences requires a large amount of either transmission bandwidth or storage capacity. To reduce the necessary transmission bandwidth or storage capacity, the frame sequence is compressed such that redundant information within the sequence is not stored or transmitted. Television, video conferencing and CD-ROM archiving are examples of applications which can benefit from efficient video sequence encoding.
Generally, information concerning the motion of objects in a scene from one frame to the next plays an important role in encoding an image sequence. Because of the high redundancy that exists between consecutive frames within most image sequences, substantial data compression can be achieved using a technique known as motion estimation/compensation (also known as motion-compensated interframe predictive video coding), which has been adopted by various international standards, such as ITU-T H.263, ISO MPEG-1 and MPEG-2.
In brief, rather than encoding each area of the current frame directly, the encoder encodes only the differences between that area and a correspondingly shifted area of a previously coded frame. Specifically, motion estimation is the process of determining the direction and magnitude of motion (motion vectors) for an area (e.g., a block or macroblock) in the current frame relative to one or more reference frames, whereas motion compensation is the process of using the motion vectors to generate a prediction (predicted image) of the current frame. The difference between the current frame and the predicted frame results in a residual signal (error signal), which typically contains substantially less information than the current frame itself. Thus, a significant saving in coding bits is realized by encoding and transmitting only the residual signal and the corresponding motion vectors.
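The relationship between motion vectors, the predicted frame and the residual signal can be illustrated with a minimal sketch. The function names, the dictionary layout of the motion vectors, the (dy, dx) sign convention and the clamping at frame edges are assumptions made for illustration only; they are not taken from any particular standard or encoder.

```python
def motion_compensate(reference, motion_vectors, block_size):
    """Predict the current frame by copying displaced blocks from the reference.

    motion_vectors maps each block's top-left (row, col) in the current frame
    to a (dy, dx) displacement into the reference frame (assumed convention).
    """
    height, width = len(reference), len(reference[0])
    predicted = [[0] * width for _ in range(height)]
    for (by, bx), (dy, dx) in motion_vectors.items():
        for y in range(by, min(by + block_size, height)):
            for x in range(bx, min(bx + block_size, width)):
                # Clamp so vectors pointing outside the frame reuse edge pixels.
                ry = min(max(y + dy, 0), height - 1)
                rx = min(max(x + dx, 0), width - 1)
                predicted[y][x] = reference[ry][rx]
    return predicted


def residual(current, predicted):
    """Pixel-wise difference between the current frame and its prediction."""
    return [[c - p for c, p in zip(c_row, p_row)]
            for c_row, p_row in zip(current, predicted)]


def energy(signal):
    """Sum of absolute values, used here as a rough measure of information."""
    return sum(abs(v) for row in signal for v in row)


# A 2x2 bright patch moves from the top-left to the top-right corner.
reference = [[9, 9, 0, 0],
             [9, 9, 0, 0],
             [0, 0, 0, 0],
             [0, 0, 0, 0]]
current = [[0, 0, 9, 9],
           [0, 0, 9, 9],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]
vectors = {(0, 0): (2, 0),   # uncovered background, predicted from below
           (0, 2): (0, -2),  # the patch, predicted from its old position
           (2, 0): (0, 0),
           (2, 2): (0, 0)}

predicted = motion_compensate(reference, vectors, 2)
compensated = residual(current, predicted)
uncompensated = residual(current, reference)
print(energy(uncompensated), energy(compensated))  # prints: 72 0
```

In this toy case the residual after motion compensation is entirely zero, so only the four motion vectors would need to be coded, whereas a plain frame difference leaves eight nonzero samples.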
One popular motion estimation method is the block matching algorithm (BMA), which estimates the displacements on a block-by-block basis. Namely, a current frame is divided into a number of blocks of pixels (referred to hereinafter as the current blocks). For each of these current blocks, a search is performed within a selected search area in the preceding frame for a block of pixels that "best" matches the current block. This search is typically accomplished by repetitively comparing a selected current block to similarly sized blocks of pixels in the selected search area of the preceding frame. Once a block match is found, the location of the matching block in the search area in the preceding frame relative to the location of the current block within the current frame defines a motion vector. This approach, i.e., comparing each current block to every candidate block in the selected search area, is known as the full search approach or the exhaustive search approach. The determination of motion vectors by the exhaustive search approach is computationally intensive, especially where the search area is particularly large. As such, these systems are relatively slow in processing the frames and may be of limited use in real-time applications.
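The exhaustive search described above can be sketched as follows. The text does not name a matching criterion, so the sum of absolute differences (SAD) is assumed here as the measure of the "best" match; the function names, the square search range and the rule of skipping candidates that fall outside the frame are likewise illustrative assumptions.

```python
def sad(current, reference, cy, cx, ry, rx, block_size):
    """Sum of absolute differences between two equally sized blocks."""
    total = 0
    for i in range(block_size):
        for j in range(block_size):
            total += abs(current[cy + i][cx + j] - reference[ry + i][rx + j])
    return total


def full_search(current, reference, cy, cx, block_size, search_range):
    """Compare the current block at (cy, cx) with every candidate block
    within +/- search_range pixels in the reference (preceding) frame."""
    height, width = len(reference), len(reference[0])
    best_vector, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            ry, rx = cy + dy, cx + dx
            # Skip candidates that do not lie fully inside the frame.
            if ry < 0 or rx < 0 or ry + block_size > height or rx + block_size > width:
                continue
            cost = sad(current, reference, cy, cx, ry, rx, block_size)
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (dy, dx), cost
    # The vector is the matching block's position relative to the current block.
    return best_vector, best_cost


# Reference frame with unique pixel values; the current block at (2, 2)
# is a copy of the reference block at (3, 1).
reference = [[y * 8 + x for x in range(8)] for y in range(8)]
current = [[0] * 8 for _ in range(8)]
for i in range(2):
    for j in range(2):
        current[2 + i][2 + j] = reference[3 + i][1 + j]

vector, cost = full_search(current, reference, 2, 2, 2, 2)
print(vector, cost)  # prints: (1, -1) 0
```

For a search range of R pixels and a block of N by N pixels, this evaluates up to (2R + 1)^2 candidates at N^2 pixel comparisons each, for every block in the frame, which is why the cost grows quickly as the search area is enlarged.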
Other motion estimation methods incorporate the concept of hierarchical motion estimation (HME), where an image is decomposed into a multiresolution framework, i.e., a pyramid. A hierarchical motion vector search is then performed, where the search proceeds from the lowest resolution to the highest resolution of the pyramid. Although HME has been demonstrated to be a fast and effective motion estimation method, the generation of the pyramid itself still consumes a significant number of computational cycles.
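A coarse-to-fine search over such a pyramid might look like the sketch below. Several details are assumptions rather than anything specified above: the pyramid is a mean pyramid built by averaging 2x2 pixel groups, SAD is the matching criterion, the coarsest level is searched within a given radius, and each finer level only refines the doubled vector by one pixel. All function names are hypothetical.

```python
def downsample(frame):
    """Halve the resolution by averaging each 2x2 group of pixels."""
    h, w = len(frame) // 2, len(frame[0]) // 2
    return [[(frame[2 * y][2 * x] + frame[2 * y][2 * x + 1]
              + frame[2 * y + 1][2 * x] + frame[2 * y + 1][2 * x + 1]) / 4
             for x in range(w)] for y in range(h)]


def build_pyramid(frame, levels):
    """pyramid[0] is the full-resolution frame, pyramid[-1] the coarsest."""
    pyramid = [frame]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def local_search(current, reference, cy, cx, size, center, radius):
    """Best vector (by SAD) among candidates within `radius` of `center`."""
    height, width = len(reference), len(reference[0])
    best, best_cost = center, None
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            ry, rx = cy + dy, cx + dx
            if ry < 0 or rx < 0 or ry + size > height or rx + size > width:
                continue  # candidate block falls outside the frame
            cost = sum(abs(current[cy + i][cx + j] - reference[ry + i][rx + j])
                       for i in range(size) for j in range(size))
            if best_cost is None or cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best


def hierarchical_search(current, reference, cy, cx, size, levels, radius):
    """Search the coarsest level first, then refine at each finer level."""
    cur_pyr = build_pyramid(current, levels)
    ref_pyr = build_pyramid(reference, levels)
    vector = (0, 0)
    for level in range(levels - 1, -1, -1):
        scale = 2 ** level
        if level < levels - 1:
            # A one-pixel shift at the coarser level is two pixels here.
            vector = (vector[0] * 2, vector[1] * 2)
        r = radius if level == levels - 1 else 1
        vector = local_search(cur_pyr[level], ref_pyr[level],
                              cy // scale, cx // scale,
                              max(size // scale, 1), vector, r)
    return vector


# A 4x4 bright patch sits at (10, 4) in the reference and at (8, 8) in the
# current frame, so the expected vector for the block at (8, 8) is (2, -4).
reference = [[0] * 16 for _ in range(16)]
current = [[0] * 16 for _ in range(16)]
for i in range(4):
    for j in range(4):
        reference[10 + i][4 + j] = 100
        current[8 + i][8 + j] = 100

vector = hierarchical_search(current, reference, 8, 8, 4, levels=2, radius=2)
print(vector)  # prints: (2, -4)
```

In this toy case the coarse level evaluates 25 candidates on 2x2 blocks and the fine level 9 candidates on 4x4 blocks, compared with the 81 full-resolution candidates a full search of radius 4 would need to reach the same vector. The calls to build_pyramid are the extra cost that the paragraph above refers to.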
Therefore, a need exists in the art for an apparatus and a concomitant method for reducing the computational complexity in determining motion vectors.