Video is typically represented by sequences of two-dimensional image frames or fields. In providing the information representing the sequence of images, however, a bandwidth issue may exist because of the amount of data that needs to be transmitted. In order to provide video information that accommodates potential bandwidth issues, video compression techniques are needed. In a typical compression technique, frames are removed from the sequence of images prior to providing the information, and the removed images then need to be reconstructed before the images are displayed. One method of reconstructing images is through the use of motion estimation. That is, image frames are constructed based on the estimated motion of objects displayed by the available image frames. More generally, motion estimation can be used for a variety of other video signal processing purposes as well.
Different techniques have been developed for the purposes of estimating motion using available image frames. One such technique of motion estimation employs a block matching technique. In block matching methods, an image is subdivided into square or rectangular blocks having constant sizes or having a prescribed plurality of picture elements, for example, 16×16 or 8×8 picture elements per block. Motion vectors representing the estimated motion from a previous search image are typically determined for each of the blocks of a reference image. In simple applications of the block matching method, the same motion vector for an entire block is attributed to each of the picture elements in the block. Generally, in determining the motion vector for each block of the reference image, a range of possible motion vectors is determined for a respective block using a least difference calculation. From the determined ranges, the motion vector to a block in the search image frame having the least calculated difference from a block in the present reference image is accepted as the motion vector for the block in the reference image frame. Image frames can then be constructed using the motion vectors to fill in those images that were removed during video compression.
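By way of illustration only, the basic block matching described above may be sketched as follows. The sketch assumes that frames are lists of rows of integer picture element values, that the least difference calculation is a sum of absolute differences, and that ties are resolved in favor of the shorter vector; the names sad and block_match, the block size, and the search radius are hypothetical and are not taken from any particular implementation.

```python
def sad(ref, search, bx, by, dx, dy, size):
    """Sum of absolute differences between the reference block at (bx, by)
    and the search-frame block displaced from it by (dx, dy)."""
    return sum(
        abs(ref[by + y][bx + x] - search[by + dy + y][bx + dx + x])
        for y in range(size)
        for x in range(size)
    )


def block_match(ref, search, size=4, radius=2):
    """Return {(bx, by): (dx, dy)} with one motion vector per reference block."""
    height, width = len(ref), len(ref[0])
    vectors = {}
    for by in range(0, height - size + 1, size):
        for bx in range(0, width - size + 1, size):
            best_key, best_mv = None, (0, 0)
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    # Skip displacements that leave the search frame.
                    if not (0 <= bx + dx <= width - size
                            and 0 <= by + dy <= height - size):
                        continue
                    # Assumed tie-break: equal costs favor the shorter vector.
                    key = (sad(ref, search, bx, by, dx, dy, size),
                           abs(dx) + abs(dy))
                    if best_key is None or key < best_key:
                        best_key, best_mv = key, (dx, dy)
            # Every picture element in the block shares this vector.
            vectors[(bx, by)] = best_mv
    return vectors
```

Calling block_match(ref, search) on two equally sized frames yields one vector per block of the reference frame, each pointing to the least-difference block in the search frame.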
In addition to the basic block matching technique previously described, refinements to the basic process have been developed to provide more accurate motion estimations. For example, one refinement provides an iterative block matching technique where the matching process is repeated for blocks of decreasing size. Following the determination of motion vectors for blocks of a first size, the blocks are then sub-divided into smaller child blocks. The child blocks then inherit the motion vector of the parent block in which they are located as a starting point from which a more precise motion vector for each of the child blocks may be calculated. In another refinement, block matching is based on hierarchical processing with different block sizes, starting with coarse resolution blocks at a higher level of the pyramid, and advancing through progressively finer resolution layers of the pyramid. A predicted motion vector that is calculated for each block is used to point to a search region in which a matching search block is likely to be located. However, as the resolution gets finer, and the effective block size gets smaller, there are objects for which the parent motion vectors no longer apply and new motion vectors need to be found.
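By way of illustration only, the iterative refinement in which child blocks inherit the motion vector of their parent block may be sketched as follows. The sketch assumes a sum of absolute differences as the matching cost, a split of each block into four equal child blocks, and a small search radius around the inherited vector; the names local_search and refine and all parameter values are hypothetical.

```python
def sad(ref, search, bx, by, dx, dy, size):
    """Sum of absolute differences for the block at (bx, by) displaced by (dx, dy)."""
    return sum(
        abs(ref[by + y][bx + x] - search[by + dy + y][bx + dx + x])
        for y in range(size)
        for x in range(size)
    )


def local_search(ref, search, bx, by, size, start, radius):
    """Best vector within `radius` of the starting vector `start`."""
    height, width = len(search), len(search[0])
    best_key, best_mv = None, start
    for dy in range(start[1] - radius, start[1] + radius + 1):
        for dx in range(start[0] - radius, start[0] + radius + 1):
            if not (0 <= bx + dx <= width - size
                    and 0 <= by + dy <= height - size):
                continue
            # Assumed tie-break: equal costs favor the vector nearest `start`.
            key = (sad(ref, search, bx, by, dx, dy, size),
                   abs(dx - start[0]) + abs(dy - start[1]))
            if best_key is None or key < best_key:
                best_key, best_mv = key, (dx, dy)
    return best_mv


def refine(ref, search, bx, by, size, start, min_size, radius=1):
    """Match a block, then split it into four child blocks that each inherit
    the block's vector as the starting point of their own search.
    Returns {(bx, by, size): (dx, dy)} for the smallest blocks only."""
    mv = local_search(ref, search, bx, by, size, start, radius)
    if size <= min_size:
        return {(bx, by, size): mv}
    half = size // 2
    result = {}
    for cy in (by, by + half):
        for cx in (bx, bx + half):
            result.update(
                refine(ref, search, cx, cy, half, mv, min_size, radius))
    return result
```

For example, refine(ref, search, 0, 0, 8, (0, 0), 4) treats an 8×8 frame as a single parent block and returns vectors for its four 4×4 child blocks. Because each child searches only near the inherited vector, the cost of the search stays small, but a child whose true motion lies far from the parent vector will not be matched correctly.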
In another refinement of the block matching technique, a quad tree-like sub-division process is used to refine the motion vector field from a coarse to a fine resolution. In this process, the motion vectors for the parent blocks of the reference frame are initially determined. Each parent block is then sub-divided into child blocks, each of which inherits the motion vector of its parent block. A refinement process is performed where a new motion vector may be attributed to a child block if a local block search around the vicinity of the block to which the inherited motion vector points yields a better matching search block. Thus, for each of the child blocks, a motion vector search is performed. However, the search is restricted to a set of candidate motion vectors consisting of the motion vectors associated with the parent and child blocks adjacent to the child block for which a motion vector is being determined. Although this block matching technique maintains smoothness of the motion field, it may be the case that the true motion vector is not found in the immediate surrounding neighborhood. That is, the candidates from which a motion vector is searched may be too constrained for accurate motion estimation for multiple non-planar motion surfaces, such as a scrolling text credits overlay, and multiple moving foreground and background objects in video scenes. Simply extending the neighborhood increases computational costs as well as the possibility of false matches.
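By way of illustration only, the candidate-restricted search and its limitation may be sketched as follows. The sketch assumes a sum of absolute differences as the matching cost and simplifies the neighborhood to the four blocks of equal size adjacent to the child block within a single motion vector field; the names neighbour_candidates and pick_from_candidates are hypothetical.

```python
def sad(ref, search, bx, by, dx, dy, size):
    """Sum of absolute differences for the block at (bx, by) displaced by (dx, dy)."""
    return sum(
        abs(ref[by + y][bx + x] - search[by + dy + y][bx + dx + x])
        for y in range(size)
        for x in range(size)
    )


def neighbour_candidates(field, bx, by, size, parent_mv):
    """Candidate set: the inherited parent vector plus the vectors already
    assigned to the four adjacent blocks. `field` maps (bx, by) -> (dx, dy)."""
    candidates = [parent_mv]
    adjacent = ((bx - size, by), (bx + size, by),
                (bx, by - size), (bx, by + size))
    for position in adjacent:
        mv = field.get(position)
        if mv is not None and mv not in candidates:
            candidates.append(mv)
    return candidates


def pick_from_candidates(ref, search, bx, by, size, candidates):
    """Return (cost, vector) for the lowest-cost candidate, or None if no
    candidate keeps the displaced block inside the search frame."""
    height, width = len(search), len(search[0])
    best = None
    for dx, dy in candidates:
        if not (0 <= bx + dx <= width - size
                and 0 <= by + dy <= height - size):
            continue
        cost = sad(ref, search, bx, by, dx, dy, size)
        if best is None or cost < best[0]:
            best = (cost, (dx, dy))
    return best
```

Only the listed candidates are ever evaluated. If the true motion of the child block is not among the vectors of its parent and neighbors, the returned cost remains nonzero even when an exact match exists elsewhere in the search frame, which is the constraint described above.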
Therefore, there is a need for a system and method that facilitates motion estimation that can escape typical neighborhood constraints in determining motion vectors.