The invention relates generally to encoding image information and more specifically to motion estimation.
Image information, for example, motion picture information, video information, animated graphics information, etc., often involves large amounts of data. Image information often includes data to represent coloring, shading, texturing, and transparency of every pixel in every frame of the image information. Each frame may include hundreds of thousands or even millions of pixels. The frames are often presented in rapid succession, for example, at a rate of 30 frames per second, to convey a sense of motion. Thus, without some form of data compression, such image information could easily involve hundreds of millions of bits per second.
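As a rough illustration of the data rate described above, the following minimal sketch multiplies out the figures for one uncompressed stream. The frame size (640 by 480 pixels) and color depth (24 bits per pixel) are assumptions chosen for illustration only. The 30 frames per second rate is the example given above.

```python
def raw_bit_rate(width, height, bits_per_pixel, frames_per_second):
    """Return the uncompressed data rate in bits per second."""
    return width * height * bits_per_pixel * frames_per_second


# Assumed example values: 640x480 pixels, 24 bits per pixel, 30 frames per second.
rate = raw_bit_rate(640, 480, 24, 30)
print(rate)  # 221184000
```

Under these assumed values the stream requires roughly 221 million bits per second, which is consistent with the order of magnitude stated above.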
To avoid the need to store and transmit such large amounts of data, data compression techniques have been developed. Some data compression techniques have been designed to compress image information. One example is the type of encoding specified by the Moving Picture Experts Group (MPEG). MPEG encoding produces a stream of different types of frames. These frames include intra frames and non-intra frames. The intra frames include sufficient information to reconstruct a frame of unencoded image information without the need to reference other frames of encoded data. The non-intra frames, however, provide information that refers to other information encoded in intra or other non-intra frames, which are called reference frames. Unencoded frames represented by the non-intra frames may be reconstructed by applying the information contained in the non-intra frames to the information contained in the intra or other non-intra frames to which the non-intra frames refer.
Since the amount of information stored in non-intra frames is much smaller than the amount of information in the unencoded frames that the non-intra frames represent, the use of non-intra frames can help greatly reduce the amount of image information that needs to be stored or transmitted. One aspect of the non-intra frames that allows them to contain less information than the unencoded frames they represent is that the non-intra frames essentially recycle image information found in the reference frames. For example, an unencoded frame represented by an intra frame may depict several objects. The objects may be located in several areas of the frame. Since the advantage of moving images, such as motion pictures, video, and animated graphics, over still images is the ability of the moving images to convey a sense of motion, the objects located in several areas of the frame often move to different areas when they appear in subsequent frames.
Since the image information needed to represent the appearance of the objects is present in the reference frames, that information may be recycled in non-intra frames. The non-intra frames contain the information needed to update the location of the objects without having to contain all of the information needed to express the appearance of the objects. Therefore, to encode non-intra frames, the change in the position of the objects represented in the non-intra frames must be determined relative to the position of the objects represented in the reference frames. This determination is referred to as motion estimation.
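The recycling of reference frame information can be sketched as follows. This is a simplified illustration, not the MPEG decoding procedure: the function name `predict_block`, the representation of a frame as a list of rows, and the motion vector convention (an offset from the block position in the current frame to the matching position in the reference frame) are all assumptions made for this example.

```python
def predict_block(reference, x, y, motion_vector, size):
    """Copy the size-by-size block that the motion vector points to in the reference frame.

    (x, y) is the top-left corner of the block in the current frame.
    motion_vector is an (mx, my) offset into the reference frame (assumed convention).
    """
    mx, my = motion_vector
    rows = reference[y + my : y + my + size]
    return [row[x + mx : x + mx + size] for row in rows]


reference = [
    [0, 0, 0, 0],
    [0, 7, 8, 0],
    [0, 9, 6, 0],
    [0, 0, 0, 0],
]

# The object at (1, 1) in the reference frame appears at (2, 2) in the current
# frame, so the block at (2, 2) is predicted from one pixel up and to the left.
print(predict_block(reference, 2, 2, (-1, -1), 2))  # [[7, 8], [9, 6]]
```

Only the motion vector needs to be stored for the block in this simplified case. The pixel values themselves are taken from the reference frame.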
One technique that has been used for motion estimation involves dividing an image into image blocks (i.e., square blocks of pixels within the image). For each block, a determination is then made as to where that block of pixels was located in the previous frame. This determination is made by matching the pattern of pixels one block at a time.
In greater detail, the process includes several steps. First, the current (non-intra, or predicted) frame is divided into blocks. Then, for each block, a collection of potentially matching blocks is defined in the reference (intra or predicted) frame. Then, for each potentially matching block, the absolute differences between corresponding pixels are added up to determine a score. The potentially matching block that has the best (i.e., lowest) score is then selected.
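The steps above can be sketched as an exhaustive search that scores each candidate by its sum of absolute differences. This is a minimal sketch under stated assumptions: frames are lists of rows of integer pixel values, the candidate collection is every position within a hypothetical `search_range` of the block position, and the names `sad` and `best_match` are illustrative.

```python
def sad(current, reference, cx, cy, rx, ry, size):
    """Sum of absolute differences between two size-by-size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(current[cy + dy][cx + dx] - reference[ry + dy][rx + dx])
    return total


def best_match(current, reference, cx, cy, size, search_range):
    """Return (score, mx, my) for the lowest-scoring candidate block.

    Candidates are all offsets within +/- search_range (an assumed way of
    defining the collection) that lie fully inside the reference frame.
    """
    height, width = len(reference), len(reference[0])
    best = None
    for my in range(-search_range, search_range + 1):
        for mx in range(-search_range, search_range + 1):
            rx, ry = cx + mx, cy + my
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate falls outside the reference frame
            score = sad(current, reference, cx, cy, rx, ry, size)
            if best is None or score < best[0]:
                best = (score, mx, my)
    return best


def make_frame(width, height, x, y):
    """Blank frame with a small 2x2 pattern whose top-left corner is at (x, y)."""
    frame = [[0] * width for _ in range(height)]
    frame[y][x], frame[y][x + 1] = 10, 20
    frame[y + 1][x], frame[y + 1][x + 1] = 30, 40
    return frame


reference = make_frame(8, 8, 2, 3)  # pattern at (2, 3) in the reference frame
current = make_frame(8, 8, 4, 4)    # same pattern at (4, 4) in the current frame
print(best_match(current, reference, 4, 4, 2, 3))  # (0, -2, -1)
```

A score of zero indicates an exact match, found here two pixels to the left of and one pixel above the block position in the current frame.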
The process may be implemented as a hierarchical search. In a hierarchical search, once a matching block is selected, the process is repeated at a finer resolution until a matching block of the desired resolution is identified.
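A two-level version of such a hierarchical search might look like the sketch below. It is an illustration under stated assumptions rather than a description of any particular encoder: the coarse level is produced by averaging 2x2 groups of pixels, the refinement examines only offsets within one pixel of the scaled-up coarse result, and block positions and sizes are assumed to be even. All function names are hypothetical.

```python
def downsample(frame):
    """Halve the resolution by averaging each 2x2 group of pixels."""
    h, w = len(frame) // 2, len(frame[0]) // 2
    return [
        [
            (frame[2 * y][2 * x] + frame[2 * y][2 * x + 1]
             + frame[2 * y + 1][2 * x] + frame[2 * y + 1][2 * x + 1]) // 4
            for x in range(w)
        ]
        for y in range(h)
    ]


def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between two size-by-size blocks."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(size)
        for i in range(size)
    )


def search(cur, ref, cx, cy, size, center, radius):
    """Best (score, mx, my) among offsets within `radius` of `center`."""
    h, w = len(ref), len(ref[0])
    best = None
    for my in range(center[1] - radius, center[1] + radius + 1):
        for mx in range(center[0] - radius, center[0] + radius + 1):
            rx, ry = cx + mx, cy + my
            if 0 <= rx and 0 <= ry and rx + size <= w and ry + size <= h:
                score = sad(cur, ref, cx, cy, rx, ry, size)
                if best is None or score < best[0]:
                    best = (score, mx, my)
    return best


def hierarchical_search(cur, ref, cx, cy, size, coarse_radius):
    """Search at half resolution first, then refine at full resolution."""
    cur_lo, ref_lo = downsample(cur), downsample(ref)
    _, mx, my = search(cur_lo, ref_lo, cx // 2, cy // 2, size // 2, (0, 0), coarse_radius)
    # Scale the coarse offset to full resolution and refine within +/- 1 pixel.
    return search(cur, ref, cx, cy, size, (2 * mx, 2 * my), 1)


def make_frame(size, x, y):
    """Blank square frame with a 4x4 textured pattern at (x, y)."""
    frame = [[0] * size for _ in range(size)]
    for j in range(4):
        for i in range(4):
            frame[y + j][x + i] = 40 + 10 * (4 * j + i)
    return frame


reference = make_frame(16, 2, 4)
current = make_frame(16, 8, 8)
print(hierarchical_search(current, reference, 8, 8, 4, 4))  # (0, -6, -4)
```

In this example the coarse search compares 2x2 blocks and the refinement examines only nine full-resolution candidates, instead of scoring every full-resolution candidate in the original search area.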
There are several drawbacks associated with such prior art techniques. For example, the step of defining a collection of potentially matching blocks in the reference frame often requires the examination of a very large number of potentially matching blocks. Also, the step of adding up the pixel differences to determine a score for each block requires a large number of operations. For example, for a 16×16 pixel block, 256 absolute differences must be added. Since these operations must be repeated for each potentially matching block, the overall process is cumbersome and inefficient. Thus, an efficient technique is needed to provide motion estimation for encoding sequential frames.
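The scale of this cost can be illustrated with a short calculation. The 16×16 block size comes from the example above. The search range of 16 pixels in each direction is an assumption made only for illustration, as is the function name.

```python
def absolute_differences_per_block(block_size, search_range):
    """Count the absolute differences needed to score every candidate for one block.

    Assumes an exhaustive search over all offsets within +/- search_range.
    """
    candidates = (2 * search_range + 1) ** 2
    differences_per_candidate = block_size * block_size
    return candidates * differences_per_candidate


print(16 * 16)                                     # 256
print(absolute_differences_per_block(16, 16))      # 278784
```

Under these assumptions, a single block requires 1,089 candidate comparisons of 256 absolute differences each, and that work is repeated for every block in every non-intra frame.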