1. Field
This disclosure relates generally to signal processing and, more specifically but not exclusively, to digital video encoding technologies.
2. Description
Image information (such as digital video information) is often transmitted from one electronic device to another. Such information is typically encoded and/or compressed to reduce the bandwidth required for transmission and/or to decrease the time necessary for transmission. In some configurations, information about differences between a current picture and a previous picture might be transmitted and the device receiving the image information may then, for example, decode and/or decompress the information (e.g., by using the previous picture and the differences to generate the current picture) and provide the image to a viewing device.
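The difference-based configuration described above can be illustrated with a minimal sketch. It assumes pictures are small 2-D lists of integer pixel values and that the differences are transmitted without loss; the function names `encode_difference` and `decode_difference` are hypothetical and chosen only for illustration.

```python
def encode_difference(previous, current):
    """Return the per-pixel differences (current - previous)."""
    return [[c - p for p, c in zip(prev_row, cur_row)]
            for prev_row, cur_row in zip(previous, current)]


def decode_difference(previous, diff):
    """Rebuild the current picture from the previous picture and the differences."""
    return [[p + d for p, d in zip(prev_row, diff_row)]
            for prev_row, diff_row in zip(previous, diff)]


# Illustrative 3x3 pictures: one bright pixel moves one column to the right.
previous = [[10, 10, 10],
            [10, 50, 10],
            [10, 10, 10]]
current = [[10, 10, 10],
           [10, 10, 50],
           [10, 10, 10]]

diff = encode_difference(previous, current)         # what would be transmitted
reconstructed = decode_difference(previous, diff)   # what the receiver rebuilds
```

In this sketch most entries of `diff` are zero, which is the property an encoder can exploit to reduce the amount of information transmitted.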
One of the key elements of many video encoding/compression schemes is motion estimation. A video sequence consists of a series of frames. The motion estimation technique exploits the temporal redundancy between adjacent frames to achieve compression by selecting a frame as a reference and predicting subsequent frames from the reference. The process of motion-estimation-based video compression is also known as inter-frame coding. Motion estimation is used with an assumption that the objects in the scene have only translational motion. This assumption holds as long as there is no camera pan, zoom, change in luminance, or rotational motion. However, for scene changes, inter-frame coding does not work well, because the temporal correlation between frames from different scenes is low. In this case, a second compression technique, intra-frame coding, is used.
Using the motion estimation technique, the current frame in a sequence of frames is predicted from at least one reference frame. The current frame is divided into N×N pixel macroblocks, typically 16×16 pixels in size. Each macroblock is compared to a region in the reference frame of the same size using some error measure, and the best matching region is selected. The search is conducted over a predetermined search area. A motion vector denoting the displacement of the region in the reference frame with respect to the macroblock in the current frame is determined. When a previous frame is used as a reference, the prediction is referred to as forward prediction. If the reference frame is a future frame, then the prediction is referred to as backward prediction. When both backward and forward predictions are used, the prediction is referred to as bidirectional prediction.
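The partitioning and the motion vector convention described above can be sketched as follows. This is an illustration only: it assumes the frame dimensions are multiples of the macroblock size, expresses positions as (row, column) of a block's top-left pixel, and uses the hypothetical helper names `macroblock_origins` and `motion_vector`.

```python
def macroblock_origins(height, width, n=16):
    """Yield the (row, col) of the top-left pixel of each n x n macroblock."""
    for row in range(0, height - n + 1, n):
        for col in range(0, width - n + 1, n):
            yield (row, col)


def motion_vector(block_origin, match_origin):
    """Displacement of the matching reference region relative to the macroblock."""
    return (match_origin[0] - block_origin[0],
            match_origin[1] - block_origin[1])


# A 48x64 frame holds 3 rows by 4 columns of 16x16 macroblocks.
origins = list(macroblock_origins(48, 64))

# A macroblock at (16, 16) whose best match lies at (14, 19) in the reference
# frame: two rows up and three columns to the right.
mv = motion_vector((16, 16), (14, 19))
```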
To reduce the computational overhead of the macroblock search, a search window within the reference frame is often identified and the macroblock is compared to various positions within the search window. The most effective yet computationally intensive way of comparing the macroblock to the search window is to compare the pixels of the macroblock to the pixels of the search window at every position that the macroblock may be moved to within the search window. This is referred to as a “full” or “exhaustive” search. For each position of the macroblock tested within the search window, each pixel of the macroblock is compared to a corresponding pixel in the search window. The comparison comprises computing a deviation between the values of the compared pixels.
Functions such as the sum of absolute differences (SAD), mean squared error (MSE), mean absolute error (MAE), or mean absolute difference (MAD) are often used to compare the pixels quantitatively. The deviations for each macroblock position are then accumulated, and the position within the search window that yields the smallest deviation is selected as the most likely position of the block in the reference frame. The difference between the position of the block in the current frame and its position in the reference frame is then used to derive the motion vector, which estimates the movement of the block between the reference frame and the current frame. The motion vector may then, for example, be transmitted as image information (e.g., instead of a full image frame) so that a decoder may render, recreate, or build the current frame by simply applying the motion vector information to the reference frame.
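For concreteness, the error measures named above can be written out as follows. This is a minimal sketch that assumes the two blocks have already been flattened into equal-length lists of pixel values; the function names are illustrative, and the pixel values are made up for the example.

```python
def sad(a, b):
    """Sum of absolute differences between corresponding pixels."""
    return sum(abs(x - y) for x, y in zip(a, b))


def mse(a, b):
    """Mean squared error: average of the squared pixel differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def mad(a, b):
    """Mean absolute difference: SAD divided by the number of pixels."""
    return sad(a, b) / len(a)


block = [10, 20, 30, 40]        # pixels of the macroblock
candidate = [12, 18, 30, 44]    # pixels of one candidate region

sad_value = sad(block, candidate)
mse_value = mse(block, candidate)
mad_value = mad(block, candidate)
```

For a fixed block size, SAD and MAD differ only by a constant factor and therefore rank candidate positions identically; SAD avoids the division, while MSE penalizes large individual pixel differences more heavily.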
There are various video encoding standards. The most common ones are the Moving Picture Experts Group (MPEG) Release Two (MPEG-2) standard published by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), the MPEG-4 Part 10 standard (also known as the Advanced Video Coding (AVC) standard) published by ISO/IEC, and the Society of Motion Picture and Television Engineers (SMPTE) 421M Video Codec standard (also known as the VC-1 standard). Although different standards share similar algorithmic ideas and require similar motion estimation mechanisms, the actual details often differ substantially. Motion estimation in general requires intensive computation and is desirably performed by hardware. Because the motion estimation used by each video encoding standard has its own distinctive features, each hardware implementation of motion estimation needs to be standard-specific, resulting in inefficient use of the hardware. Therefore, it is desirable to have a unified motion estimation hardware device that accommodates the special constraints of the various video standards.