Motion estimation (ME) in video coding may be used to improve video compression performance by removing or reducing temporal redundancy among video frames. For encoding an input block, traditional motion estimation may be performed at an encoder within a specified search window in one or more reference frames. This may allow determination of a motion vector (MV) that minimizes the sum of absolute differences (SAD) between the input block and a reference block in a reference frame. The MV information can then be transmitted to a decoder for motion compensation. The MV can be determined at fractional-pixel precision, in which case interpolation filters can be used to calculate the fractional-pixel values.
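The encoder-side search described above can be illustrated with a minimal sketch. It assumes integer-pixel precision, a single reference frame, frames stored as lists of rows, and an exhaustive (full) search. The function names `sad` and `full_search` and their parameters are hypothetical and chosen only for illustration; fractional-pixel refinement and interpolation are omitted.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at (bx, by)
    and the n x n block of `ref` at (rx, ry)."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Return ((dx, dy), cost) minimizing SAD over displacements in
    [-search_range, +search_range] horizontally and vertically.
    Candidates whose reference block falls outside `ref` are skipped."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


# Usage: a 16 x 16 reference frame with unique pixel values, and an input
# frame whose 4 x 4 block at (4, 4) is a copy of the reference block at (6, 3).
ref = [[16 * y + x for x in range(16)] for y in range(16)]
cur = [[0] * 16 for _ in range(16)]
for j in range(4):
    for i in range(4):
        cur[4 + j][4 + i] = ref[3 + j][6 + i]

print(full_search(cur, ref, 4, 4, 4, 3))  # ((2, -1), 0)
```

In this sketch the returned displacement is the MV that an encoder would signal to the decoder. A practical encoder would typically replace the exhaustive loop with a faster search pattern.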
Where original input frames are not available at the decoder, ME at the decoder can be performed using the reconstructed reference frames. When encoding a predicted frame (P frame), there may be multiple reference frames in a forward reference buffer. When encoding a bi-predictive frame (B frame), there may be multiple reference frames in the forward reference buffer and at least one reference frame in a backward reference buffer. For B frame encoding, mirror ME or projective ME may be performed to get the MV. For P frame encoding, projective ME may be performed to get the MV.
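As an illustration of decoder-side ME on reconstructed reference frames, the following is a minimal sketch of one common formulation of mirror ME. It rests on several assumptions not stated above: the current B frame lies midway between one forward and one backward reference frame, motion is linear so that the backward MV is the negation of the forward MV, and the matching cost is the SAD between the two reference blocks. The names `mirror_me`, `block_sad`, and `inside` are hypothetical.

```python
def block_sad(a, ax, ay, b, bx, by, n):
    """SAD between the n x n block of frame `a` at (ax, ay) and of `b` at (bx, by)."""
    return sum(abs(a[ay + j][ax + i] - b[by + j][bx + i])
               for j in range(n) for i in range(n))


def inside(frame, x, y, n):
    """True if the n x n block at (x, y) lies entirely within `frame`."""
    return 0 <= x and 0 <= y and x + n <= len(frame[0]) and y + n <= len(frame)


def mirror_me(fwd, bwd, bx, by, n, search_range):
    """For the current block position (bx, by), search a forward MV (dx, dy) and
    pair it with the mirrored backward MV (-dx, -dy). Returns
    (forward_mv, backward_mv, cost) for the pair with the lowest SAD,
    or None if no candidate pair fits inside both frames."""
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            fx, fy = bx + dx, by + dy      # candidate block in forward reference
            mx, my = bx - dx, by - dy      # mirrored block in backward reference
            if not (inside(fwd, fx, fy, n) and inside(bwd, mx, my, n)):
                continue
            cost = block_sad(fwd, fx, fy, bwd, mx, my, n)
            if best is None or cost < best[2]:
                best = ((dx, dy), (-dx, -dy), cost)
    return best


# Usage: the 4 x 4 block at (7, 8) in the forward reference reappears
# at (5, 4) in the backward reference; the current block sits at (6, 6).
fwd = [[16 * y + x for x in range(16)] for y in range(16)]
bwd = [[0] * 16 for _ in range(16)]
for j in range(4):
    for i in range(4):
        bwd[4 + j][5 + i] = fwd[8 + j][7 + i]

print(mirror_me(fwd, bwd, 6, 6, 4, 3))  # ((1, 2), (-1, -2), 0)
```

Because only reconstructed reference frames are read, the same search can be repeated at the decoder without access to the original input frame. Unequal temporal distances would require scaling the mirrored MV, which this sketch does not attempt.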
In other contexts, a block-based motion vector may be produced at the video decoder by performing motion estimation on available previously decoded pixels with respect to blocks in one or more frames. The available pixels could be, for example, spatially neighboring blocks in the sequential scan coding order of the current frame, blocks in a previously decoded frame, or blocks in a downsampled frame in a lower layer when layered coding has been used. The available pixels can alternatively be a combination of the above-mentioned blocks.
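One way to perform ME on spatially neighboring decoded pixels is template matching, sketched below under stated assumptions: the available pixels form an L-shaped region of thickness `t` above and to the left of the current block, the search is carried out in a single previously decoded frame, and the cost is SAD over the template only. The technique choice, the function names `template_offsets` and `template_match`, and all parameters are illustrative assumptions rather than a description of any particular decoder.

```python
def template_offsets(n, t):
    """Offsets (relative to the block's top-left corner) of an L-shaped
    template of thickness t above and to the left of an n x n block."""
    offsets = []
    for oj in range(-t, 0):            # rows above the block, incl. the corner
        for oi in range(-t, n):
            offsets.append((oi, oj))
    for oj in range(0, n):             # columns to the left of the block
        for oi in range(-t, 0):
            offsets.append((oi, oj))
    return offsets


def template_match(cur_recon, ref, bx, by, n, t, search_range):
    """Return ((dx, dy), cost) minimizing SAD between the template around the
    current block at (bx, by) and the same-shaped template in `ref`.
    Only already-decoded template pixels of `cur_recon` are read; the
    current block itself is never accessed. Requires bx >= t and by >= t."""
    height, width = len(ref), len(ref[0])
    offsets = template_offsets(n, t)
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            if rx - t < 0 or ry - t < 0 or rx + n > width or ry + n > height:
                continue
            cost = sum(abs(cur_recon[by + oj][bx + oi] - ref[ry + oj][rx + oi])
                       for oi, oj in offsets)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


# Usage: the template around the current block at (6, 6) is a copy of the
# template around position (8, 5) in the previously decoded frame.
ref = [[16 * y + x for x in range(16)] for y in range(16)]
cur_recon = [[0] * 16 for _ in range(16)]
for oi, oj in template_offsets(4, 2):
    cur_recon[6 + oj][6 + oi] = ref[5 + oj][8 + oi]

print(template_match(cur_recon, ref, 6, 6, 4, 2, 3))  # ((2, -1), 0)
```

Since the template consists solely of pixels the decoder has already reconstructed, the derived MV would not need to be transmitted in this sketch.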