In a traditional video coding system, motion estimation (ME) may be performed at an encoder to obtain motion vectors used to predict the motion of a current encoding block. The motion vectors may then be encoded into a binary stream and transmitted to the decoder. This allows the decoder to perform motion compensation for the current decoding block. In some advanced video coding standards, e.g., H.264/AVC, a macroblock (MB) can be partitioned into smaller blocks for encoding, and a motion vector can be assigned to each sub-partitioned block. As a result, if the MB is partitioned into 4×4 blocks, there may be up to 16 motion vectors for a predictive coding MB and up to 32 motion vectors for a bi-predictive coding MB, which may represent significant overhead. Because motion coding blocks have strong temporal and spatial correlations, motion estimation may instead be performed at the decoder side, based on reconstructed reference pictures or reconstructed spatially neighboring blocks. This may let the decoder derive the motion vectors for the current block itself, instead of receiving them from the encoder. This decoder-side motion vector derivation (DMVD) method may increase the computational complexity of the decoder, but it can improve the efficiency of an existing video codec system by saving bandwidth.
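The motion vector counts cited above follow from simple arithmetic, which may be sketched as below. This is only an illustration: the function name `motion_vectors_per_mb` and its parameters are hypothetical, and the sketch assumes a 16×16 luma macroblock with one motion vector per partition for predictive coding and two for bi-predictive coding.

```python
MB_SIZE = 16  # assumed macroblock size in luma samples (16x16, as in H.264/AVC)


def motion_vectors_per_mb(block_w, block_h, bi_predictive=False):
    """Upper bound on motion vectors for one MB split into block_w x block_h blocks.

    Hypothetical helper for illustration only: assumes one motion vector per
    partition for predictive coding and two for bi-predictive coding.
    """
    if MB_SIZE % block_w or MB_SIZE % block_h:
        raise ValueError("partition size must divide the macroblock size")
    partitions = (MB_SIZE // block_w) * (MB_SIZE // block_h)
    return partitions * (2 if bi_predictive else 1)


if __name__ == "__main__":
    print(motion_vectors_per_mb(4, 4))                      # predictive MB, 4x4 blocks
    print(motion_vectors_per_mb(4, 4, bi_predictive=True))  # bi-predictive MB, 4x4 blocks
```

Under these assumptions, a 4×4 partitioning yields 16 partitions, and therefore 16 or 32 motion vectors per MB, each of which would otherwise have to be signaled in the binary stream.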
On the decoder side, if a block is encoded using a DMVD method, its motion vector becomes available only after the decoder side motion estimation has been performed. This may affect a parallel decoding implementation in the following two respects. First, if the decoder side motion estimation uses spatially neighboring reconstructed pixels, the decoding of a DMVD block can only be started after all of its neighboring blocks (which contain the pixels used in the motion estimation) have been decoded. Second, if one block is encoded in DMVD mode, its motion vectors may be used for the motion vector prediction of its neighboring blocks. The decoding of those neighboring blocks, which use the motion vectors of the current DMVD coded block for motion vector prediction, can therefore only be started after the motion estimation of the current DMVD block has finished. These dependencies may slow decoding. In particular, the processing at the decoder side may be less amenable to parallel DMVD algorithms.
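The effect of such dependencies on parallel decoding may be illustrated with a small scheduling sketch. It is not taken from any particular decoder: the function name `earliest_decode_steps` is hypothetical, and the sketch assumes, purely for illustration, that every block is a DMVD block that depends on its left and top neighbors and that each block takes one time step to decode.

```python
def earliest_decode_steps(rows, cols, dmvd_dependent=True):
    """Earliest time step at which each block in a rows x cols grid can start.

    Illustrative assumption: when dmvd_dependent is True, a block may start only
    after its left and top neighbors have finished; otherwise all blocks are
    independent and may start at step 0.
    """
    steps = [[0] * cols for _ in range(rows)]
    if not dmvd_dependent:
        return steps
    for y in range(rows):
        for x in range(cols):
            deps = []
            if x > 0:
                deps.append(steps[y][x - 1])  # left neighbor
            if y > 0:
                deps.append(steps[y - 1][x])  # top neighbor
            # A block starts one step after its latest dependency finishes.
            steps[y][x] = (1 + max(deps)) if deps else 0
    return steps


if __name__ == "__main__":
    for row in earliest_decode_steps(3, 3):
        print(row)
```

Under these assumptions, the last block of a 3×3 grid cannot start before step 4, whereas without the dependencies all nine blocks could start at step 0.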
In addition, motion estimation at the decoder side may, in some implementations, require a search among possible motion vector candidates in a search window. The search may be an exhaustive search or may rely on any of several known fast search algorithms. Even if a relatively fast search algorithm is used, a considerable number of candidates may have to be evaluated before the best candidate can be found. This too represents an inefficiency in processing at the decoder side.
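The cost of an exhaustive search may be sketched as follows. This is a minimal illustration under stated assumptions: the function name `exhaustive_search`, the square window of ±`search_range` integer positions, and the caller-supplied `cost` function are all hypothetical. In practice the cost might be, for example, a sum of absolute differences between reconstructed pixels, but the matching criterion is not specified here.

```python
def exhaustive_search(cost, search_range):
    """Evaluate every integer candidate (dx, dy) in a square search window.

    cost: hypothetical callable returning a matching cost for a candidate
          motion vector (lower is better).
    Returns the best motion vector, its cost, and the number of candidates
    that were evaluated.
    """
    best_mv, best_cost, evaluated = None, float("inf"), 0
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            c = cost(dx, dy)
            evaluated += 1
            if c < best_cost:
                best_mv, best_cost = (dx, dy), c
    return best_mv, best_cost, evaluated


if __name__ == "__main__":
    # Toy cost with its minimum at (3, -2); a stand-in for a real matching metric.
    toy_cost = lambda dx, dy: abs(dx - 3) + abs(dy + 2)
    print(exhaustive_search(toy_cost, 8))
```

With a window of ±8 positions, the loop evaluates (2·8 + 1)² = 289 candidates for a single block, which illustrates why the search may dominate the decoder side processing.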