In block-based video compression, motion-compensated prediction is typically performed as follows. A transmitted motion vector represents the displacement between a block in the current frame and a corresponding block in a previously decoded and reconstructed reference frame. A predicted block is generated for the current block from the displaced block in the reference frame. A residual block is decoded from transmitted residual information. A reconstructed block is generated by adding the residual block to the predicted block. In the case of bi-prediction, two motion vectors identify two corresponding reference blocks, which are combined to generate the predicted block either by sample-by-sample averaging or, alternatively, by using weight factors that are different from 0.5.
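The reconstruction steps described above can be illustrated with a minimal sketch. It assumes integer-sample motion vectors, 8-bit samples, frames stored as lists of rows, and clamping of out-of-frame coordinates to the frame edge; practical codecs typically also interpolate fractional-sample positions and use integer weights. The function names `fetch_block`, `bi_predict`, and `reconstruct` are hypothetical and chosen only for illustration.

```python
def fetch_block(ref, x, y, mv, size):
    """Return the size x size block of `ref` displaced by mv = (dx, dy).

    Assumption: integer motion vector; coordinates outside the frame
    are clamped to the nearest edge sample.
    """
    height, width = len(ref), len(ref[0])
    dx, dy = mv
    return [
        [
            ref[min(max(y + dy + j, 0), height - 1)][min(max(x + dx + i, 0), width - 1)]
            for i in range(size)
        ]
        for j in range(size)
    ]


def bi_predict(pred0, pred1, w0=0.5):
    """Combine two predicted blocks sample by sample.

    w0 = 0.5 gives plain averaging; any other value gives weighted
    prediction with weights w0 and 1 - w0. Results are rounded half up.
    """
    return [
        [int(w0 * a + (1.0 - w0) * b + 0.5) for a, b in zip(row0, row1)]
        for row0, row1 in zip(pred0, pred1)
    ]


def reconstruct(pred, residual, max_val=255):
    """Add the decoded residual to the prediction and clip to the sample range."""
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(row_p, row_r)]
        for row_p, row_r in zip(pred, residual)
    ]
```

In this sketch a uni-predicted block would be obtained as `reconstruct(fetch_block(ref, x, y, mv, size), residual)`, while a bi-predicted block would first pass two fetched blocks through `bi_predict`.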
At the decoder side, motion vectors can be determined in multiple ways. One way involves transmission of a motion vector difference relative to a motion vector predictor that is known to both the encoder and the decoder. Another way involves transmission of an index that selects among a set of candidate vectors, typically taken from neighbor blocks in the same picture or from a collocated block in a previously transmitted picture (e.g., skip mode in H.264 and use of the merge candidate list in H.265). Still another way is direct mode in H.264 B frames (or temporal merge candidates in H.265), in which the motion vectors used for bi-prediction are derived from a collocated block in a previously transmitted picture. Which method to use for a particular block is typically chosen by the encoder and signaled to the decoder as side information.
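The three derivation methods described above can be sketched as follows. This is a simplified illustration, not the normative procedure of any standard: the function names and the `mode` strings are hypothetical, candidate list construction is omitted, and the temporal derivation assumes a plain linear scaling of the collocated vector by picture distances (`tb` from the current picture to its first reference, `td` between the two pictures spanned by the collocated vector), whereas actual codecs use fixed-point scaling with clipping.

```python
def mv_from_difference(mvp, mvd):
    """Method 1: add the transmitted difference to the shared predictor."""
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])


def mv_from_index(candidates, index):
    """Method 2: a transmitted index selects one vector from a candidate list."""
    return candidates[index]


def mvs_from_collocated(mv_col, tb, td):
    """Method 3: derive both bi-prediction vectors from a collocated vector.

    Assumption: linear scaling by temporal distance with floor division,
    used here only to keep the sketch in integer arithmetic.
    """
    mv_l0 = tuple((c * tb) // td for c in mv_col)
    mv_l1 = tuple(a - c for a, c in zip(mv_l0, mv_col))
    return mv_l0, mv_l1


def derive_motion(mode, **side_info):
    """Dispatch on the method signaled as side information (hypothetical mode names)."""
    if mode == "difference":
        return mv_from_difference(side_info["mvp"], side_info["mvd"])
    if mode == "index":
        return mv_from_index(side_info["candidates"], side_info["index"])
    if mode == "collocated":
        return mvs_from_collocated(side_info["mv_col"], side_info["tb"], side_info["td"])
    raise ValueError("unknown derivation mode: " + str(mode))
```

The sketch reflects that only the first method spends bits on vector components; the other two rely on information the decoder already holds, at the cost of less freedom in the choice of vector.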