One of the important elemental technologies in video encoding, as represented by the H.264 standard, is motion-compensated inter-frame prediction. In order to efficiently encode a motion vector (MV) in motion-compensated inter-frame prediction, predictive encoding of the MV is performed (for example, see Non-Patent Document 1). FIG. 13 is a block diagram illustrating the configuration of a video encoding device using motion compensation according to the related art. In FIG. 13, an encoding unit 300 based on the motion compensation performs encoding based on the motion compensation. A motion estimation unit 310 estimates the motion of an image through a motion search. A MV storage unit 320 stores a MV calculated through the motion estimation.
A MV prediction unit 330 predicts a MV from encoded MV prediction information for MV prediction coding. A reference block MV extraction unit 331 extracts a MV of a reference block for use in prediction of the MV. A median calculation unit 332 calculates the median of the MV extracted from the reference block. A prediction residual calculation unit 340 calculates the difference between the MV and a predicted MV (hereinafter referred to as a predicted vector). A code allocation unit 350 outputs an encoded stream by allocating a variable length code to a quantized transform coefficient or a prediction residual signal (referred to as a prediction error vector) of the MV.
When a video signal of the encoding target block is input, the motion estimation unit 310 performs a motion search by matching the input video signal against a decoded signal of an encoded reference image, and calculates a MV. The calculated MV is input to the encoding unit 300 based on the motion compensation. In the encoding unit 300 based on the motion compensation, a residual signal between the video signal and the predicted signal is obtained through motion compensation using the MV and encoded by an orthogonal transform, quantization, or the like. A quantized value of a processing result or the like is encoded by the code allocation unit 350 and the encoded quantized value is output as an encoded stream. On the other hand, predictive encoding is also performed to reduce the code bit amount for the MV. For this purpose, the MV calculated by the motion estimation unit 310 is stored in the MV storage unit 320 for future reference. The MV prediction unit 330 calculates a predicted vector using an encoded MV.
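The motion search performed by the motion estimation unit 310 can be illustrated by the following minimal sketch. It is not taken from the related art itself: the exhaustive (full) search, the sum of absolute differences (SAD) as the matching cost, the sign convention of the MV, and the names sad and motion_search are all assumptions made for illustration. Pictures are represented as lists of rows of sample values.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the target block at (bx, by) in
    the current picture and the block displaced by (dx, dy) in the
    reference picture. SAD is an assumed matching cost."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def motion_search(cur, ref, bx, by, size, search_range):
    """Exhaustive search over displacements within +/- search_range.

    Returns (mv, cost), where mv = (dx, dy) is the displacement from the
    target block position to the best-matching reference block (assumed
    sign convention).
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that fall outside the reference picture.
            if bx + dx < 0 or bx + dx + size > width:
                continue
            if by + dy < 0 or by + dy + size > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            # Keep the first candidate with the strictly lowest cost.
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

A practical encoder would typically also weigh the code amount of the MV in the search cost and use a faster search pattern; both are omitted here.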
In prediction of the MV in the MV prediction unit 330, first, the reference block MV extraction unit 331 extracts MVs from the MV storage unit 320 by designating encoded blocks in the vicinity of a prediction target block (encoding target block) B0 of an encoding target image (also referred to as an encoding target picture or frame) illustrated in FIG. 14 as reference blocks B1 to B3. FIG. 14 is a diagram illustrating an example of the prediction target block of the encoding target image.
Next, the median calculation unit 332 calculates the median of each MV component over the reference blocks B1 to B3, and generates a predicted vector from the calculated medians. This predicted vector generation method is referred to as spatial median prediction. The prediction residual calculation unit 340 calculates a difference (prediction error vector) between the MV and the predicted vector, and transmits the prediction error vector to the code allocation unit 350. The prediction error vector is variable-length encoded by the code allocation unit 350, and the encoded prediction error vector is output as an encoded stream.
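As a minimal sketch of spatial median prediction and of the prediction error vector described above, the following assumes that each MV is an (x, y) pair of integers and that all three reference blocks B1 to B3 have a MV available. The function names median3, predict_mv, and prediction_error are hypothetical and chosen only for illustration.

```python
def median3(a, b, c):
    """Median of three scalar values."""
    return sorted((a, b, c))[1]


def predict_mv(mv_b1, mv_b2, mv_b3):
    """Predicted vector: the median is taken separately for the horizontal
    and vertical components of the MVs of reference blocks B1 to B3."""
    return (
        median3(mv_b1[0], mv_b2[0], mv_b3[0]),
        median3(mv_b1[1], mv_b2[1], mv_b3[1]),
    )


def prediction_error(mv, predicted):
    """Prediction error vector: MV of the target block minus the predicted vector."""
    return (mv[0] - predicted[0], mv[1] - predicted[1])


if __name__ == "__main__":
    predicted = predict_mv((4, -2), (6, 0), (5, 3))
    print(predicted)                            # (5, 0)
    print(prediction_error((7, 1), predicted))  # (2, 1)
```

Because the median is taken per component, the predicted vector need not coincide with the MV of any single reference block; in the example, (5, 0) combines the horizontal component of B3 with the vertical component of B2.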
FIG. 15 is a block diagram illustrating the configuration of a video decoding device using motion compensation of the related art. In FIG. 15, a variable length decoding unit 400 decodes a variable length code of the encoded stream. A MV calculation unit 410 adds a prediction error vector to a predicted vector. A MV storage unit 420 stores the MV. A MV prediction unit 430 predicts the MV using decoded information. A reference block MV extraction unit 431 extracts the MV of the reference block for use in the prediction of the MV. A median calculation unit 432 calculates a median of a MV component extracted from the reference block. A decoding unit 440 based on motion compensation performs the motion compensation using the calculated MV, and outputs a decoded video signal by decoding a decoding target block.
When the encoded stream is input, the variable length decoding unit 400 decodes a variable length code of the encoded stream, transmits a quantized transform coefficient of the decoding target block to the decoding unit 440 based on the motion compensation, and transmits the prediction error vector to the MV calculation unit 410. The MV calculation unit 410 adds the prediction error vector to a predicted vector obtained from the decoded MV, and calculates the MV. The calculated MV is transmitted to the decoding unit 440 based on the motion compensation and stored in the MV storage unit 420. The decoding unit 440 based on the motion compensation performs the motion compensation using the calculated MV, and outputs a decoded video signal by decoding a decoding target block.
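The MV reconstruction on the decoding side can be sketched as follows. This is an illustrative assumption rather than the actual decoder: the dictionary mv_store stands in for the MV storage unit 420, block identifiers such as "B0" are hypothetical keys, and the three reference blocks are assumed to have been decoded already.

```python
def median3(a, b, c):
    """Median of three scalar values."""
    return sorted((a, b, c))[1]


def decode_mv(mv_store, block_id, reference_ids, error):
    """Reconstruct the MV of a decoding target block.

    mv_store      -- dict mapping block id to its decoded MV (x, y)
    block_id      -- id of the decoding target block
    reference_ids -- ids of the three reference blocks
    error         -- decoded prediction error vector (x, y)
    """
    b1, b2, b3 = (mv_store[ref_id] for ref_id in reference_ids)
    # Same spatial median prediction as on the encoding side.
    predicted = (
        median3(b1[0], b2[0], b3[0]),
        median3(b1[1], b2[1], b3[1]),
    )
    # MV = predicted vector + prediction error vector.
    mv = (predicted[0] + error[0], predicted[1] + error[1])
    # Store the MV so that later blocks can refer to it.
    mv_store[block_id] = mv
    return mv


if __name__ == "__main__":
    store = {"B1": (4, -2), "B2": (6, 0), "B3": (5, 3)}
    print(decode_mv(store, "B0", ("B1", "B2", "B3"), (2, 1)))  # (7, 1)
```

The decoder obtains the same MV as the encoder only because both sides derive the predicted vector from the same already-processed MVs, which is why the reconstructed MV is written back to the store.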
A MV prediction process of the MV prediction unit 430 in the video decoding device is substantially the same as the process of the MV prediction unit 330 in the video encoding device illustrated in FIG. 13. FIG. 16 is a block diagram illustrating a configuration of a time direction MV prediction unit of the related art.
In encoding according to the H.264 standard, one of the encoding modes for a B picture is the direct mode, in which motion information is predicted and generated from the motion information of an encoded block and encoding of the motion information is omitted. The direct mode includes a spatial direct mode mainly using motion information of a space direction and a temporal direct mode mainly using motion information of a time direction. In prediction of the MV in the temporal direct mode, a MV prediction unit 500 calculates a predicted vector as follows.
An anchor block MV extraction unit 501 extracts a MV mvCol of a block (referred to as an anchor block) at the same position as a prediction target block in an anchor picture from a MV storage unit 510. The anchor picture is the picture whose MV is referred to when the MV of the direct mode is obtained. Normally, the anchor picture is a rear reference picture closest to the encoding target picture in the order of display. Next, an extrapolation prediction unit 502 calculates a MV mvL0 of L0 and a MV mvL1 of L1 from the MV mvCol through proportional distribution according to time intervals of a reference picture of L0, an encoding target picture, and an anchor picture.
Because up to two pictures can be selected from arbitrary reference pictures in a B picture, the two pictures are distinguished as L0 and L1; prediction used mainly in the forward direction is referred to as L0 prediction, and prediction used mainly in the backward direction is referred to as L1 prediction. The MV prediction unit 500 outputs the MVs mvL0 and mvL1 calculated by the extrapolation prediction unit 502 as predicted vectors. In addition, there is a method of designating the MV mvCol as the predicted vector. This predicted vector generation method is referred to as co-located prediction.
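The proportional distribution of mvCol in the temporal direct mode can be sketched as follows. The parameters t_cur, t_ref_l0, and t_anchor are hypothetical names for the display-order positions of the encoding target picture, the reference picture of L0, and the anchor picture. The sketch assumes that mvCol points from the anchor picture to the reference picture of L0 and uses plain rounding; the fixed-point arithmetic and clipping used by the H.264 standard are omitted.

```python
def temporal_direct(mv_col, t_cur, t_ref_l0, t_anchor):
    """Derive (mvL0, mvL1) from mvCol by proportional distribution.

    tb is the time interval from the L0 reference picture to the encoding
    target picture, and td is the interval from the L0 reference picture
    to the anchor picture.
    """
    tb = t_cur - t_ref_l0
    td = t_anchor - t_ref_l0
    if td == 0:
        raise ValueError("anchor picture and L0 reference picture coincide")
    # mvL0 = mvCol * tb / td (simplified rounding, an assumption)
    mv_l0 = tuple(round(c * tb / td) for c in mv_col)
    # mvL1 = mvL0 - mvCol, i.e. the remaining part of mvCol, pointing backward
    mv_l1 = tuple(l0 - c for l0, c in zip(mv_l0, mv_col))
    return mv_l0, mv_l1


if __name__ == "__main__":
    # Target picture halfway between the L0 reference (t=0) and the anchor (t=4).
    print(temporal_direct((8, -4), t_cur=2, t_ref_l0=0, t_anchor=4))
    # ((4, -2), (-4, 2))
```

With the target picture midway between the two pictures, mvCol is split evenly: mvL0 covers half of it in the forward direction and mvL1 covers the other half with the opposite sign.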