1. Field of the Invention
The present invention generally relates to a moving picture encoding and decoding technology, and more particularly, to a moving picture encoding and decoding technology utilizing motion compensation prediction.
2. Description of the Related Art
The MPEG-4 AVC/H.264 standard is a typical moving picture compression-coding method. MPEG-4 AVC/H.264 utilizes motion compensation, where a picture is partitioned into a plurality of rectangular blocks, a picture that has already been encoded or decoded is used as a reference picture, and movement from the reference picture is predicted. A method that predicts movement by this motion compensation is referred to as inter prediction or motion compensation prediction. According to inter prediction of MPEG-4 AVC/H.264, a plurality of pictures can be used as reference pictures, and the most appropriate reference picture is chosen from the plurality of reference pictures for each block so as to perform motion compensation prediction. For this purpose, a reference index is assigned to each reference picture, and a reference picture is specified by its reference index. For a B picture, up to two pictures can be selected from the decoded reference pictures and used for inter prediction. The two types of prediction based on these two reference pictures are distinguished as L0 prediction (list 0 prediction), which is mainly used as prediction from a previous picture, and L1 prediction (list 1 prediction), which is mainly used as prediction from a subsequent picture.
In addition, bi-prediction, which uses both types of inter prediction (i.e., L0 prediction and L1 prediction) at the same time, is also defined. In the case of bi-prediction, prediction is performed in both directions, the inter-predicted signals of L0 prediction and L1 prediction are each multiplied by a weighting coefficient, an offset value is added, and the results are superimposed so as to generate the final inter prediction picture signal. For the weighting coefficient and the offset value used for weighted prediction, a representative value is set for each reference picture of each list on a picture-by-picture basis and is encoded. Coding information relating to inter prediction includes: a prediction mode that differentiates among L0 prediction, L1 prediction, and bi-prediction for each block; a reference index that specifies a reference picture for each reference list of each block; and a motion vector that represents the direction and amount of movement of a block. This coding information is encoded and/or decoded.
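The weighted combination of the L0 and L1 predicted signals described above can be sketched as follows. This is a simplified floating-point model for illustration only, not the fixed-point formula defined by the standard; the function name weighted_bi_prediction, its parameters, and the default weights of 0.5 are assumptions introduced here.

```python
import math


def weighted_bi_prediction(pred_l0, pred_l1, w0=0.5, w1=0.5, offset=0.0, bit_depth=8):
    """Combine L0 and L1 predicted sample blocks into one prediction block.

    Simplified model (hypothetical, not the normative formula): each output
    sample is w0 * p0 + w1 * p1 + offset, rounded to the nearest integer and
    clipped to the valid sample range for the given bit depth.
    """
    max_val = (1 << bit_depth) - 1
    combined = []
    for row_l0, row_l1 in zip(pred_l0, pred_l1):
        out_row = []
        for p0, p1 in zip(row_l0, row_l1):
            value = w0 * p0 + w1 * p1 + offset
            rounded = math.floor(value + 0.5)  # round half up
            out_row.append(min(max(rounded, 0), max_val))
        combined.append(out_row)
    return combined


# Equal weights and no offset give a plain average of the two predictions.
average = weighted_bi_prediction([[100, 200]], [[110, 250]])
```

With equal weights the result is simply the average of the two predicted blocks; unequal weights and a nonzero offset model cases such as fades, where one reference picture should contribute more than the other.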
Further, according to MPEG-4 AVC/H.264, a direct mode, where inter prediction information of a block subject to encoding or decoding is generated from inter prediction information of a decoded block, is defined. Since encoding of inter prediction information is not required, encoding efficiency is improved with the direct mode.
A time direct mode, which uses the correlation of inter prediction information over time, will now be explained with reference to FIG. 42. The picture registered with reference index 0 of L1 is used as a base picture colPic. The block in the base picture colPic located at the same position as the block subject to encoding or decoding is used as a base block.
If the base block is encoded by using L0 prediction, the L0 motion vector of the base block is used as a base motion vector mvCol. If the base block is encoded not by using L0 prediction but by using L1 prediction, the L1 motion vector of the base block is used as the base motion vector mvCol. The picture referred to by the base motion vector mvCol is used as the reference picture of L0 in the time direct mode, and the base picture colPic is used as the reference picture of L1 in the time direct mode.
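The selection rule for the base motion vector mvCol can be sketched as follows. The BaseBlock container, its field names, and the function select_base_motion_vector are hypothetical and introduced only for illustration. The case where the base block uses neither L0 nor L1 prediction is not covered by the description above, so this sketch simply raises an error for it.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MotionVector = Tuple[int, int]


@dataclass
class BaseBlock:
    """Inter prediction information of the base block (hypothetical container)."""
    mv_l0: Optional[MotionVector] = None  # None when L0 prediction is not used
    ref_poc_l0: Optional[int] = None      # POC of the picture referred to by mv_l0
    mv_l1: Optional[MotionVector] = None  # None when L1 prediction is not used
    ref_poc_l1: Optional[int] = None      # POC of the picture referred to by mv_l1


def select_base_motion_vector(block: BaseBlock) -> Tuple[MotionVector, int]:
    """Return (mvCol, POC of the picture referred to by mvCol).

    The L0 motion vector is preferred; the L1 motion vector is used only
    when the base block was not encoded by using L0 prediction.
    """
    if block.mv_l0 is not None:
        return block.mv_l0, block.ref_poc_l0
    if block.mv_l1 is not None:
        return block.mv_l1, block.ref_poc_l1
    # Assumption: this case is outside the scope of the description above.
    raise ValueError("base block uses neither L0 nor L1 prediction")


# A base block that uses both lists: the L0 motion vector is chosen.
mv_col, poc_ref_l0 = select_base_motion_vector(
    BaseBlock(mv_l0=(8, -4), ref_poc_l0=0, mv_l1=(2, 2), ref_poc_l1=16)
)
```

The returned POC identifies the picture that serves as the reference picture of L0 in the time direct mode, while colPic itself serves as the reference picture of L1.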
A motion vector mvL0 of L0 and a motion vector mvL1 of L1 in a time direct mode are derived from a base motion vector mvCol by a scaling computation process.
By subtracting the POC of the reference picture of L0 in the time direct mode from the POC of the base picture colPic, a distance td between the pictures is derived. POC (picture order count) is a parameter associated with each picture subject to encoding, and is set to a value that increases in the output order of pictures. The difference in POC between two pictures indicates the distance between the pictures along the time axis.
td = POC of the base picture colPic − POC of the reference picture of L0 in the time direct mode
By subtracting the POC of the reference picture of L0 in the time direct mode from the POC of the picture subject to encoding or decoding, a distance tb between the pictures is derived.
tb = POC of the picture subject to encoding or decoding − POC of the reference picture of L0 in the time direct mode
The motion vector mvL0 of L0 in the time direct mode is derived from the base motion vector mvCol by a scaling computation process.
mvL0 = tb / td * mvCol
The motion vector mvL1 of L1 is derived by subtracting the base motion vector mvCol from the motion vector mvL0 of L0 in the time direct mode.
mvL1 = mvL0 − mvCol
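The derivation of td, tb, mvL0, and mvL1 can be illustrated with a short worked sketch. It uses exact rational arithmetic followed by rounding to the nearest integer, which is an assumption made for readability; an actual codec performs this scaling in fixed-point integer arithmetic. The function name derive_time_direct_mvs and its parameter names are hypothetical, and the case td = 0 is not covered by the description above, so it is rejected here.

```python
from fractions import Fraction


def derive_time_direct_mvs(mv_col, poc_cur, poc_col, poc_ref_l0):
    """Derive (mvL0, mvL1) in the time direct mode from the base motion vector.

    mv_col      base motion vector mvCol as an (x, y) tuple
    poc_cur     POC of the picture subject to encoding or decoding
    poc_col     POC of the base picture colPic
    poc_ref_l0  POC of the reference picture of L0 in the time direct mode
    """
    td = poc_col - poc_ref_l0  # distance from the L0 reference picture to colPic
    tb = poc_cur - poc_ref_l0  # distance from the L0 reference picture to the current picture
    if td == 0:
        # Assumption: this case is outside the scope of the description above.
        raise ValueError("td must be nonzero")
    scale = Fraction(tb, td)
    # mvL0 = tb / td * mvCol, rounded per component (simplified rounding).
    mv_l0 = tuple(round(scale * component) for component in mv_col)
    # mvL1 = mvL0 - mvCol
    mv_l1 = tuple(a - b for a, b in zip(mv_l0, mv_col))
    return mv_l0, mv_l1


# Current picture halfway between the L0 reference picture (POC 0) and colPic (POC 8).
mv_l0, mv_l1 = derive_time_direct_mvs((8, -4), poc_cur=4, poc_col=8, poc_ref_l0=0)
```

In this example td = 8 and tb = 4, so mvL0 is half of mvCol, namely (4, -2), and mvL1 = mvL0 − mvCol = (-4, 2), which points in the opposite direction toward colPic.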