In coding a moving picture, information is usually compressed by reducing the redundancy of the moving picture in the spatial direction and the temporal direction. Inter picture prediction is a method used for reducing redundancy in the temporal direction. When a picture is coded using inter picture prediction, a forward picture or a backward picture of the picture in display order is used as a reference picture.
The amount of motion relative to the reference picture is detected, and the information is then compressed by eliminating redundancy in the spatial direction from the difference value between a picture obtained by performing motion compensation and the current picture to be coded.
The H.264 standard, which was recently standardized, specifies that a moving picture is coded on a per-slice basis. A slice, which is smaller than a picture, is composed of a plurality of macroblocks. A picture is composed of one or more slices. A slice that includes only macroblocks for which intra picture prediction is performed, using only the current picture to be coded and no reference picture, is called an I slice. A slice that includes a macroblock for which inter picture prediction is performed with reference to one previously decoded picture and a macroblock for which intra picture prediction is performed is called a P slice. A slice that includes a macroblock for which inter picture prediction is performed with reference to two or fewer previously decoded pictures and a macroblock for which intra picture prediction is performed is called a B slice.
A picture may be composed of slices of the types above. A picture that includes only I slices is called an I picture. A picture that includes only I slices and P slices is called a P picture. A picture that includes I slices, P slices, and B slices is called a B picture.
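The naming rule above can be sketched in a few lines of Python. This is only an illustration: the function name picture_type and the string labels are hypothetical, and the rule is read here as "the picture takes the name of the least restricted slice type it contains", which is an assumption about how pictures holding, for example, only B slices are named.

```python
def picture_type(slice_types):
    """Name a picture after the slice types it contains.

    slice_types: iterable of "I", "P", or "B" (hypothetical labels).
    Assumption: the picture is named after the least restricted slice
    type present (B over P over I).
    """
    kinds = set(slice_types)
    if not kinds or not kinds <= {"I", "P", "B"}:
        raise ValueError("a picture needs one or more I, P, or B slices")
    if "B" in kinds:
        return "B"
    if "P" in kinds:
        return "P"
    return "I"
```

For example, a picture made of one I slice and two P slices is classified as a P picture.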
Hereinafter, pictures are used for making the present description clearly understandable, but the same holds true for slices.
In comparison with the MPEG-2 and the MPEG-4 standards, the H.264 standard imposes substantially relaxed constraints on reference pictures. A reference picture to be referred to for a macroblock that belongs to a P picture may be either a forward picture or a backward picture in display order as long as the reference picture has already been decoded. The two or fewer pictures to be referred to for a macroblock that belongs to a B picture may each be either a forward picture or a backward picture in display order as long as the reference pictures have already been decoded. These reference pictures may be of any picture type: an I picture, a P picture, or a B picture.
FIG. 1 is a schematic view illustrating relationships in prediction (reference relationships) among pictures in a conventional method mentioned above for coding a moving picture.
Each vertical line in FIG. 1 represents one picture. The letter at the lower right of each vertical line indicates the picture type (I, P, or B) of the picture. Each arrow in FIG. 1 indicates that the picture at the pointed end of the arrow is referred to as a reference picture when inter picture prediction and decoding are performed for the picture at the other end of the arrow.
For B pictures, not more than two pictures are referred to for each macroblock: a reference to one of the two pictures (L0) is called forward reference, and a reference to the other picture (L1) is called backward reference. For the forward reference, a forward picture in display order is referred to preferentially, but not necessarily. For the backward reference, a backward picture in display order is referred to preferentially, but not necessarily.
For P pictures, not more than one picture is referred to for each macroblock. This reference is necessarily the forward reference L0. As in the case of the B pictures, the picture referred to is not necessarily a forward picture in display order. For example, a B picture B9, which is ninth from the most forward picture in FIG. 1, refers to the tenth picture, which is a P picture P10 that follows the B9 in display order, as a forward reference picture, and refers to the seventh picture, which is a P picture P7 that precedes the B9 in display order, as a backward reference picture. In comparison with the MPEG-2 and the MPEG-4 standards, the H.264 standard also imposes substantially relaxed constraints on the display order of pictures. The display order is determined regardless of the decoding order, provided that the picture memory that stores decoded pictures does not overflow.
FIG. 2 is a schematic view illustrating relationships between decoding order and display order of pictures in a method for coding moving pictures according to the H.264 standard.
The upper row represents pictures arranged in decoding order, and the lower row represents pictures arranged in display order. The arrows between the rows indicate the correspondence between the decoding order and the display order of the pictures. The display order is coded as an attribute of each picture. For example, a P picture P10 is displayed after a B picture B11 and a P picture P13 that are decoded after the P picture P10.
In addition, for B pictures, the H.264 standard allows selecting a direct mode as a coding mode in which a current macroblock to be coded does not have a motion vector of its own. There are two kinds of direct mode: the temporal direct mode and the spatial direct mode. In temporal direct mode, a motion vector to be used for a current macroblock to be coded is predicted and generated by executing a scaling process on the basis of inter-picture positional relationships in display order, using a motion vector of another previously coded picture as a reference motion vector (see Patent Reference 1).
FIG. 3 is a schematic view illustrating a method for predicting and generating a motion vector in temporal direct mode.
Vertical lines represent pictures. The letters P and B in the symbols at the upper right of these pictures indicate the picture types P picture and B picture, respectively. The numerals attached to these picture type symbols indicate the positions of the pictures in decoding order (these naming rules also apply in the following description). The pictures P1, B3, B4, B5, and P2 have display time information T1, T2, T3, T4, and T5, respectively. Decoding a macroblock BL0 in the picture B5 in temporal direct mode is described below.
The picture P2, which is adjacent to the picture B5 in display time, is a previously decoded picture (anchor picture). A decoding process is performed using a motion vector MV1 of a macroblock (anchor macroblock) BL1 that is situated in the same position in the picture P2 as the macroblock BL0. The motion vector MV1 has been used for decoding the macroblock BL1 and refers to the picture P1. In this case, to decode the macroblock BL0, a motion vector MV_F is used with the picture P1, and a motion vector MV_B with the picture P2. When the magnitude of the motion vector MV1 is MV, the magnitude MVf of the motion vector MV_F and the magnitude MVb of the motion vector MV_B are obtained using the following equations:
MVf = (T4 − T1) / (T5 − T1) × MV; and
MVb = (T5 − T4) / (T5 − T1) × MV.
Motion compensation is thus performed for the macroblock BL0 on the basis of the reference pictures, the pictures P1 and P2, using the motion vectors MV_F and MV_B that are obtained by scaling the motion vector MV1.
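As an illustration of the scaling equations above, the following Python sketch computes MVf and MVb from MV and the display times. The function name temporal_direct_scale, its parameter names, and the numeric display times in the example are hypothetical. Exact fractions are used only to keep the arithmetic transparent; an actual decoder would typically use integer arithmetic with rounding, which is not shown here.

```python
from fractions import Fraction


def temporal_direct_scale(mv, t_ref, t_cur, t_anchor):
    """Scale the anchor macroblock's motion vector magnitude MV.

    mv       : magnitude MV of MV1 (may also be applied per component)
    t_ref    : display time of the picture MV1 refers to (T1)
    t_cur    : display time of the current picture (T4)
    t_anchor : display time of the anchor picture (T5)
    Returns (MVf, MVb).
    """
    span = t_anchor - t_ref
    if span == 0:
        raise ValueError("anchor and reference pictures share a display time")
    mvf = Fraction(t_cur - t_ref, span) * mv     # MVf = (T4 - T1)/(T5 - T1) x MV
    mvb = Fraction(t_anchor - t_cur, span) * mv  # MVb = (T5 - T4)/(T5 - T1) x MV
    return mvf, mvb


# Hypothetical display times T1 = 1, T4 = 4, T5 = 5 and MV = 8.
mvf, mvb = temporal_direct_scale(8, t_ref=1, t_cur=4, t_anchor=5)
```

With these assumed values, three quarters of MV is assigned to MV_F and one quarter to MV_B, so the two magnitudes always sum to MV.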
The anchor macroblock BL1 does not have a motion vector when it is an intra prediction macroblock. In this case, the motion vector MV1 is assumed to be zero, so that the motion vectors MV_F and MV_B are also zero. As a result, the pixel value of the macroblock BL0 is obtained as the mean value of the pixel values in the macroblocks that are situated in the same position in the reference pictures P1 and P2.
In spatial direct mode, as in the case of the temporal direct mode, a current macroblock to be coded does not have a motion vector. Decoding is performed with reference to motion vectors of previously decoded macroblocks spatially adjacent to the macroblock currently being coded (see Patent Reference 2).
FIG. 4 is a schematic view illustrating a method for predicting and generating a motion vector in spatial direct mode.
Decoding a macroblock BL0 in the picture B5 in spatial direct mode is described below.
Macroblocks each including a pixel A, B, or C are situated adjacent to the current macroblock to be decoded, macroblock BL0, and have motion vectors MVA1, MVB1, and MVC1, respectively. In this case, among these motion vectors, each motion vector that refers to the previously decoded picture closest to the current picture is selected as a candidate motion vector of the current macroblock. When three motion vectors are thus selected, the median value of these motion vectors is determined as the motion vector of the current macroblock. When two are selected, the mean value of them is calculated to obtain the motion vector of the current macroblock. FIG. 4 shows an example in which the motion vectors MVA1 and MVC1 are obtained with reference to a picture P2, and the motion vector MVB1 with reference to a picture P1.
Accordingly, the mean value of the motion vectors MVA1 and MVC1 referring to the picture P2, which is the previously decoded picture closest to the current picture to be decoded, is calculated to obtain a first motion vector of the current macroblock to be decoded, a motion vector MV_F. A second motion vector MV_B is obtained in the same manner.
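The selection rule described above can be sketched in Python as follows. The function name spatial_direct_mv, the representation of each neighbor as a motion vector paired with the distance to its reference picture, and all numeric values are hypothetical. The handling of a single selected candidate (it is used unchanged) and the component-wise median are assumptions not stated in the description.

```python
from statistics import median


def spatial_direct_mv(neighbors):
    """Predict a motion vector from up to three adjacent macroblocks.

    neighbors: list of ((mv_x, mv_y), ref_distance), where ref_distance
    is the distance from the current picture to the picture that the
    neighbor's motion vector refers to (hypothetical representation).
    """
    if not neighbors:
        raise ValueError("no neighboring motion vectors available")
    nearest = min(dist for _, dist in neighbors)
    # Keep only vectors that refer to the closest decoded picture.
    cands = [mv for mv, dist in neighbors if dist == nearest]
    xs = [mv[0] for mv in cands]
    ys = [mv[1] for mv in cands]
    if len(cands) == 3:
        # Three candidates: median, taken per component (assumption).
        return (median(xs), median(ys))
    # Two candidates: mean. One candidate: used as-is (assumption).
    return (sum(xs) / len(cands), sum(ys) / len(cands))


# Mirrors the FIG. 4 situation with made-up values: MVA1 and MVC1 refer
# to the closest picture (distance 1), MVB1 to a farther one (distance 3).
mv_f = spatial_direct_mv([((4, 2), 1), ((7, 7), 3), ((2, 6), 1)])
```

Here MVB1 is discarded because it refers to the farther picture, and the result is the mean of the two remaining vectors.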
In decoding a moving picture, there may be an error in a received compressed stream. When there is, the decoding may be stopped upon detection of the error. However, this is inconvenient because it prevents a user from viewing the moving picture. Error concealment is therefore performed in order to minimize the visible effect of such an error while avoiding a stop of the decoding operation. Conventional error concealment is typically performed by substituting the picture data of a picture adjacent to the current picture to be decoded, in which the error has occurred, for the picture data of the current picture (see Patent Reference 3).
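A minimal Python sketch of this kind of substitution is given below. It is not the method of Patent Reference 3 itself: the function name conceal, the use of None to mark a picture that could not be decoded, and the choice of the most recently decoded good picture as the "adjacent" substitute are all assumptions made for illustration.

```python
def conceal(pictures):
    """Substitute the last correctly decoded picture for damaged ones.

    pictures: list of (name, data) in decoding order, where data is None
    when the picture could not be decoded (hypothetical representation).
    """
    last_good = None
    output = []
    for name, data in pictures:
        if data is not None:
            last_good = data  # remember the newest usable picture
            output.append((name, data))
        else:
            # Reuse the last good picture; stays None if there is none yet.
            output.append((name, last_good))
    return output


# Hypothetical stream: only the first picture decodes correctly.
shown = conceal([("P1", "p1-data"), ("P2", None), ("B1", None), ("B2", None)])
```

In this example every damaged picture is replaced by the data of P1, so display continues without interruption.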
FIG. 5 is a schematic view illustrating a conventional method for error concealment. There are pictures P1, B1, B2, and P2. The picture P1 has been decoded normally, but an error is detected in the picture P2 and the decoding operation is stopped. In this case, according to Patent Reference 3, error concealment is performed by substituting the picture P1, which has already been decoded, for the pictures B1, B2, and P2. The substitute picture is the picture decoded last among the pictures that may be referred to by other pictures. This shows the user not the picture B1, B2, or P2 but a picture temporally close to the picture having the error, so that the user does not perceive the error and views the pictures as if they had been decoded smoothly.
Patent Reference 1: Japanese Unexamined Patent Application Publication No. 11-75191
Patent Reference 2: International Application Publication WO 2004/008775
Patent Reference 3: Japanese Unexamined Patent Application Publication No. 10-23435