With the development of multimedia applications, it has become common to handle all kinds of media information, such as video, audio and text, in an integrated manner. Digitizing all these kinds of media makes such integrated handling possible. However, since digitized images contain an enormous amount of data, image information compression techniques are essential for storing and transmitting such information. It is also important to standardize such compression techniques so that compressed image data can be exchanged between systems. International standards for image compression techniques exist, such as H.261 and H.263 standardized by the International Telecommunication Union-Telecommunication Standardization Sector (ITU-T), and MPEG-1, MPEG-4 and others standardized by the International Organization for Standardization (ISO). ITU is now working on standardization of H.26L as the latest standard for image coding.
In coding of moving pictures, in general, the amount of information is compressed by reducing redundancies in both the temporal and spatial directions. In inter-picture prediction coding, which aims at reducing the temporal redundancy, motion of a current picture is estimated on a block-by-block basis with reference to preceding or subsequent pictures so as to create a predictive image, and then the differential values between the obtained predictive image and the current picture are coded.
Here, the term “picture” represents a single sheet of an image, and it represents a frame when used in a context of a progressive image, whereas it represents a frame or a field in a context of an interlaced image. The interlaced image here is a single frame that is made up of two fields having different times respectively. In the process of coding and decoding the interlaced image, a single frame can be handled as a frame, as two fields, or as a frame structure or a field structure on every block in the frame.
The following description will be given assuming that a picture is a frame in a progressive image, but the same description can be given even assuming that a picture is a frame or a field in an interlaced image.
FIG. 30 is a diagram for explaining types of pictures and reference relations between them.
A picture like a picture I1, which is intra-picture prediction coded without reference to any pictures, is referred to as an I-picture. A picture like a picture P10, which is inter-picture prediction coded with reference to only one picture, is referred to as a P-picture. And a picture, which can be inter-picture prediction coded with reference to two pictures at the same time, is referred to as a B-picture.
B-pictures, like pictures B6, B12 and B18, can refer to two pictures located in arbitrary temporal directions. Reference pictures can be designated on a block-by-block basis, a block being the unit on which motion is estimated. The two reference pictures are distinguished as a first reference picture, which is described earlier in the coded stream obtained by coding pictures, and a second reference picture, which is described later in the coded stream.
However, coding and decoding the above pictures requires that their reference pictures have already been coded and decoded. FIGS. 31A and 31B show examples of the order of coding and decoding B-pictures. FIG. 31A shows the display order of the pictures, and FIG. 31B shows the coding and decoding order obtained by reordering the display order shown in FIG. 31A. These diagrams show that the pictures are reordered so that the pictures referred to by the pictures B3 and B6 are coded and decoded beforehand.
A method for creating a predictive image in the case where the above-mentioned B-picture is coded with reference to two pictures at the same time will be explained in detail using FIG. 32. Note that a predictive image is created in decoding in exactly the same manner.
The picture B4 is a current B-picture to be coded, and blocks BL01 and BL02 are current blocks to be coded belonging to the current B-picture. Referring to a block BL11 belonging to the picture P2 as a first reference picture and BL21 belonging to the picture P3 as a second reference picture, a predictive image for the block BL01 is created. Similarly, referring to a block BL12 belonging to the picture P2 as a first reference picture and a block BL22 belonging to the picture P1 as a second reference picture, a predictive image for the block BL02 is created (See Non-patent document 1).
FIG. 33 is a diagram for explaining a method for creating a predictive image for the current block to be coded BL01 using the referred two blocks BL11 and BL21. The following explanation is made assuming here that a size of each block is 4 by 4 pixels. Assuming that Q1(i) is a pixel value of BL11, Q2(i) is a pixel value of BL21 and P(i) is a pixel value of the predictive image for the target BL01, the pixel value P(i) can be calculated by a linear prediction equation like the following equation 1. “i” indicates the position of a pixel, and in this example, “i” has values of 0 to 15.
P(i) = (w1 × Q1(i) + w2 × Q2(i)) / pow(2, d) + c    (Equation 1)
(where pow(2, d) indicates the “d”th power of 2)
“w1”, “w2”, “c” and “d” are coefficients for performing linear prediction, and these four coefficients are handled as one set of weighting coefficients. This weighting coefficient set is determined by a reference index designating a picture referred to by each block. For example, four values of w1_1, w2_1, c_1 and d_1 are used for BL01, and w1_2, w2_2, c_2 and d_2 are used for BL02, respectively.
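As a minimal sketch, the calculation of equation 1 for one 4 by 4 block can be illustrated in Python as follows. The function name linear_predict and the sample values are hypothetical. The use of integer (floor) division for the division by pow(2, d) is an assumption, since the description above does not state how the quotient is rounded.

```python
def linear_predict(q1, q2, w1, w2, c, d):
    """Apply Equation 1 to every pixel position i of two reference blocks.

    q1, q2: pixel values Q1(i) and Q2(i), e.g. 16 values for a 4x4 block.
    w1, w2, c, d: one weighting coefficient set.
    """
    if len(q1) != len(q2):
        raise ValueError("reference blocks must have the same number of pixels")
    # Assumption: floor division stands in for "/ pow(2, d)";
    # the text does not specify a rounding rule.
    return [(w1 * a + w2 * b) // (2 ** d) + c for a, b in zip(q1, q2)]


# Hypothetical sample data: flat blocks with pixel values 100 and 120.
bl11 = [100] * 16
bl21 = [120] * 16
prediction = linear_predict(bl11, bl21, w1=1, w2=1, c=0, d=1)
```

With w1 = w2 = 1, c = 0 and d = 1, the equation reduces to a plain average of the two reference blocks; other coefficient sets shift the balance between the two references or add an offset.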
Next, reference indices designating reference pictures will be explained with reference to FIG. 34 and FIG. 35. A value referred to as a picture number, which increases by one every time a picture is stored in a memory, is assigned to each picture. In other words, a newly stored picture is assigned a picture number that is one greater than the maximum of the existing picture numbers. However, a reference picture is not actually designated using this picture number, but using a value referred to as a reference index, which is defined separately. Indices indicating first reference pictures are referred to as first reference indices, and indices indicating second reference pictures are referred to as second reference indices, respectively.
FIG. 34 is a diagram for explaining a method for assigning two reference indices to picture numbers. When there is a sequence of pictures ordered in display order, picture numbers are assigned in coding order. Commands for assigning the reference indices to the picture numbers are described in a header of a slice, which is a subdivision of a picture and serves as the unit of coding, and thus the assignment is updated every time one slice is coded. Each command indicates the differential value between the picture number to which a reference index is currently being assigned and the picture number to which a reference index was assigned immediately before, and the commands are given in series, as many as there are reference indices.
Taking the first reference index in FIG. 34 as an example, since “−1” is given as a command first, 1 is subtracted from the picture number 16 of the current picture to be coded and thus the reference index 0 is assigned to the picture number 15. Next, since “−4” is given, 4 is subtracted from the picture number 15 and thus the reference index 1 is assigned to the picture number 11. The following reference indices are assigned to respective picture numbers in the same processing. The same goes for the second reference indices.
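The assignment procedure described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name assign_reference_indices is hypothetical, and only the two commands quoted in the text (“−1” and “−4”) are used, because the remaining commands of FIG. 34 are not reproduced here.

```python
def assign_reference_indices(current_picture_number, commands):
    """Map reference indices 0, 1, 2, ... to picture numbers.

    Each command is a signed differential applied to the picture number
    handled immediately before; the first command is applied to the
    picture number of the current picture to be coded.
    """
    assignment = {}
    picture_number = current_picture_number
    for ref_index, differential in enumerate(commands):
        picture_number += differential
        assignment[ref_index] = picture_number
    return assignment


# Current picture number 16, commands "-1" then "-4" as in the text.
first_ref_indices = assign_reference_indices(16, [-1, -4])
# first_ref_indices == {0: 15, 1: 11}
```

Because each command is relative to the previous result, the command sequence has to be processed in order; the same routine would be run once for the first reference indices and once for the second reference indices.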
FIG. 35 shows the result of the assignment of the reference indices. The first reference indices and the second reference indices are assigned to respective picture numbers separately, but looking at each reference index individually, it is clear that one reference index is assigned to one picture number.
Next, a method for determining weighting coefficient sets to be used will be explained with reference to FIG. 36 and FIG. 37.
A coded stream of one picture is made up of a picture common information area and a plurality of slice data areas. FIG. 36 shows a structure of one slice data area among them. The slice data area is made up of a slice header area and a plurality of block data areas. As one example of a block data area, block areas corresponding to BL01 and BL02 in FIG. 32 are shown here.
“ref1” and “ref2” included in the block BL01 indicate the first reference index and the second reference index, respectively, which designate the two reference pictures for this block. In the slice header area, data (pset0, pset1, pset2, pset3 and pset4) for determining the weighting coefficient sets for the linear prediction are described for ref1 and ref2, respectively. FIG. 37 shows tables of the above-mentioned data included in the slice header area as an example.
Each data item indicated by an identifier “pset” has four values, w1, w2, c and d, and is structured so as to be directly referred to by the values of ref1 and ref2. Also, in the slice header area, command sequences idx_cmd1 and idx_cmd2 for assigning the reference indices to the picture numbers are described.
Using ref1 and ref2 described in BL01 in FIG. 36, one set of weighting coefficients is selected from the table for ref1 and another set of them is selected from the table for ref2. By performing linear prediction of the equation 1 using respective weighting coefficient sets, two predictive images are generated. A desired predictive image can be obtained by averaging these two predictive images on a per-pixel basis.
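The selection and averaging steps can be sketched in Python as follows. This is a hedged illustration, not the codec's actual procedure: the names equation1, predict_with_weight_sets, TABLE_REF1 and TABLE_REF2 are hypothetical, the coefficient values in the tables are invented sample data rather than the contents of FIG. 37, and floor division is assumed both for equation 1 and for the per-pixel average because the text does not state a rounding rule.

```python
def equation1(q1, q2, w1, w2, c, d):
    # Equation 1 per pixel; floor division is an assumption.
    return [(w1 * a + w2 * b) // (2 ** d) + c for a, b in zip(q1, q2)]


def predict_with_weight_sets(q1, q2, ref1, ref2, table_ref1, table_ref2):
    """Select one weighting coefficient set per reference index and
    average the two resulting predictive images pixel by pixel."""
    set_for_ref1 = table_ref1[ref1]  # (w1, w2, c, d) looked up by ref1
    set_for_ref2 = table_ref2[ref2]  # (w1, w2, c, d) looked up by ref2
    image1 = equation1(q1, q2, *set_for_ref1)
    image2 = equation1(q1, q2, *set_for_ref2)
    # Assumption: the average is taken with floor division.
    return [(a + b) // 2 for a, b in zip(image1, image2)]


# Hypothetical tables indexed directly by reference index (sample values).
TABLE_REF1 = {0: (1, 1, 0, 1), 1: (3, 1, 0, 2)}
TABLE_REF2 = {0: (2, 0, 0, 1), 1: (1, 1, 2, 1)}

bl11 = [100] * 16
bl21 = [120] * 16
predicted = predict_with_weight_sets(bl11, bl21, 0, 0, TABLE_REF1, TABLE_REF2)
```

Keeping the tables as plain mappings from reference index to coefficient set mirrors the statement that each pset is referred to directly by the values of ref1 and ref2.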
In addition to the above-mentioned method, which generates a predictive image using a prediction equation determined by weighting coefficient sets of linear prediction coefficients, there is a method for obtaining a predictive image using a predetermined fixed equation. In this fixed-equation method, in the case where the picture designated by the first reference index appears later in display order than the picture designated by the second reference index, the following equation 2a, a fixed equation composed of fixed coefficients, is selected; in other cases, the following equation 2b, also a fixed equation composed of fixed coefficients, is selected so as to generate a predictive image.
P(i) = 2 × Q1(i) − Q2(i)    (Equation 2a)
P(i) = (Q1(i) + Q2(i)) / 2    (Equation 2b)
As is obvious from the above, this method has the advantage that there is no need to code and transmit the weighting coefficient sets to obtain the predictive image because the prediction equation is fixed. This method has another advantage that there is no need to code and transmit a flag for designating the weighting coefficient sets of linear prediction coefficients because the fixed equation is selected based on the positional relationship between pictures. In addition, this method allows significant reduction of an amount of processing for linear prediction because of a simple formula for linear prediction.
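The choice between equations 2a and 2b can be sketched in Python as follows. The function name fixed_predict and the parameters display_pos_ref1 and display_pos_ref2 (the display-order positions of the pictures designated by the first and second reference indices) are hypothetical. Floor division for equation 2b is an assumption, and no clipping to a valid pixel range is applied since the text does not describe one.

```python
def fixed_predict(q1, q2, display_pos_ref1, display_pos_ref2):
    """Generate a predictive block with the fixed equations.

    Equation 2a is selected when the picture designated by the first
    reference index appears later in display order than the picture
    designated by the second reference index; otherwise Equation 2b.
    """
    if display_pos_ref1 > display_pos_ref2:
        # Equation 2a: P(i) = 2 * Q1(i) - Q2(i)
        return [2 * a - b for a, b in zip(q1, q2)]
    # Equation 2b: P(i) = (Q1(i) + Q2(i)) / 2, floor division assumed
    return [(a + b) // 2 for a, b in zip(q1, q2)]


# Hypothetical two-pixel example.
case_2a = fixed_predict([100, 50], [120, 40], display_pos_ref1=5, display_pos_ref2=3)
case_2b = fixed_predict([100, 50], [120, 40], display_pos_ref1=3, display_pos_ref2=5)
```

Since the selection depends only on the display order of the two reference pictures, no coefficient set or selection flag needs to be read from the stream in this sketch.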
(Non-patent document 1) ITU-T Rec. H.264 | ISO/IEC 14496-10 AVC, Joint Committee Draft (CD), 2002-5-10 (p. 34, 8.4.3 Re-Mapping of frame numbers indicator; p. 105, 11.5 Prediction signal generation procedure)
In the method for creating a predictive image using weighting coefficient sets based on the equation 1, the number of commands for assigning reference indices to reference pictures is the same as the number of the reference pictures, so only one reference index is assigned to one reference picture. As a result, the weighting coefficient sets used for linear prediction of the blocks referring to the same reference picture have exactly the same values. There is no problem if the image changes uniformly over the picture as a whole, but there is a high possibility that the optimum predictive image cannot be generated if different parts of the picture change differently. In addition, there is another problem that the amount of processing for linear prediction increases because the equation includes multiplications.