In the multimedia age, in which audio, video and other information are handled in an integrated manner, existing information media, i.e., newspapers, magazines, television, radio, telephone and other means through which information is conveyed to people, have come to be included in the scope of multimedia. Generally, multimedia refers to a representation that associates not only characters but also graphics, sound and, especially, images with one another. To include the aforementioned existing information media in the scope of multimedia, however, it is a prerequisite that such information be represented in digital form.
However, when the amount of information carried by each of the aforementioned information media is expressed as an amount of digital information, text requires 1 to 2 bytes per character, whereas voice requires 64Kbits per second (telephone quality) and moving pictures require 100Mbits or more per second (current television reception quality). It is therefore not realistic for the aforementioned information media to handle such an enormous amount of information in digital form as it is. For example, although video phones are already in actual use over the Integrated Services Digital Network (ISDN), which offers a transmission speed of 64Kbps to 1.5Mbps, it is not practical to transmit video shot by television cameras directly through ISDN.
Against this backdrop, information compression techniques have become required, and for example, in the case of the video phone, the H.261 and H.263 standards for moving picture compression technology, internationally standardized by the International Telecommunication Union—Telecommunication Standardization Sector (ITU-T), are being employed. Moreover, with MPEG-1 standard information compression techniques, it has also become possible to store video information onto normal music compact discs (CD) together with audio information.
Here, Moving Picture Experts Group (MPEG) is an international standard for moving picture signal compression, and MPEG-1 is a standard for compressing moving picture signals down to 1.5Mbps, in other words, compressing television signals to approximately a hundredth of their original size. Moreover, since the transmission speed within the scope of the MPEG-1 standard is limited primarily to about 1.5Mbps, MPEG-2, which was standardized to satisfy demands for further improved picture quality, allows moving picture signals to be compressed to 2 to 15Mbps. Furthermore, MPEG-4, which achieves higher compression ratios than MPEG-1 and MPEG-2, enables coding, decoding and operation on a per-object basis, and realizes the new functions required for the multimedia age, has now been standardized by the working group (ISO/IEC JTC1/SC29/WG11) that promoted the standardization of MPEG-1 and MPEG-2. MPEG-4 was initially aimed at standardizing a low bit rate coding method, but it has since been expanded into a more versatile coding method that further includes high bit rate coding for interlaced images and the like.
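The bit rates quoted above can be related to one another with simple arithmetic. The following Python sketch uses only the figures given in the text; the function name required_compression is introduced here purely for illustration.

```python
# Bit rates quoted in the text, in bits per second.
RAW_TV_BPS = 100_000_000    # moving pictures: "100Mbits or more per second"
ISDN_MIN_BPS = 64_000       # lower ISDN rate quoted in the text
ISDN_MAX_BPS = 1_500_000    # upper ISDN rate, also the MPEG-1 target rate


def required_compression(source_bps, channel_bps):
    """Return how many times the source must be compressed to fit the channel."""
    return source_bps / channel_bps


print(f"To fit 1.5Mbps: about {required_compression(RAW_TV_BPS, ISDN_MAX_BPS):.1f} to 1")
print(f"To fit 64Kbps:  about {required_compression(RAW_TV_BPS, ISDN_MIN_BPS):.1f} to 1")
```

With the 100Mbps lower bound, the ratio needed to reach 1.5Mbps is roughly 67 to 1. Because the text gives 100Mbps only as a lower bound, a somewhat higher source rate (150Mbps, for instance) would give the "approximately a hundredth" figure.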
In moving picture coding methods such as MPEG-4 and H.26L, a coding mode known as direct mode can be selected for the coding of B-pictures (see the MPEG-4 Visual standard: ISO/IEC 14496-2:1999, Information technology - Coding of audio-visual objects - Part 2: Visual, 1999, p. 154). Hereinafter, a single still image within a moving picture is referred to as a “picture”, since it can be either a “frame” or a “field”. FIG. 1 is a diagram showing an example of an inter picture prediction method in an existing direct mode, and is referred to below in explaining an inter picture prediction coding method in direct mode. It is now assumed that a block “a” of a picture B3 is coded/decoded in direct mode. In this case, when the picture B3 is coded/decoded in the H.26L standard, the motion vector of the block co-located with the block “a” within the reference picture whose second reference index is “0” is used. (A reference index is also referred to as a “relative index” and is discussed later.) Here, it is assumed that, with regard to the picture B3, a picture P4 is the reference picture whose second reference index is “0”. In this case, a motion vector “c” of a block “b” within the picture P4 is used. The motion vector “c” is the motion vector used during the coding/decoding of the block “b”, and refers to a picture P1. For the block “a”, bi-prediction from the reference pictures P1 and P4 is carried out using motion vectors parallel to the motion vector “c”. The motion vectors used when the block “a” is coded/decoded are a motion vector “d” for the picture P1 and a motion vector “e” for the picture P4.
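The derivation of the vectors “d” and “e” from the vector “c” can be illustrated with a short Python sketch. The text states only that the two vectors are parallel to “c”; scaling “c” by the ratio of display-time distances is an assumption made here, as are the display positions 1, 3 and 4 for the pictures P1, B3 and P4 and the function name direct_mode_vectors.

```python
def direct_mode_vectors(mv_c, t_cur, t_fwd, t_bwd):
    """Derive two vectors parallel to mv_c for a direct-mode block.

    mv_c  : (x, y) vector of the co-located block, pointing from the
            backward reference picture to the forward reference picture
    t_cur : display position of the current B-picture (assumed)
    t_fwd : display position of the forward reference picture (assumed)
    t_bwd : display position of the backward reference picture (assumed)
    """
    trd = t_bwd - t_fwd   # distance spanned by mv_c
    trb = t_cur - t_fwd   # distance from the current picture to the forward reference
    # Assumed rule: scale mv_c in proportion to the temporal distances.
    mv_d = tuple(v * trb / trd for v in mv_c)          # toward the forward reference
    mv_e = tuple(v * (trb - trd) / trd for v in mv_c)  # toward the backward reference
    return mv_d, mv_e


# Block "a" in B3, with vector "c" of block "b" in P4 referring to P1.
mv_d, mv_e = direct_mode_vectors((6, -3), t_cur=3, t_fwd=1, t_bwd=4)
print(mv_d, mv_e)
```

Under these assumptions “d” is two thirds of “c” and “e” is one third of “c” in the opposite direction, so both remain parallel to “c” as the text requires.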
FIG. 2 is a chart showing an example of the assignment of picture numbers, as well as reference indices for each picture inputted. Picture number and reference indices are numbers for uniquely identifying reference pictures stored in the reference picture memory. For each picture stored in memory as a reference picture, a number incrementing by the value of “1” is assigned as a picture number.
FIG. 3 is a conceptual diagram showing the format of a picture coded signal in an existing moving picture coding method and moving picture decoding method. “Picture” stands for the coded signal for one picture, “Header” is the header coded signal included at the head of the picture, “Block 1” is the coded signal of a block coded in direct mode, “Block 2” is the coded signal of a block coded by interpolation (motion compensation) prediction other than direct mode, “Rldx0” and “Rldx1” are reference indices, and “MV0” and “MV1” represent motion vectors. For the interpolation (motion compensation) predictive block, Block 2, the coded signal contains, in this order, the two reference indices Rldx0 and Rldx1, which indicate the two reference pictures (a first reference picture and a second reference picture) used for interpolation (motion compensation). Which of the reference indices Rldx0 and Rldx1 is used is determined based on PredType. For example, where PredType indicates that pictures are to be referred to bi-directionally, both Rldx0 and Rldx1 are applied. Where it indicates that a picture is to be referred to uni-directionally, either Rldx0 or Rldx1 is applied, and where direct mode is indicated, neither Rldx0 nor Rldx1 is applied. The reference index Rldx0, which indicates the first reference picture, is known as the first reference index, and the reference index Rldx1, which indicates the second reference picture, is known as the second reference index. The first and second reference pictures are identified based on the position of data in the bit stream.
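The dependence of the reference indices on PredType described above can be summarized in a small Python sketch. The enumeration labels below are hypothetical, since the text does not give the actual PredType values, and the function name applied_reference_indices is introduced only for illustration.

```python
from enum import Enum


class PredType(Enum):
    # Hypothetical labels; the actual PredType values are not given in the text.
    DIRECT = "direct"          # direct mode
    UNI_FIRST = "uni_first"    # uni-directional, first reference picture only
    UNI_SECOND = "uni_second"  # uni-directional, second reference picture only
    BI = "bi"                  # bi-directional, both reference pictures


def applied_reference_indices(pred_type):
    """Return the names of the reference indices applied for a prediction type."""
    if pred_type is PredType.BI:
        return ["Rldx0", "Rldx1"]
    if pred_type is PredType.UNI_FIRST:
        return ["Rldx0"]
    if pred_type is PredType.UNI_SECOND:
        return ["Rldx1"]
    return []  # direct mode: neither index is applied


for pred_type in PredType:
    print(pred_type.name, applied_reference_indices(pred_type))
```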
A method for assigning the first and second reference indices is explained below with reference to FIG. 2A.
For the value of the first reference index, first, values starting from “0” shall be assigned to reference pictures having a display time earlier than a current picture to be coded/decoded, in the order of proximity to the current picture to be coded/decoded. When all the reference pictures having a display time earlier than the current picture to be coded/decoded have been assigned values starting from “0”, the continuing values are then assigned to the reference pictures having a display time later than the current picture to be coded/decoded, in the order of proximity to the current picture to be coded/decoded.
For the value of the second reference index, first, values starting from “0” shall be assigned to reference pictures having a display time later than the current picture to be coded/decoded, in the order of proximity to the current picture to be coded/decoded. When all the reference pictures having a display time later than the current picture to be coded/decoded have been assigned values starting from “0”, the continuing values are then assigned to the reference pictures having a display time earlier than the current picture to be coded/decoded, in the order of proximity to the current picture to be coded/decoded.
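The two default assignment rules described above can be sketched in Python as follows. The display times in the example are hypothetical values chosen for illustration and are not taken from FIG. 2A; the function name default_reference_lists is likewise an assumption.

```python
def default_reference_lists(ref_times, cur_time):
    """Return (first_list, second_list) of reference picture display times.

    The position of a picture in each list is its reference index.
    """
    # Earlier pictures, closest to the current picture first.
    earlier = sorted((t for t in ref_times if t < cur_time),
                     key=lambda t: cur_time - t)
    # Later pictures, closest to the current picture first.
    later = sorted((t for t in ref_times if t > cur_time),
                   key=lambda t: t - cur_time)
    first_list = earlier + later    # first reference index: earlier pictures first
    second_list = later + earlier   # second reference index: later pictures first
    return first_list, second_list


# Hypothetical reference pictures at display times 10, 11, 12, 14 and 15,
# with the current picture at display time 13.
first_list, second_list = default_reference_lists([10, 11, 12, 14, 15], 13)
print(first_list)   # index 0 is the closest earlier picture
print(second_list)  # index 0 is the closest later picture
```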
In FIG. 2A, where a first reference index Rldx0 is “0” and a second reference index Rldx1 is “1”, the first reference picture is a B-picture with a picture number “14”, and the second reference picture is a B-picture with a picture number “13”.
A reference index within a block is expressed by variable length code words, where the smaller the value of the reference index is, the shorter the code length of the code assigned is. Since, normally, the possibility of the picture closest to the current picture to be coded/decoded being chosen as a reference picture for inter picture prediction is high, coding efficiency will increase if the reference index values are assigned in the order of proximity to the current picture to be coded/decoded, as described above.
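The relationship between the value of a reference index and its code length can be illustrated with a short Python sketch. The text does not specify which variable length code is used; the unsigned Exp-Golomb code below is chosen here only as a well-known example of a code in which smaller values receive shorter code words.

```python
def exp_golomb(value):
    """Return the unsigned Exp-Golomb code word for a non-negative integer."""
    bits = bin(value + 1)[2:]                # binary representation of value + 1
    return "0" * (len(bits) - 1) + bits      # prefix of leading zeros, then the bits


for index in range(5):
    code = exp_golomb(index)
    print(index, code, len(code))
```

In this example code, reference index 0 costs a single bit, indices 1 and 2 cost three bits each, and indices 3 and 4 cost five bits each, which is why assigning index 0 to the most frequently chosen reference picture reduces the amount of code.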
On the other hand, by indicating a change (remapping) in the assignment of reference indices using the buffer control signal within the coded signal (See FIG. 3, RPSL within Header), it is possible to arbitrarily change the reference picture assignment for the reference indices. Accordingly, with this change of assignment, it now becomes acceptable to appoint any reference picture within the picture memory, as the reference picture with a second reference index as “0”. For example, as shown in FIG. 2B, it is also possible to change the reference index assignment for picture numbers, so as to allow a reference picture with a second reference index as “0” to become the reference picture having a display time immediately preceding the current picture to be coded/decoded.
In addition, in the example given in FIG. 2A and FIG. 2B, a case where a B-picture is referred to during the coding/decoding of another picture is shown, but in general, coding is more often performed under the conditions listed below.
(1) A B-picture is not to be referred to by another picture.
(2) For each block of a B-picture, motion compensation is performed with reference to two pictures which are arbitrarily chosen as reference pictures from among the N (N being a positive integer) number of P-pictures (or I-pictures) immediately preceding in display order and, a single P-picture (or I-picture) immediately subsequent in display order.
FIG. 4A is a diagram showing an example of the default setting of reference indices for a current picture to be coded, B11, where a B-picture is coded with the four preceding (N=4) P-pictures and a single subsequent P-picture as reference pictures. The difference between FIG. 4A and the example shown in FIG. 2A is that, since B-pictures are not referred to by other pictures, no reference indices are assigned to B-pictures, and only P-pictures (and I-pictures) are assigned reference indices. For example, since the picture B11 can use the four P-pictures immediately preceding it in display order and the single P-picture immediately subsequent as reference pictures, reference indices are assigned only to a picture P0, a picture P1, a picture P4, a picture P7, and a picture P10.
In the example shown in FIG. 4A, for the picture B11, the reference picture with a first reference index as “0” is the picture P7, and the reference picture with a second reference index as “0” is the picture P10. The picture P10 is located after the picture B11 in display order, and is the closest P-picture to the picture B11. Even under the above-mentioned conditions, it is possible to flexibly change the assignment of reference pictures for reference indices. FIG. 4B is a diagram showing an example of the reference indices for the picture B11 in the case where a remapping has been performed on the reference indices shown in FIG. 4A. As shown in FIG. 4B, in the H.26L standard, it is possible to re-assign the value “0” of the first reference index assigned to the picture P7 in the default setting, to the picture P1, and likewise, to re-assign the value “0” for the second reference index assigned to the picture P10 in the default setting, to the picture P0. It is possible to remap reference indices freely based on the coding efficiency, and other factors, of the subject B-picture.
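The remapping described for FIG. 4A and FIG. 4B can be sketched in Python as follows. The text states only which picture holds index “0” before and after the remapping; the rest of the default order below is derived from the default assignment rules on the assumption that the display order is P0, P1, P4, P7, B11, P10, and the positions of the remaining pictures after remapping are hypothetical. The function name move_to_index_zero is also an assumption.

```python
def move_to_index_zero(ref_list, picture):
    """Return a new list in which `picture` has reference index 0.

    The remaining pictures keep their relative order (an assumption;
    the text only specifies which picture receives index 0).
    """
    return [picture] + [p for p in ref_list if p != picture]


# Default setting for picture B11 (list position = reference index).
default_first = ["P7", "P4", "P1", "P0", "P10"]    # index 0 is P7, as in the text
default_second = ["P10", "P7", "P4", "P1", "P0"]   # index 0 is P10, as in the text

# Remapping as in FIG. 4B: P1 gets first index 0, P0 gets second index 0.
remapped_first = move_to_index_zero(default_first, "P1")
remapped_second = move_to_index_zero(default_second, "P0")
print(remapped_first)
print(remapped_second)
```

After such a remapping, the picture with a second reference index of “0” (here P0) is no longer the picture closest to B11 in display order.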
In this manner, since the assignment of reference indices to reference pictures can be changed freely, the assignment is normally changed so that the picture whose selection as a reference picture improves the coding efficiency of the current picture to be coded is given a smaller reference index. In other words, since a reference index within a block is expressed by variable length code words whose code length becomes shorter as the value becomes smaller, assigning a smaller reference index to the picture whose reference improves coding efficiency reduces the amount of code for the reference index, and thus further improves coding efficiency.
In the above-mentioned existing method, the motion vector of the reference picture with a second reference index as “0” is used for processing a block in a B-picture in direct mode. As such, during the process of coding/decoding a B-picture, it becomes necessary to store the motion vector of the reference picture with a second reference index as “0”. However, during the decoding process in particular, it is not known which reference picture is the picture with a second reference index as “0” until the processing of the bit stream of the current B-picture to be decoded is started. This is because the assignment of reference indices to the reference pictures can be changed arbitrarily through the explicit instruction of the buffer control signal (see FIG. 3, RPSL within Header). Accordingly, in the process of coding/decoding a B-picture, it becomes necessary to store the motion vectors of all reference pictures. As such, the existing method has the problem that the memory size needed for storing motion vectors grows sharply as the picture size becomes larger and as the number of reference pictures increases.
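The growth in memory size described above can be made concrete with a rough Python estimate. The block size of 16x16 pixels, the 4 bytes per motion vector, the picture size and the number of reference pictures are all assumptions made for illustration and do not appear in the text.

```python
import math


def mv_memory_bytes(width, height, num_ref_pictures,
                    block_size=16, bytes_per_mv=4):
    """Estimate the memory needed to store one motion vector per block.

    block_size and bytes_per_mv are illustrative assumptions.
    """
    blocks = math.ceil(width / block_size) * math.ceil(height / block_size)
    return blocks * bytes_per_mv * num_ref_pictures


# Storing motion vectors for a single reference picture versus for all
# of five reference pictures, for an assumed 1920x1080 picture.
one_picture = mv_memory_bytes(1920, 1080, 1)
all_pictures = mv_memory_bytes(1920, 1080, 5)
print(one_picture, all_pictures)
```

Under these assumptions the requirement scales linearly with the number of reference pictures and with the number of blocks per picture, so storing the motion vectors of every reference picture costs several times as much as storing those of a single picture.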
The present invention is conceived to solve the above-mentioned problem, and the object thereof is to provide a moving picture coding method and a moving picture decoding method for direct mode that enable a reduction in the memory size needed for motion vectors.