In the age of multimedia, which handles audio, video and other information such as pixel values in an integrated manner, existing information media, i.e., newspapers, magazines, televisions, radios, telephones and other means through which information is conveyed to people, have recently come to be included in the scope of multimedia. Generally, multimedia refers to a representation in which not only characters but also graphics, voices and especially pictures and the like are associated with one another. In order to include the aforementioned existing information media in the scope of multimedia, however, it is a prerequisite that such information be represented in digital form.
However, when the amount of information contained in each of the aforementioned information media is calculated as an amount of digital information, the amount of information per character is 1 to 2 bytes, whereas voice requires 64 Kbits or more per second (telephone quality) and moving pictures require 100 Mbits or more per second (current television reception quality). It is therefore not realistic for the aforementioned information media to handle such an enormous amount of information directly in digital form. For example, although video phones are already in actual use via the Integrated Services Digital Network (ISDN), which offers a transmission speed of 64 Kbps to 1.5 Mbps, it is not practical to transmit video information shot by television cameras directly through ISDN.
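The gap between these figures can be made concrete with a short calculation. The following is a minimal sketch in Python that uses only the rates quoted above; the constant and function names are chosen here for illustration and do not come from any standard.

```python
# Rates quoted in the text above (lower bounds for the source signals).
VIDEO_BPS = 100_000_000    # moving pictures, about 100 Mbit/s
VOICE_BPS = 64_000         # telephone-quality voice, 64 kbit/s
ISDN_MIN_BPS = 64_000      # slowest ISDN rate mentioned
ISDN_MAX_BPS = 1_500_000   # fastest ISDN rate mentioned


def required_compression(source_bps: float, channel_bps: float) -> float:
    """Return the factor by which a source must shrink to fit a channel."""
    return source_bps / channel_bps


if __name__ == "__main__":
    # Television-quality video over the fastest ISDN rate: roughly 67 to 1.
    print(required_compression(VIDEO_BPS, ISDN_MAX_BPS))
    # The same video over a single 64 kbit/s channel: 1562.5 to 1.
    print(required_compression(VIDEO_BPS, ISDN_MIN_BPS))
    # Telephone-quality voice already fits a 64 kbit/s channel: 1.0.
    print(required_compression(VOICE_BPS, ISDN_MIN_BPS))
```

Even at the upper ISDN rate, the video signal would have to be reduced by well over an order of magnitude before it could be transmitted.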
Against this backdrop, information compression techniques have become necessary, and moving picture compression techniques compliant with the H.261 and H.263 standards recommended by ITU-T (International Telecommunication Union, Telecommunication Standardization Sector) are employed for video phones, for example. Moreover, with information compression techniques compliant with the MPEG-1 standard, it is possible to store picture information on an ordinary music CD (compact disc) together with sound information.
Here, MPEG (Moving Picture Experts Group) is an international standard for digital compression of moving picture signals, and MPEG-1 is a standard for compressing television signal information to approximately one hundredth so that moving picture signals can be transmitted at a rate of 1.5 Mbps. Furthermore, since the transmission speed within the scope of the MPEG-1 standard is limited primarily to about 1.5 Mbps, MPEG-2, which was standardized with a view to satisfying requirements for further improved picture quality, allows data transmission of moving picture signals at a rate of 2 to 15 Mbps.
Furthermore, MPEG-4, which provides a higher compression ratio, has been standardized by the working group (ISO/IEC JTC1/SC29/WG11) that was engaged in the standardization of MPEG-1 and MPEG-2. MPEG-4 not only makes highly efficient coding possible at a low bit rate, but also employs a powerful error resilience technique that lessens the subjectively perceived degradation of picture quality even when a transmission channel error occurs. Also, ITU-T has started work on the standardization of H.26L as a next-generation picture coding method.
MPEG-1, MPEG-2 and MPEG-4 have allowed substantial improvement of compression ratio using inter predictive picture coding (hereinafter referred to as inter picture coding) for coding or decoding a differential value between a current picture to be coded or decoded and a reference picture (a picture signal of a picture which has been coded or decoded most recently) with reference to the reference picture (See, for example, ISO/IEC 13818-2 “INTERNATIONAL STANDARD Information technology—Generic coding of moving pictures and associated audio information: Video”, Dec. 15, 2000, p. 7, Intro. 4.1.1).
In general, compression of the amount of information can be realized by reducing redundancies in the temporal and spatial directions. In inter predictive picture coding, which aims at reducing the temporal redundancies, a predictive picture is created with reference to previously coded or decoded pictures (reference pictures), and the differential value between the resulting predictive picture and a current picture to be coded is coded. Here, a picture is a term representing one sheet of an image; specifically, a picture means a frame in a progressive image and a frame or a field in an interlaced image.
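The differential step described above can be illustrated as follows. This is a minimal sketch in Python, assuming pictures are small lists of rows of integer pixel values and that the differential is carried without loss; the names `residual` and `reconstruct` are hypothetical, and an actual coder would further transform and quantize the differential.

```python
from typing import List

Picture = List[List[int]]  # rows of pixel values


def residual(current: Picture, predicted: Picture) -> Picture:
    """Differential picture: current minus prediction, pixel by pixel."""
    return [[c - p for c, p in zip(crow, prow)]
            for crow, prow in zip(current, predicted)]


def reconstruct(predicted: Picture, diff: Picture) -> Picture:
    """Inverse step: prediction plus differential restores the picture."""
    return [[p + d for p, d in zip(prow, drow)]
            for prow, drow in zip(predicted, diff)]


if __name__ == "__main__":
    reference = [[10, 10, 12], [11, 10, 12]]
    current = [[10, 11, 12], [11, 10, 13]]
    diff = residual(current, reference)
    print(diff)                          # mostly zeros for similar pictures
    print(reconstruct(reference, diff))  # equals `current`
```

When consecutive pictures are similar, most differential values are zero or small, which is what makes the differential cheaper to code than the picture itself.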
As of September 2001, the proposed H.26L standard allows reference not only to the picture which has been coded or decoded immediately before a current picture to be coded or decoded, but also to an arbitrary picture selected, as a reference picture, from a plurality of pictures which have been coded or decoded prior to the current picture.
FIG. 1 illustrates the concept of a conventional moving picture coding method and moving picture decoding method. FIG. 1 is an example in which an arbitrary picture is selected as a reference picture from the 3 pictures preceding a current picture to be coded or decoded. In FIG. 1, the pictures are arranged in display order, and the picture at the far left has the earliest display time. The pictures are also coded in this order from the left. Therefore, in a bit stream, the pictures are likewise arranged in the order of Picture J1, Picture J2, Picture J3 and Picture J4. When the current picture to be coded or decoded is Picture J4, any one of Picture J1, Picture J2 and Picture J3 can be selected as a reference picture, and when the current picture to be coded or decoded is Picture J5, any one of Picture J2, Picture J3 and Picture J4 can be selected as a reference picture.
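The selection window of FIG. 1 can be expressed compactly. The following is a minimal sketch in Python, assuming pictures are numbered from 1 in coding order (Picture J1, Picture J2, and so on); the function name `candidate_references` and the `window` parameter are hypothetical.

```python
from typing import List


def candidate_references(current_index: int, window: int = 3) -> List[int]:
    """Return the numbers of the pictures that the current picture may use
    as a reference: the `window` pictures coded immediately before it."""
    first = max(1, current_index - window)
    return list(range(first, current_index))


if __name__ == "__main__":
    print(candidate_references(4))  # [1, 2, 3] -> Pictures J1, J2, J3
    print(candidate_references(5))  # [2, 3, 4] -> Pictures J2, J3, J4
```

Near the start of a sequence, fewer than three preceding pictures exist, so the sketch simply returns those that are available.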
FIG. 2 is a block diagram showing the structure of a conventional moving picture coding apparatus.
A moving picture coding apparatus 4 is an apparatus for compressing and coding an inputted picture signal Vin so as to output a coded picture signal Str in the form of a bit stream transformed by variable length coding or the like, and comprises a motion estimation unit 401, a selection unit 402, a picture signal subtraction unit 403, a coding unit 404, a decoding unit 405, an addition unit 406, a selection unit 407 and memories 408 to 410.
The motion estimation unit 401 reads out the previously coded reference pictures stored in the memories 408 to 410 and compares them with the inputted picture signal Vin so as to determine motion information MV, which indicates the reference picture Ref whose inter picture differential value (error energy) is smaller and the pixel location that makes the inter picture differential value smaller. Usually, the reference picture Ref and the pixel location that make the error energy smallest are determined, but recently a method has come into use that determines the motion information MV so as not only to reduce the error energy but also to increase the compression ratio. Note that the information identifying the reference picture Ref and the pixel location is hereinafter referred to collectively as motion information MV. The selection unit 402 outputs, as the reference picture Ref, the picture selected from among a reference picture Ref1, a reference picture Ref2 and a reference picture Ref3 stored in the memories 408 to 410, based on a reference picture instruction signal RefFrm that is a switching instruction signal. The subtraction unit 403 calculates a differential picture signal Dif between the picture signal Vin and the reference picture Ref.
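The search performed by the motion estimation unit 401 can be sketched as an exhaustive comparison over every stored reference picture and every displacement within a small range. The following Python example is an illustration under stated assumptions, not the apparatus itself: pictures are lists of rows of integers, the error energy is taken to be the sum of squared differences over one square block, and the names `error_energy` and `estimate_motion` as well as the `size` and `search` parameters are hypothetical. The compression-ratio term mentioned above is not modeled.

```python
from typing import List, Optional, Tuple

Picture = List[List[int]]


def error_energy(cur: Picture, ref: Picture, top: int, left: int,
                 dy: int, dx: int, size: int) -> int:
    """Sum of squared differences between a block of `cur` at (top, left)
    and the block of `ref` displaced by (dy, dx)."""
    total = 0
    for y in range(size):
        for x in range(size):
            diff = cur[top + y][left + x] - ref[top + dy + y][left + dx + x]
            total += diff * diff
    return total


def estimate_motion(cur: Picture, refs: List[Picture], top: int, left: int,
                    size: int, search: int) -> Optional[Tuple[int, int, int, int]]:
    """Return (energy, ref_index, dy, dx) with the smallest error energy,
    or None if no reference picture is available."""
    height, width = len(cur), len(cur[0])
    best: Optional[Tuple[int, int, int, int]] = None
    for ref_index, ref in enumerate(refs):
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                y0, x0 = top + dy, left + dx
                # Skip displacements that point outside the reference picture.
                if y0 < 0 or x0 < 0 or y0 + size > height or x0 + size > width:
                    continue
                cost = error_energy(cur, ref, top, left, dy, dx, size)
                candidate = (cost, ref_index, dy, dx)
                if best is None or candidate < best:
                    best = candidate
    return best
```

In this sketch the pair of the reference index and the displacement plays the role of the motion information MV, and the reference index alone corresponds to what the reference picture instruction signal RefFrm would select.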
The coding unit 404 codes the differential picture signal Dif and the motion information MV, which is the information for identifying a reference picture. The decoding unit 405 decodes coded data Coded which has been coded by the coding unit 404 to obtain a reconstructed differential picture signal RecDif. The addition unit 406 adds the reference picture Ref and the reconstructed differential picture signal RecDif to obtain a decoded picture signal Recon. The selection unit 407 outputs the inputted decoded picture signal Recon to any of the memories 408 to 410 as a decoded picture signal Rec1, a decoded picture signal Rec2 or a decoded picture signal Rec3, so that the decoded picture signal can be referred to when the following pictures are coded.
Next, the operation of the moving picture coding apparatus structured as above will be explained.
The picture signal Vin is inputted to the picture signal subtraction unit 403 and the motion estimation unit 401. The motion estimation unit 401 reads out the reference picture Ref1, the reference picture Ref2 and the reference picture Ref3 which are previously decoded pictures stored in the memories 408 to 410, compares them with the inputted picture signal Vin so as to determine a reference picture whose inter picture differential value is smallest, and outputs the motion information MV that is the information for identifying the reference picture and the pixel location to be referred to.
At the same time, the motion estimation unit 401 outputs a reference picture instruction signal RefFrm that is a switching instruction signal so that the selection unit 402 can select the reference picture corresponding to the motion information MV and output it as a reference picture Ref. Note that since a scene change or the like causes a loss of correlation between pictures, the compression ratio of an inter coded picture could fall below that of an intra coded picture (or an intra picture), which can be reconstructed from the coded picture signal of that picture alone. In this case, the motion estimation unit 401 indicates intra picture coding by the motion information MV and outputs a reference picture instruction signal RefFrm that causes a reference picture Ref4, whose value is always 0, to be output as the reference picture Ref. Note that the value of the reference picture Ref4 need not always be 0 and may be, for example, the average value 128 in the case of a luminance signal or an RGB color signal whose values range from 0 to 255.
Also, in order to prevent error propagation or to enable reproduction to start from a picture at some midpoint in a coded picture signal, one picture in every predetermined number of pictures needs to be intra coded so that it can be reconstructed from the coded picture signal of that picture alone. For this purpose, the motion estimation unit 401 can be forced to switch to intra picture coding by an intra picture coding instruction signal Reset given from outside.
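The switch between inter and intra picture coding described in the two preceding paragraphs can be sketched as follows. This Python example rests on assumptions that are not stated in the text: the decision criterion is taken to be a simple comparison of error energy against the constant reference picture Ref4, whereas the text only says that the compression ratio of inter coding may fall below that of intra coding. The names `constant_picture`, `error_energy` and `choose_reference` and the `ref4_value` parameter are hypothetical.

```python
from typing import List, Optional, Tuple

Picture = List[List[int]]


def constant_picture(height: int, width: int, value: int) -> Picture:
    """Build a picture whose pixels all hold `value` (the role of Ref4)."""
    return [[value] * width for _ in range(height)]


def error_energy(cur: Picture, ref: Picture) -> int:
    """Sum of squared pixel differences between two whole pictures."""
    return sum((c - r) ** 2
               for crow, rrow in zip(cur, ref)
               for c, r in zip(crow, rrow))


def choose_reference(cur: Picture, refs: List[Picture], reset: bool = False,
                     ref4_value: int = 128) -> Tuple[Optional[int], Picture]:
    """Return (index, picture). The index is None for intra picture coding,
    in which case the constant picture is returned as the reference."""
    ref4 = constant_picture(len(cur), len(cur[0]), ref4_value)
    # A Reset instruction from outside forces intra picture coding.
    if reset or not refs:
        return None, ref4
    costs = [error_energy(cur, ref) for ref in refs]
    best = min(range(len(refs)), key=lambda i: costs[i])
    # Assumed criterion: fall back to intra when no stored picture predicts
    # the current picture better than the constant picture does.
    if error_energy(cur, ref4) < costs[best]:
        return None, ref4
    return best, refs[best]
```

Because the constant picture is subtracted in place of a real reference, the rest of the coding path (subtraction, coding, decoding and addition) can remain unchanged for intra coded pictures.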
Meanwhile, the subtraction unit 403 calculates the difference between the picture signal Vin and the reference picture Ref selected by the selection unit 402, and outputs the differential picture signal Dif to the coding unit 404. Next, the coding unit 404 codes the differential picture signal Dif and the motion information MV outputted from the motion estimation unit 401, and outputs the coded picture signal Str and the coded data Coded. Here, the coded data Coded is the data necessary for reconstructing a picture, and the coded picture signal Str is a bit stream obtained by transforming the coded data Coded by variable length coding or the like.
The decoding unit 405 decodes the coded data Coded and outputs the reconstructed differential picture signal RecDif to the addition unit 406. The addition unit 406 adds the reconstructed differential picture signal RecDif and the reference picture Ref selected by the selection unit 402, and outputs the decoded picture signal Recon to the selection unit 407. The selection unit 407 outputs the decoded picture signal Recon to any of the memories 408 to 410 as a decoded picture signal Rec1, a decoded picture signal Rec2 or a decoded picture signal Rec3 so that the decoded picture signal Recon can be referred to as a reference picture for coding the following pictures. In this example, the selection unit 407 switches the memories so that the picture which has been stored in any of these memories at the earliest time is overwritten by a new decoded picture signal Recon.
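The memory switching performed by the selection unit 407 amounts to a ring of three slots written in turn. The following is a minimal sketch in Python; the class name `ReferenceMemories` and its members are hypothetical, and pictures are represented by arbitrary objects.

```python
from typing import List, Optional


class ReferenceMemories:
    """Fixed number of picture memories; each new picture overwrites the
    one that was stored at the earliest time."""

    def __init__(self, count: int = 3) -> None:
        self.slots: List[Optional[object]] = [None] * count
        self.next_slot = 0  # slot holding the oldest (or no) picture

    def store(self, picture: object) -> int:
        """Write `picture` into the oldest slot and return that slot index."""
        slot = self.next_slot
        self.slots[slot] = picture
        self.next_slot = (slot + 1) % len(self.slots)
        return slot


if __name__ == "__main__":
    memories = ReferenceMemories()
    for name in ["J1", "J2", "J3", "J4"]:
        memories.store(name)
    print(memories.slots)  # ['J4', 'J2', 'J3']: J1 has been overwritten
```

Since the slots are always written in the same cyclic order, the slot about to be written is by construction the one that was written longest ago.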
FIG. 3 is a block diagram showing the structure of a conventional moving picture decoding apparatus.
A moving picture decoding apparatus 5 is an apparatus for decoding a coded picture signal Str which has been coded by the moving picture coding apparatus 4.
A decoding unit 501 decodes the inputted coded picture signal Str and outputs a reconstructed differential picture signal RecDif and motion information MV. A motion reconstruction unit 502 decodes the motion information MV and outputs a reference picture instruction signal RefFrm. The operations of a selection unit 503, a selection unit 505 and memories 506 to 508 are the same as those of the selection unit 402, the selection unit 407 and the memories 408 to 410 of the moving picture coding apparatus 4 shown in FIG. 2. An addition unit 504 adds the reconstructed differential picture signal RecDif and the reference picture Ref to output a decoded picture signal Vout (which corresponds to the decoded picture signal Recon in FIG. 2).
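The correspondence between the decoded picture signal Vout and the decoded picture signal Recon can be illustrated with a round trip. The following Python sketch makes several assumptions that are not in the text: pictures are flat lists of integers, the coding unit is modeled as uniform quantization with a hypothetical step `STEP`, and motion information and variable length coding are omitted.

```python
from typing import List, Tuple

STEP = 4  # hypothetical quantization step, chosen only for illustration


def encode_picture(vin: List[int], ref: List[int]) -> Tuple[List[int], List[int]]:
    """Model of units 403 to 406: return (Coded, Recon)."""
    dif = [v - r for v, r in zip(vin, ref)]          # subtraction unit
    coded = [round(d / STEP) for d in dif]           # coding unit (lossy)
    rec_dif = [c * STEP for c in coded]              # local decoding unit
    recon = [r + d for r, d in zip(ref, rec_dif)]    # addition unit
    return coded, recon


def decode_picture(coded: List[int], ref: List[int]) -> List[int]:
    """Model of units 501 and 504: return Vout."""
    rec_dif = [c * STEP for c in coded]              # decoding unit
    return [r + d for r, d in zip(ref, rec_dif)]     # addition unit


if __name__ == "__main__":
    ref = [98, 100, 100, 100]
    vin = [100, 104, 97, 90]
    coded, recon = encode_picture(vin, ref)
    vout = decode_picture(coded, ref)
    print(vout == recon)  # True: both sides hold the same picture
    print(vout == vin)    # False here, because the coding is lossy
```

The sketch also suggests why the coding apparatus stores Recon rather than Vin in its memories: only Recon is available to the decoding apparatus, so predicting from it keeps both sides in step.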
Note that in the above-mentioned moving picture coding apparatus 4 and moving picture decoding apparatus 5, motion compensation units not shown in the figures are provided on the output sides of the selection unit 402 and the selection unit 503, respectively. These units perform motion compensation by interpolating the pixel values of the reference picture outputted from the memory so as to generate pixel values at fractional pixel locations, for example, with ½ pixel location precision.
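Interpolation with ½ pixel location precision can be sketched for a single row of pixels. The following Python example assumes the simplest possible filter, a rounded average of the two neighbouring pixels; the interpolation filters actually prescribed by individual standards may differ, and the function name `half_pel_row` is hypothetical.

```python
from typing import List


def half_pel_row(row: List[int]) -> List[int]:
    """Return the row at half-pixel resolution: original pixels at even
    positions, rounded averages of adjacent pixels at odd positions."""
    out: List[int] = []
    for i, value in enumerate(row):
        out.append(value)
        if i + 1 < len(row):
            # Adding 1 before the integer division rounds halves upward.
            out.append((value + row[i + 1] + 1) // 2)
    return out


if __name__ == "__main__":
    print(half_pel_row([10, 20, 25]))  # [10, 15, 20, 23, 25]
```

Applying the same operation along columns would give half-pixel positions in the vertical direction as well.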
In the above-mentioned conventional moving picture coding apparatus and moving picture decoding apparatus, however, no distinction at all is made as to whether the reference picture is an intra coded picture or one of the inter coded pictures following the intra coded picture. For example, suppose that, in FIG. 1 illustrating the concept of the conventional moving picture coding method and moving picture decoding method, Picture J2 is an intra coded picture and Picture J1, Picture J3, Picture J4 and Picture J5 are inter coded pictures. Even so, Picture J1 can be referred to as a reference picture for Picture J4. If Picture J4 refers to Picture J1 as a reference picture, Picture J4 refers to a picture that precedes the intra coded Picture J2.
However, when reproduction is started from a picture at some midpoint, for example, when decoding and reproduction are started from the intra coded Picture J2 at a midpoint in a coded picture signal, the decoding of Picture J4 requires reference to the decoded Picture J1, which has not been decoded because it precedes the starting point. As a result, Picture J4 and the pictures following it cannot be correctly decoded.
Similarly, if a stream error occurs at a midpoint in a coded picture signal and Picture J1 cannot be correctly decoded due to the error, Picture J4 and the pictures following it cannot be correctly decoded, because the decoding of Picture J4 requires reference to Picture J1, even though the intra coded Picture J2 itself can be correctly decoded.
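Both failure cases can be reproduced with a small dependency check. The following Python sketch is an illustration under assumptions: each picture is taken to refer to at most one reference picture, Picture J3 is assumed to refer to Picture J2 and Picture J5 to Picture J4 (the text specifies only that Picture J4 refers to Picture J1), and a hypothetical intra coded picture numbered 0 is placed before Picture J1 so that Picture J1 has something to refer to. The function name `decodable_pictures` is likewise hypothetical.

```python
from typing import AbstractSet, Dict, List, Optional


def decodable_pictures(references: Dict[int, Optional[int]], start: int,
                       lost: AbstractSet[int] = frozenset()) -> List[int]:
    """Return the pictures that decode correctly when decoding begins at
    picture `start` and the pictures in `lost` are hit by stream errors.
    `references` maps a picture number to its reference, None for intra."""
    good = set()
    for number in sorted(references):
        if number < start or number in lost:
            continue
        ref = references[number]
        # An intra picture needs nothing; an inter picture needs a reference
        # that has itself been correctly decoded.
        if ref is None or ref in good:
            good.add(number)
    return sorted(good)


if __name__ == "__main__":
    refs = {0: None, 1: 0, 2: None, 3: 2, 4: 1, 5: 4}
    print(decodable_pictures(refs, start=2))            # [2, 3]
    print(decodable_pictures(refs, start=0, lost={1}))  # [0, 2, 3]
```

In both runs the intra coded Picture J2 decodes correctly, yet Picture J4 and Picture J5 do not, because the chain of references crosses back over the intra coded picture.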
The present invention has been conceived in view of the above-mentioned circumstances, and aims at providing a moving picture coding method, a moving picture decoding method and the like which make it possible to start reproduction from an intra coded picture at a midpoint in a coded picture signal, and to reproduce the pictures following the intra coded picture without error even if a stream error occurs.