As techniques for coding a moving image signal with a low bit rate, a high compression rate, and high quality to generate coded data and for decoding a coded moving image, H.261 and H.263 standardized by the ITU (International Telecommunication Union), MPEG-1, MPEG-2, and MPEG-4 specified by the ISO (International Organization for Standardization), and so forth are widely used as international standards.
It is known that H.264 (see Non-Patent Document 1) standardized jointly by the ITU and the ISO achieves better compression efficiency and image quality, compared with conventional moving image coding techniques.
In these moving image coding techniques, inter-frame predictive coding, which utilizes temporal correlation between individual frames, is widely used in order to efficiently compress a moving image signal.
In inter-frame predictive coding, an image signal of a previously coded frame is used to predict an image signal of a current frame, and a prediction error signal between the predicted signal and the current signal is coded.
In ordinary moving images, since there is a strong correlation between image signals of temporally close frames, this technique is effective in improving compression efficiency.
Moving image coding techniques such as MPEG (Moving Picture Experts Group)-1, MPEG-2, MPEG-4, and H.264 use a combination of the following picture types to code moving images:
        an I picture (Intra picture: intra-frame coded image), which does not use inter-frame predictive coding;
        a P picture (Predictive picture: unidirectionally predictive coded image), which uses inter-frame predictive coding based on one previously coded frame; and
        a B picture (Bidirectionally predictive coded picture), which uses inter-frame predictive coding based on two previously coded frames.
In decoding, an I picture can be decoded from its single frame alone. However, since a P picture and a B picture need image data of other frames for inter-frame prediction, they cannot be decoded from a single frame alone.
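The decodability rule above can be modeled with a short illustrative sketch (not part of any standard); the mapping of picture types to reference-frame counts is taken directly from the description above:

```python
# Illustrative sketch: which picture types can be decoded standalone.
# As described above: I pictures use no inter-frame prediction,
# P pictures reference one previously coded frame, B pictures two.
REFERENCE_COUNT = {"I": 0, "P": 1, "B": 2}

def is_standalone_decodable(picture_type: str) -> bool:
    """A picture is decodable from a single frame only if it needs
    no previously coded frames for inter-frame prediction."""
    return REFERENCE_COUNT[picture_type] == 0
```

This is why special reproduction schemes described below restrict themselves to I pictures.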
FIG. 13 schematically shows an example of the structure of pictures used in moving image coding. Each of the rectangles represents one frame, and the picture type and display order of each frame are shown below (for example, B5 is the fifth frame in the display order and is coded as a B picture). Thus, a moving image has conventionally been coded by suitably combining these I, P, and B pictures having different characteristics with each other.
As shown in FIG. 13, conventionally, when special reproduction such as high-speed reproduction or high-speed reverse reproduction (also referred to as reverse high-speed reproduction) is conducted on a bit stream of a coded moving image, an I-picture bit stream alone, which can be decoded by itself, is extracted from the bit stream, so as to reproduce the moving image.
As a related technique, FIGS. 14A to 14C schematically show an example of the operation of a method for obtaining a high-speed reproduction bit stream and a high-speed reverse reproduction bit stream. FIG. 15 shows a typical example of the structure of an apparatus realizing the method of FIGS. 14A to 14C.
In FIG. 15, a bit stream is supplied to a stream extraction unit 101. The stream extraction unit 101 extracts an I-picture bit stream alone from the supplied bit stream and supplies the extracted stream to a stream arrangement unit 102.
The stream arrangement unit 102 arranges the supplied I-picture bit stream as needed, and outputs the stream to the outside.
For high-speed reproduction, the stream extraction unit 101 sequentially extracts an I-picture bit stream alone from the bit stream of FIG. 14A. The extracted I pictures are sequentially arranged to form a bit stream, whereby a high-speed reproduction bit stream is obtained (I0, I6, I12, I18 . . . in FIG. 14B, for example). For high-speed reproduction, no arrangement processing is carried out by the stream arrangement unit 102.
For high-speed reverse reproduction, similarly, the stream extraction unit 101 extracts I pictures alone from the bit stream, and the stream arrangement unit 102 arranges the extracted I pictures in the reverse display order and then outputs the pictures. In this way, a high-speed reverse reproduction bit stream is obtained ( . . . I18, I12, I6, I0 of FIG. 14C, for example).

By developing the above method, for example, Patent Document 1 discloses a technique in which only the minimum I pictures necessary for display are extracted to generate a high-speed reproduction stream. Namely, according to the disclosure of Patent Document 1, for fast-forward reproduction, display frames are specified at certain intervals; if a specified frame is an independent frame (I picture), the frame is used for display, whereas if a specified frame is a dependent frame (a P picture or a B picture), the independent frame (I picture) temporally closest to the specified frame is used for display.
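The extraction and arrangement steps performed by the stream extraction unit 101 and the stream arrangement unit 102 can be sketched, for illustration only, as simple list operations; modeling the bit stream as a list of (picture_type, display_order) tuples is an assumption made here, not a representation used by the apparatus:

```python
# Illustrative sketch of the FIG. 15 structure. A bit stream is modeled
# as a list of (picture_type, display_order) tuples; this representation
# is an assumption for illustration only.

def extract_i_pictures(bit_stream):
    """Stream extraction unit 101: keep only the I pictures,
    which can be decoded by themselves."""
    return [pic for pic in bit_stream if pic[0] == "I"]

def high_speed_forward(bit_stream):
    """High-speed reproduction: I pictures kept in display order
    (the stream arrangement unit performs no rearrangement)."""
    return extract_i_pictures(bit_stream)

def high_speed_reverse(bit_stream):
    """High-speed reverse reproduction: stream arrangement unit 102
    arranges the extracted I pictures in reverse display order."""
    return list(reversed(extract_i_pictures(bit_stream)))
```

For a stream structured like FIG. 14A, the forward case yields the I pictures in display order and the reverse case the same pictures reversed, mirroring FIGS. 14B and 14C.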
The above method is also applicable to special reproduction of a bit stream coded with the moving image coding technique of H.264 standardized recently. However, as compared with conventional coding standards such as MPEG-1, MPEG-2, and MPEG-4, H.264 allows many coding variations, and thus, there are cases in which conventional methods cannot be applied. Such cases will be hereinafter described.
In H.264, frame coding, which is suitable for a progressive image (an image in progressive scan format), and field coding, which is suitable for an interlaced image (an image in interlace scan format), can be appropriately selected and used.
As shown in FIG. 16A, in frame coding, the entire image is coded as a single picture (frame picture). In contrast, as shown in FIG. 16B, in field coding, an image is divided into odd and even lines, which are then coded as separate pictures (field pictures).
Further, in H.264, in the case of field coding, each of the two field pictures forming an image can be coded as a different picture type (I picture, P picture, or B picture).
FIG. 17 shows an example of a bit stream coded by field coding. In FIG. 17, each of the rectangles represents a single field picture, and two field pictures form a single frame. An arrow indicates a reference relationship of inter-frame prediction; this example shows that a field picture P13 uses an I12 for inter-frame prediction. In FIG. 17, reference relationships of inter-frame prediction between pictures other than the P13 are not shown.
In the example of FIG. 17, the I12 and the P13 form a single frame, and while the I12 is an I picture, the P13 is a P picture.
The structure shown in FIG. 15 cannot generate a high-speed reproduction bit stream and a high-speed reverse reproduction bit stream from the bit stream of FIG. 17.
That is, based on the structure shown in FIG. 15, the stream extraction unit 101, which extracts an I-picture bit stream alone from an input bit stream, extracts only a stream of the I12 and the I24 from the bit stream shown in FIG. 17. Each of these pictures includes image information for only a single field. Thus, even when the extracted stream is decoded, decoding results forming one screen cannot be obtained.
Thus, if a high-speed reproduction bit stream is generated based on the structure shown in FIG. 15, decoding the high-speed reproduction bit stream may cause problems; for example:
        only one field (odd lines or even lines) of an image is updated, or
        an image having a mixture of a past field and a current field is output.
However, by developing the above method described with reference to FIG. 15 and the like, it is easily conceivable that the above problems could be solved by extracting, in the case of field coding, not merely an I picture (field picture) but the two field pictures including the I picture.
In this case, the stream extraction unit 101 of FIG. 15 extracts not only the I12 and the I24 but also the I12 and the P13 as well as the I24 and the P25 shown in FIG. 17 (approach 2).
With the use of this approach 2, the above problems can be solved in the case of the example shown in FIG. 17. That is, if the stream extraction unit 101 extracts the I12 and the P13 and decodes these two pictures, decoding results forming one screen can be obtained. Thus, a high-speed reproduction bit stream and a high-speed reverse reproduction bit stream can be created.
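Approach 2 can be sketched as follows, under the assumption (made for illustration only) that a field-coded stream is a list of (picture_type, display_order) tuples in which consecutive pairs of pictures form one frame:

```python
# Illustrative sketch of approach 2: when the stream is field coded,
# extract both field pictures of any frame that contains an I picture.
# The pairing convention (pictures at indices 2k and 2k+1 form one
# frame) is an assumption made for illustration.

def extract_i_frame_pairs(field_stream):
    """Return every field-picture pair whose frame contains an
    I picture, e.g. (I12, P13) rather than I12 alone."""
    pairs = []
    for k in range(0, len(field_stream) - 1, 2):
        first, second = field_stream[k], field_stream[k + 1]
        if first[0] == "I" or second[0] == "I":
            pairs.append((first, second))
    return pairs
```

For a stream structured like FIG. 17, this yields the pairs (I12, P13) and (I24, P25), each of which can form one complete screen when decoded.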
However, this approach is effective only because, in the example of FIG. 17, the P13 uses only the I12 for inter-frame prediction. The approach 2 does not fundamentally solve the above problems, for the reason described below.
In H.264, there are many variations in inter-frame prediction. For example, such inter-frame prediction as shown in FIG. 18 is possible. That is, a field picture B13 uses not only an I12 but also a P18 and a P19 for inter-frame prediction.
In such a case, even when the stream extraction unit 101 extracts the I12 and the B13 in accordance with the approach 2, the B13 cannot be decoded because the image data necessary for its inter-frame prediction (the P18 and the P19 in the example of FIG. 18) is not available. Thus, decoding results forming one screen cannot be obtained.
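The failure of approach 2 in the FIG. 18 case comes down to a reference-dependency check: a field pair is decodable from the extracted stream only if every picture the pair references is itself inside the pair. A sketch, in which the explicit per-picture reference lists are an assumption made for illustration:

```python
# Illustrative sketch: check whether an extracted field pair can be
# decoded on its own. Each picture is modeled as (name, references),
# where references lists the pictures used for inter-frame prediction;
# this explicit representation is an assumption for illustration.

def pair_is_decodable(pair):
    """The pair is decodable by itself only if every referenced
    picture is contained in the pair."""
    names = {name for name, _ in pair}
    return all(ref in names for _, refs in pair for ref in refs)
```

Under this check, the FIG. 17 pair (I12, P13) passes because P13 references only I12, while the FIG. 18 pair (I12, B13) fails because B13 also references P18 and P19, which are outside the extracted stream.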
As a moving image decoding apparatus (method) to solve the above problems, for example, Patent Document 2 discloses an image decoding apparatus (method) comprising a control means for controlling a decoding means. In a high-speed reproduction mode, the decoding means conducts decoding processing only on intra-coded image data coded by frame coding, on coded image data of one of a pair of fields in the intra-coded image, or on intra-field coded image data coded by field coding. According to the invention disclosed in Patent Document 2, in a high-speed reproduction mode, only a frame-coded I picture, one of the pair of fields in a frame-coded I picture, or a field-coded I picture is decoded, and the obtained one-field decoded image is copied onto the other field to form one screen.
Relating to the above background art, Patent Document 3 discloses a recording and reproducing apparatus that aims to reduce the amount of information by extracting I pictures from an MPEG stream, re-encoding the I pictures to generate an intermittent stream, recording the intermittent stream on a recording medium, reading the stream from the recording medium to decode it, and conducting re-encoding with the use of the I pictures alone.

Patent Document 4 discloses a structure for obtaining a frame image by synthesizing an image of odd fields and an image of even fields of interlace image data (a structure for directly processing an image). Further, Patent Document 5 discloses a structure in which a copy of one of a pair of fields forming one frame of a video signal (top field data, for example) is used as the other field data (bottom field image data, for example) to generate a frame image, and the frame image is compressed and coded.

Patent Document 6 discloses a structure in which a header analysis means analyzes coding mode information and outputs coding mode analysis information. The header analysis means identifies whether an inputted video stream is a bit stream obtained by coding a progressive scanning image with frame coding or a bit stream obtained by coding an interlace scanning image with frame or field coding, and then outputs the result as coding mode analysis information to a decoding control means. When the decoding control means decodes a first bit stream, obtained by coding an interlace scanning image with frame or field coding, based on the coding mode analysis information, the means alternately outputs a first timing signal indicating the start of decoding the top field and a second timing signal indicating the start of decoding the bottom field.
When the decoding control means decodes a second bit stream, obtained by coding a progressive scanning image with frame coding, the means outputs a third timing signal indicating the start of decoding in the middle of a display start signal. Each of the inventions disclosed in the above Patent Documents 1 to 6 is utterly different from the present invention to be described below in any of the aspects of its object, constitution, and operation and effect.
Patent Document 1:
JP Patent Kokai Publication No. JP-A-05-344494
Patent Document 2:
JP Patent Kokai Publication No. JP-P2004-328634A
Patent Document 3:
JP Patent Kokai Publication No. JP-P2003-219362A
Patent Document 4:
JP Patent Kokai Publication No. JP-P2004-015700A
Patent Document 5:
JP Patent Kokai Publication No. JP-P2004-040516A
Patent Document 6:
JP Patent Kokai Publication No. JP-A-11-041591
Non-Patent Document 1:
ITU-T Recommendation H.264 “Advanced video coding for generic audiovisual services,” March 2005