1. Field of the Invention
The present invention relates to an image coding apparatus and a method thereof for compression-coding image data by using inter-picture predictive coding.
2. Description of the Related Art
As technologies for high-efficiency coding of moving images, coding methods such as JPEG (Joint Photographic Experts Group) and MPEG (Moving Picture Experts Group)-1 and MPEG-2, which use motion prediction and compensation, have become commercially practical. Various manufacturers are developing and commercially producing DVD (digital versatile disk) recorders and imaging apparatuses, such as digital cameras and digital video cameras, that record video data by using these coding methods. A user can easily view and listen to the moving images using these apparatuses, personal computers, DVD players, and the like.
Digitized moving images involve a huge amount of data. Therefore, a moving-image coding method that achieves higher compression than the above-described MPEG-1, MPEG-2, etc. is desired. Recently, a coding method called H.264/MPEG-4 Part 10 (hereinafter referred to as “H.264”) has been standardized by ITU-T (International Telecommunication Union-Telecommunication Standardization Sector) and ISO (International Organization for Standardization).
Here, the selection of a reference image used for inter-picture (frame) prediction in H.264 is explained with reference to FIGS. 11 and 12. FIGS. 11A to 11C and FIGS. 12A to 12C illustrate an input image stream and the picture type of each picture that makes up the input image stream. The picture types in H.264 are I pictures, which are encoded without reference to other pictures; P pictures, which are encoded using motion prediction from a past coded picture and used as a reference; and B pictures, which are encoded using past and/or future pictures for motion prediction. In FIGS. 11 and 12, the upper part illustrates the image stream in display order, and the lower part illustrates the image stream in coding order. The number attached to each picture indicates its position in the display order; for example, in FIG. 11A, the P8 picture is the picture that is displayed ninth. Each arrow in the figures indicates a reference relationship used when coding. For example, FIG. 11A indicates that the P8 picture refers to the B0 picture, and FIG. 11B indicates that the B0 picture refers to the P2 and B7 pictures.
In MPEG-2 inter-picture prediction coding, a P picture to be coded can refer only to the I picture or P picture immediately before it, and a B picture to be coded can refer only to the I picture or P picture immediately before or after it. In H.264 inter-picture prediction coding, on the other hand, a picture to be coded can refer to an arbitrary picture of an arbitrary picture type in the image stream (although a P picture is encoded only by forward prediction). For example, as shown in FIG. 11A, the P8 picture can refer to the B0 picture, which precedes the I5 picture. As shown in FIG. 11B, the B0 picture can refer to the B7 picture, which follows the I5 picture. Thus, since H.264 can select the reference image more flexibly than MPEG-2, the inter-picture prediction precision and coding efficiency of H.264 are superior to those of MPEG-2.
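The difference between the two reference rules described above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the picture types assumed for positions 1, 3, 4 and 6 are not taken from the figures (only B0, P2, I5, B7 and P8 are named in the text). The sketch models the rules only as this description states them, works in display order, and ignores coding-order constraints.

```python
def mpeg2_allowed_references(picture_types, index):
    """Display-order indices that picture `index` may reference under the MPEG-2 rule."""
    kind = picture_types[index]
    # Only I and P pictures can serve as references in this model.
    anchors = [i for i, t in enumerate(picture_types) if t in ("I", "P")]
    before = [i for i in anchors if i < index]
    after = [i for i in anchors if i > index]
    if kind == "I":
        return []                      # intra coded, no reference
    if kind == "P":
        return before[-1:]             # nearest preceding I or P picture
    return before[-1:] + after[:1]     # B: nearest I or P picture on each side


def h264_allowed_references(picture_types, index):
    """Display-order indices that picture `index` may reference under the H.264 rule
    as described in the text (any picture type; P uses forward prediction only)."""
    kind = picture_types[index]
    if kind == "I":
        return []
    if kind == "P":
        return list(range(index))      # any earlier picture
    return [i for i in range(len(picture_types)) if i != index]


# Assumed display-order stream: B0 B1 P2 B3 B4 I5 B6 B7 P8.
stream = ["B", "B", "P", "B", "B", "I", "B", "B", "P"]
p8_mpeg2 = mpeg2_allowed_references(stream, 8)   # only I5
p8_h264 = h264_allowed_references(stream, 8)     # includes B0
b0_mpeg2 = mpeg2_allowed_references(stream, 0)   # only P2
b0_h264 = h264_allowed_references(stream, 0)     # includes B7
```

Under these assumptions, the P8 picture may use the B0 picture as a reference only under the H.264 rule, which corresponds to the reference across the I5 picture shown in FIG. 11A.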
However, because H.264 permits such flexible references, random access from an arbitrary point may become impossible. For example, consider the case shown in FIG. 11C, in which the image stream is reproduced by random access from the I5 picture, a picture (frame) at a midpoint of the image stream. When the P8 picture is decoded after reproduction has started from the I5 picture, the decoded image of the B0 picture is needed, because the P8 picture referred to the B0 picture when it was coded. However, since reproduction started from the I5 picture, the decoded image of the B0 picture cannot be obtained. Even if the B0 picture could be decoded beforehand, the B0 picture refers to the P2 and B7 pictures, so the P2 and B7 pictures would also have to be decoded beforehand. Similarly, although not shown in the figure, the P2 and B7 pictures also refer to other pictures, so those pictures would have to be decoded beforehand as well. Thus, because references that jump over the I5 picture are permitted, the data of pictures preceding the I5 picture must be decoded even when reproduction is started from the I5 picture, and it is difficult to start decoding promptly from the I5 picture.
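The chain of dependencies described above can be sketched in Python as a transitive closure over the reference relationships. The function name and the dictionary layout are hypothetical. The reference lists for the P8 and B0 pictures follow FIGS. 11A and 11B as described in the text; the references of the P2 and B7 pictures are not shown in the figures and are therefore left empty here, so the real set would be larger.

```python
def pictures_needed(start, references):
    """Return every picture that must be decoded before `start` can be decoded."""
    needed = set()
    pending = list(references.get(start, []))
    while pending:
        picture = pending.pop()
        if picture in needed:
            continue                   # already visited; also guards against cycles
        needed.add(picture)
        # Follow the references of the referenced picture as well.
        pending.extend(references.get(picture, []))
    needed.discard(start)
    return needed


# Reference relationships taken from the description of FIGS. 11A and 11B.
references = {
    "I5": [],              # intra coded, no reference
    "P8": ["B0"],          # FIG. 11A
    "B0": ["P2", "B7"],    # FIG. 11B
    "P2": [],              # not shown in the figure; omitted in this sketch
    "B7": [],              # not shown in the figure; omitted in this sketch
}

needed_for_p8 = pictures_needed("P8", references)
needed_for_i5 = pictures_needed("I5", references)
```

In this sketch the I5 picture itself needs no other picture, but the P8 picture that follows it needs the B0, P2 and B7 pictures, which is why decoding cannot start promptly from the I5 picture.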
Therefore, to solve this problem in H.264 and make random access possible, a method of periodically imposing a restriction at an I picture on the reference relationships used in motion prediction is disclosed in, for example, Japanese Laid-open No. 2003-199112. An I picture subject to this restriction is called an IDR (Instantaneous Decoder Refresh) picture in H.264.
Here, the IDR picture is described with reference to FIGS. 12A to 12C. The image streams shown in FIGS. 12A and 12B are examples in which the I5 picture is set as the IDR picture in the same streams as FIGS. 11A and 11B. When the I5 picture is set as the IDR picture, the frame memory that stores the reference images is cleared when the IDR picture is encoded. Therefore, as shown in FIG. 12A, a picture encoded after the IDR picture cannot refer to a picture encoded before the IDR picture. Similarly, as shown in FIG. 12B, a picture encoded before the IDR picture cannot refer to a picture encoded after the IDR picture. That is, as shown in FIG. 12C, no reference may cross the IDR picture in either direction.
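The effect of clearing the frame memory at the IDR picture can be sketched in Python as follows. The class and method names are hypothetical, the coding order used in the example is an assumption (the coding order of FIGS. 12A to 12C is not reproduced in the text), and the sketch simplifies matters by keeping every coded picture as a candidate reference.

```python
class FrameMemory:
    """Minimal model of the frame memory that holds candidate reference pictures."""

    def __init__(self):
        self.reference_pictures = []

    def code_picture(self, name, is_idr=False):
        """Code one picture and return the pictures it was allowed to reference."""
        if is_idr:
            # Coding an IDR picture clears all previously stored reference pictures.
            self.reference_pictures.clear()
            available = []             # the IDR picture is intra coded
        else:
            available = list(self.reference_pictures)
        self.reference_pictures.append(name)
        return available


# Assumed coding order, with the I5 picture set as the IDR picture.
memory = FrameMemory()
available = {}
for name in ["P2", "B0", "I5", "P8", "B7"]:
    available[name] = memory.code_picture(name, is_idr=(name == "I5"))
```

With the I5 picture coded as an IDR picture, the P8 picture can select its reference only from the I5 picture onward, so the B0 picture coded before the IDR picture is no longer available to it. Pictures coded before the IDR picture never see later pictures in the memory, which corresponds to the restriction in the other direction.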
With the above-described processing, it becomes unnecessary to decode the image data preceding the IDR picture when reproduction is started from the IDR picture, so random access and smooth reproduction can be realized. Furthermore, since references that jump over the IDR picture are prohibited, editing at IDR picture boundaries is made easy.
As described above, in H.264, random access reproduction can be carried out quickly by using the IDR picture, which restricts the reference relationships for inter-picture prediction. However, to allow random access from arbitrary midpoints of the image stream, many IDR pictures must be set. Since setting an IDR picture restricts the reference relationships of the images, the encoding efficiency falls as the number of IDR pictures increases.
Additionally, since the IDR picture is intra-picture coded and its code amount is therefore large, the IDR picture itself is considered to be a factor that reduces the encoding efficiency.
That is, in view of the encoding efficiency, it is desirable to keep the number of IDR pictures to the necessary minimum. In the example in which IDR pictures are set periodically, pictures that are unnecessary for random access are also set as IDR pictures, so the encoding efficiency deteriorates.