Various methods of compressing and recording image data have conventionally been proposed. One recently proposed method is MPEG-4 Part 10: AVC (ISO/IEC 14496-10, also called H.264; to be referred to as H.264 hereinafter).
An H.264 compression procedure will be explained with reference to FIG. 7. In FIG. 7, input image data is divided into macroblocks. For each macroblock, a subtracter 701 obtains the difference from a predicted value. The difference undergoes an integer DCT by a transformer 702 and is quantized by a quantizer 703. The quantized data is sent as difference image data to an entropy encoder 715.
At the same time, the quantized data is dequantized by a dequantizer 704 and undergoes an inverse integer DCT by an inverse transformer 705. A predicted value is added to the resultant data by an adder 706 to reconstruct an image. The reconstructed image is sent to a frame memory 707 for intra prediction. The reconstructed image is also supplied to a deblocking filter 709, undergoes deblocking filter processing, and is then sent to a frame memory 710 for inter prediction.
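The forward and reconstruction paths described above can be illustrated with a minimal sketch. It is not the H.264 algorithm: the integer DCT of the transformer 702 and its inverse are omitted, and a plain scalar quantizer stands in for the quantizer 703. All function names and sample values are hypothetical. The sketch only shows why the encoder reconstructs the image from the quantized data, namely so that it predicts from exactly the image the decoder will have.

```python
def quantize(residual, qstep):
    """Lossy step: map each residual sample to an integer level."""
    return [round(r / qstep) for r in residual]


def dequantize(levels, qstep):
    """Approximate inverse of quantize()."""
    return [level * qstep for level in levels]


def encode_block(block, predicted, qstep):
    """Return (levels, reconstructed) for one block of samples."""
    # Subtracter 701: difference from the predicted value.
    residual = [b - p for b, p in zip(block, predicted)]
    # Transformer 702 / quantizer 703 (the transform is omitted in this sketch).
    levels = quantize(residual, qstep)
    # Dequantizer 704 / inverse transformer 705.
    recon_residual = dequantize(levels, qstep)
    # Adder 706: add the predicted value back to reconstruct the image.
    reconstructed = [p + r for p, r in zip(predicted, recon_residual)]
    return levels, reconstructed


def decode_block(levels, predicted, qstep):
    """What a decoder would rebuild from the transmitted levels."""
    return [p + r for p, r in zip(predicted, dequantize(levels, qstep))]


# Hypothetical sample values.
block = [101, 104, 98, 120]
predicted = [96, 96, 96, 96]
levels, reconstructed = encode_block(block, predicted, qstep=8)
decoded = decode_block(levels, predicted, qstep=8)
```

In this sketch the reconstructed block differs from the input block because quantization is lossy, but it is identical to the decoder's output, which is the property the frame memories 707 and 710 rely on.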
An image for intra prediction that is stored in the frame memory 707 is used for intra prediction by an intra prediction circuit 708. In intra prediction, the values of pixels in adjacent, already encoded blocks of the same picture are used as predicted values.
An image for inter prediction that is stored in the frame memory 710 is made up of a plurality of pictures, which will be described later. Prediction pictures are classified into two lists, List_0 and List_1, and used for inter prediction by an inter prediction circuit 711.
After prediction, images in the frame memory 710 are updated by a memory controller 713. In inter prediction, a motion detector 712 executes motion detection against image data of a different frame to obtain an optimal motion vector. A predicted image is determined using the optimal motion vector. From the results of intra prediction and inter prediction, a switching circuit 714 selects the optimal prediction. An intra prediction mode or predicted value is supplied to the entropy encoder (e.g., variable-length encoder) 715, and encoded together with the difference image data to form an output bitstream. The above is an outline of the H.264 compression procedure; its details are disclosed in the standard specification. Other prior art references which disclose H.264 compression procedures also exist (see, e.g., Japanese Patent Laid-Open No. 2005-5844).
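The motion detection performed by the motion detector 712 can be sketched as exhaustive block matching. The text does not say how the optimal motion vector is found, so the following assumes a full search over integer displacements with the sum of absolute differences (SAD) as the cost. The function names, the search range, and the test images are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the current block at (bx, by)
    and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, size, search_range):
    """Return (cost, dx, dy) of the best integer displacement, or None."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x0, y0 = bx + dx, by + dy
            # Skip candidates that fall outside the reference picture.
            if x0 < 0 or y0 < 0 or x0 + size > width or y0 + size > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Hypothetical 8x8 reference picture in which every 2x2 block is unique.
ref = [[x * x + 10 * y * y for x in range(8)] for y in range(8)]

# Current picture: the 2x2 block at (2, 2) is copied from the reference
# block displaced by (dx, dy) = (1, -1); everything else is zero.
cur = [[0] * 8 for _ in range(8)]
for y in range(2):
    for x in range(2):
        cur[2 + y][2 + x] = ref[2 - 1 + y][2 + 1 + x]

result = full_search(cur, ref, bx=2, by=2, size=2, search_range=2)
# result == (0, 1, -1): zero cost at displacement (1, -1)
```

A practical encoder would typically use a faster search pattern and a cost that also accounts for the bits needed to code the vector. This sketch only illustrates the matching step.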
Next, H.264 inter prediction will be explained in detail with reference to FIGS. 8 to 11. In H.264 inter prediction, a plurality of pictures can be used for prediction. Two lists (List_0 and List_1) are prepared to specify a reference picture. A maximum of five reference pictures can be assigned to each list.
For P pictures, only List_0 is used, mainly to perform forward prediction. For B pictures, List_0 and List_1 are used to perform bidirectional prediction (or only forward or only backward prediction). That is, pictures mainly for forward prediction are assigned to List_0, and pictures mainly for backward prediction are assigned to List_1.
FIG. 8 shows an example of a reference list in coding. In FIG. 8, reference numeral 801 denotes image data arranged in the display order. Each rectangle shows the type of picture and a number representing its position in the display order. I15 is an I picture whose display order is 15, and undergoes only intra prediction. P18 is a P picture whose display order is 18, and undergoes only forward prediction. B16 is a B picture whose display order is 16, and undergoes bidirectional prediction. The coding order differs from the display order because data are encoded in the prediction order. In FIG. 8, data are coded in the order of I15, P18, B16, B17, P21, B19, B20, .... Reference numeral 802 denotes a reference list (List_0) which contains pictures that have already been encoded and decoded. For example, when inter prediction is performed for picture P21 (a P picture whose display order is 21), the encoded and decoded pictures in the list are referred to. In this example, P06, P09, P12, I15, and P18 are contained in the list. In inter prediction, a motion vector that gives an optimal predicted value is obtained from the reference pictures in the list and encoded for each macroblock. Pictures in the list are sequentially given reference picture numbers (separate from the numbers shown in FIG. 8) so that they can be distinguished from each other.
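The relation between the display order and the coding order in this example can be reproduced with a short sketch. It assumes the simple picture structure of FIG. 8, in which each B picture is coded immediately after the next I or P picture that follows it in display order. The function name is hypothetical, and H.264 itself permits other orderings.

```python
def coding_order(display_order):
    """Reorder picture names from display order into coding order.

    B pictures are held back until the next I or P picture (which they
    need for backward prediction) has been coded.
    """
    coded = []
    pending_b = []
    for name in display_order:
        if name.startswith("B"):
            pending_b.append(name)
        else:
            coded.append(name)        # I or P picture is coded first
            coded.extend(pending_b)   # then the B pictures that precede it
            pending_b.clear()
    # Trailing B pictures with no later I/P picture are flushed as-is here;
    # this is a simplification of the sketch.
    coded.extend(pending_b)
    return coded


display = ["I15", "B16", "B17", "P18", "B19", "B20", "P21"]
order = coding_order(display)
# order == ["I15", "P18", "B16", "B17", "P21", "B19", "B20"]
```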
After P21 has been encoded, it is decoded and added to the reference list. The oldest reference picture (in this case, P06) is deleted from the reference list. Coding then proceeds for B19, B20, and P24. FIG. 9 shows the state of the reference list at this time.
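This first-in, first-out update of List_0 can be sketched with a bounded queue. The sketch assumes the five-picture maximum stated above and models only the behavior described in this example. The helper name is hypothetical.

```python
from collections import deque

MAX_REFS = 5  # maximum number of reference pictures per list, per the text


def add_reference(ref_list, picture):
    """Append a newly decoded picture to the list.

    Returns the picture that is pushed out (the oldest one) when the
    list is already full, otherwise None.
    """
    evicted = ref_list[0] if len(ref_list) == ref_list.maxlen else None
    ref_list.append(picture)  # a full deque drops its leftmost element
    return evicted


# State of List_0 while P21 is being encoded (FIG. 8).
list0 = deque(["P06", "P09", "P12", "I15", "P18"], maxlen=MAX_REFS)

evicted = add_reference(list0, "P21")
# evicted == "P06"
# list(list0) == ["P09", "P12", "I15", "P18", "P21"]
```

In this example B19 and B20 are not added to the list, so the list shown after the update is the one available when P24 is coded.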
FIG. 10 shows how the reference list changes for each picture. In FIG. 10, the picture being coded and the contents of List_0 and List_1 are shown from top to bottom in coding order.
When a P picture (or I picture) is encoded as shown in FIG. 10, the reference list is updated and the oldest picture is deleted from the list. In this example, List_1 holds only one picture so that excessively distant backward pictures are not referred to, because backward reference to many pictures increases the amount of data that must be buffered until decoding.
In this example, pictures used for reference are I and P pictures, which are sequentially added to the reference list.
In List_1, only one picture is used for backward prediction. This is merely an example of a picture structure that is assumed to be the most commonly used. H.264 itself allows a high degree of freedom in configuring the reference list. For example, not all I and P pictures need be added to the reference list, and B pictures can also be added to it. Further, a long-term reference list, which keeps pictures in the reference list until an explicit instruction is received, is also defined. FIG. 11 shows how the reference list changes when picture P24 is not added to the reference list.
FIG. 12 shows how a macroblock of 16×16 pixels can be divided into finer macroblock partitions in H.264 inter prediction. For the divided macroblock partitions, motion vectors can be obtained by referring to independent reference pictures. An 8×8 macroblock partition can be further divided into finer sub-macroblock partitions. The sub-macroblock partitions of one 8×8 partition refer to the same reference picture, but their motion vectors are obtained independently. A configuration capable of changing the block size of motion compensation is also shown in FIG. 27 of Japanese Patent Laid-Open No. 2005-5844.
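The effect of partitioning on the amount of motion information can be sketched as follows. The partition sizes listed (16×16, 16×8, 8×16, and 8×8 for macroblock partitions; 8×8, 8×4, 4×8, and 4×4 for sub-macroblock partitions) are those defined by H.264. The function names are hypothetical, and prediction from a single list is assumed, as for a P picture.

```python
MB_PARTITIONS = {"16x16": (16, 16), "16x8": (16, 8), "8x16": (8, 16), "8x8": (8, 8)}
SUB_PARTITIONS = {"8x8": (8, 8), "8x4": (8, 4), "4x8": (4, 8), "4x4": (4, 4)}


def count_blocks(outer, inner):
    """Number of inner-size blocks that tile an outer-size block."""
    (outer_w, outer_h), (inner_w, inner_h) = outer, inner
    return (outer_w // inner_w) * (outer_h // inner_h)


def motion_info(mb_mode, sub_modes=()):
    """Count reference-picture selections and motion vectors for one macroblock.

    Each macroblock partition may select its own reference picture.
    Sub-macroblock partitions inside one 8x8 partition share that
    partition's reference picture but each have their own motion vector.
    """
    n_parts = count_blocks((16, 16), MB_PARTITIONS[mb_mode])
    if mb_mode != "8x8":
        return {"reference_selections": n_parts, "motion_vectors": n_parts}
    if len(sub_modes) != 4:
        raise ValueError("8x8 mode needs one sub-mode per 8x8 partition")
    n_vectors = sum(count_blocks((8, 8), SUB_PARTITIONS[m]) for m in sub_modes)
    return {"reference_selections": n_parts, "motion_vectors": n_vectors}


whole = motion_info("16x16")             # 1 selection, 1 vector
halves = motion_info("16x8")             # 2 selections, 2 vectors
finest = motion_info("8x8", ["4x4"] * 4) # 4 selections, 16 vectors
mixed = motion_info("8x8", ["8x8", "8x4", "4x8", "4x4"])  # 4 selections, 9 vectors
```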
The H.264 standard defines the structure and update method of the reference list and the like, but does not specify which reference picture is to be updated or when. Even a picture that is referred to frequently may be deleted from the reference list during an update merely because it is old.
For example, as shown in FIG. 13, assume that when picture P21 is to be encoded, picture P09 in the reference list is abnormal (for example, it was captured at the instant a flash fired) and is therefore rarely used for prediction, while the older picture P06 is referred to more frequently. Even in this case, the oldest picture P06 is deleted when the list is updated, and the rarely referenced picture P09 remains. The number of pictures in the reference list that are actually used for reference decreases, and the coding efficiency cannot be maximized.
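The situation of FIG. 13 can be made concrete with a small numerical sketch. The reference counts and the threshold below are invented purely for illustration and do not come from the text or from any measurement. The comparison with removing P09 is shown only to quantify the loss and is not presented as a method described here.

```python
# Hypothetical number of macroblocks that referred to each picture.
reference_counts = {"P06": 40, "P09": 3, "P12": 25, "I15": 30, "P18": 60}

# Hypothetical threshold above which a picture counts as actually used.
USEFUL_THRESHOLD = 10

# List_0 while P21 is being encoded, oldest picture first.
list0 = ["P06", "P09", "P12", "I15", "P18"]


def useful_after_removal(ref_list, removed):
    """Pictures that remain after `removed` is deleted and are actually used."""
    return [
        p for p in ref_list
        if p != removed and reference_counts[p] >= USEFUL_THRESHOLD
    ]


oldest = list0[0]                                        # "P06"
least_referenced = min(list0, key=reference_counts.get)  # "P09"

# Deleting the oldest picture leaves only three useful old pictures.
kept_by_age_rule = useful_after_removal(list0, oldest)
# For comparison only: deleting P09 instead would leave four.
kept_if_p09_removed = useful_after_removal(list0, least_referenced)
```

With these assumed counts, the age-based update keeps P12, I15, and P18 as useful references, while P09 occupies a slot without contributing to prediction.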