Among image processing technologies for moving images, one transcodes first encoding information obtained by encoding image information with an inter-prediction (an inter-picture prediction) into second encoding information by using a motion vector of the first encoding information. In such a technology, when the motion vector of the first encoding information cannot be reused, for example, the second encoding information is generated by switching a prediction mode of an encoding target block that is to be encoded from the inter-prediction to an intra-prediction (an intra-picture prediction) based on a prediction mode of an adjacent block. Another of the technologies selects a prediction mode of an encoding-target macro block, from among the same prediction modes as those of an adjacent encoded macro block and a macro block in the last image corresponding to an adjacent macro block that is not encoded. Still another of the technologies estimates a motion vector for a frame or a field to be interpolated, by using a motion vector between frames or fields and generates a pixel of the frame or the field by using the estimated motion vector.
In a system for encoding a moving image by using inter-prediction, the following process is performed by a transmission-side apparatus that transmits moving image data. Motion vector data representing a motion from a past reference image to an encoding target image to be encoded is generated. A predicted image of the encoding target image is generated from the reference image by using the motion vector data. Differential data representing the difference between the predicted image and the actual encoding target image is generated. The differential data and the motion vector data are then encoded and transmitted. At a reception-side apparatus, the decoding target image to be decoded is reproduced by using the received motion vector data and the differential data. The encoding and decoding processes in this case are performed in units of blocks (macro blocks) obtained by dividing an original image of one frame.
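The transmission-side and reception-side steps described above can be illustrated with a minimal sketch for a single block. The function names (`predict_block`, `encode_block`, `decode_block`), the list-of-lists image representation, and the integer-pixel motion vector are assumptions made for illustration only; motion search, transform, quantization, and entropy coding are omitted, and the motion vector is assumed to keep the block inside the reference image.

```python
def predict_block(reference, top, left, mv, size):
    """Build the predicted block by copying from the reference image,
    displaced by the motion vector mv = (dy, dx)."""
    dy, dx = mv
    return [[reference[top + dy + r][left + dx + c] for c in range(size)]
            for r in range(size)]


def encode_block(target, reference, top, left, mv, size):
    """Transmission side: return the motion vector and the differential data
    (target block minus predicted block)."""
    predicted = predict_block(reference, top, left, mv, size)
    residual = [[target[top + r][left + c] - predicted[r][c] for c in range(size)]
                for r in range(size)]
    return mv, residual


def decode_block(reference, top, left, mv, residual, size):
    """Reception side: rebuild the block from the reference image, the
    motion vector, and the differential data."""
    predicted = predict_block(reference, top, left, mv, size)
    return [[predicted[r][c] + residual[r][c] for c in range(size)]
            for r in range(size)]


# Hypothetical 4x4 reference image; the target is the reference shifted
# right by one pixel, with one pixel changed so the residual is not all zero.
reference = [[10 * r + c for c in range(4)] for r in range(4)]
target = [[row[0]] + row[:-1] for row in reference]
target[1][1] += 5

mv, residual = encode_block(target, reference, top=1, left=1, mv=(0, -1), size=2)
decoded = decode_block(reference, top=1, left=1, mv=mv, residual=residual, size=2)
```

In this sketch the decoder reproduces the target block exactly because the differential data is carried without loss; a practical encoder would quantize it.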
Among methods of displaying a moving image is an interlacing method in which an image of one frame is divided into a field constituted by odd-numbered scanning lines and a field constituted by even-numbered scanning lines and these fields are displayed in an alternating manner. In the interlacing method, there are multiple candidate reference images for an encoding target image, and a suitable reference image is selected from among the candidate reference images. The candidate reference images are assigned index numbers, respectively. In an image of one frame, a field located spatially on the upper side may be referred to as “top field”, and a field located spatially on the lower side may be referred to as “bottom field”.
FIG. 9 is an explanatory diagram of a procedure of assigning an index number to a candidate reference image. For example, as depicted in FIG. 9, when an image 1 of an encoding target block, which is denoted by Pt0, is a top field image, “0” is assigned as an index number refIdxL0 to a candidate reference image 2, which is the closest image of the same field type, that is, the top field, and is denoted by Pt1. An index number of “1” is assigned to a candidate reference image 3, which is the closest image of a different field type with respect to the image 1 of the encoding target block, that is, the bottom field, and is denoted by Pb1. The symbols Ptn and Pbn indicate images of the top field and the bottom field, respectively. The images denoted by Ptn and Pbn with the same n are included in the same frame.
Fields located spatially on the same side, either upper or lower side, such as one top field and another top field, may be said to be “identical in parity”, and fields located spatially on differing sides, such as a top field and a bottom field, may be said to “differ in parity”. An index number of “2” is assigned to a candidate reference image 4, which is the next closest image identical in parity and is denoted by Pt2. An index number of “3” is assigned to a candidate reference image 5, which is the next closest image differing in parity and is denoted by Pb2. In this manner, the index numbers are alternately assigned to an image of the same parity and an image of a different parity, beginning with an image of the same parity and in ascending order of the distance from the image 1 of the encoding target block to the images assigned the index numbers. The same is true for a case where the image 1 of the encoding target block is an image of the bottom field.
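The alternating assignment described above can be sketched as follows. The function name `assign_ref_indices` and the representation of a candidate as a `(label, parity)` pair are assumptions for illustration; the candidate list is assumed to be given in ascending order of distance from the image of the encoding target block.

```python
from itertools import zip_longest


def assign_ref_indices(target_parity, candidates):
    """Assign index numbers alternately to same-parity and different-parity
    candidates, beginning with the same parity.

    candidates: list of (label, parity) pairs, closest first.
    Returns a dict mapping index number -> label.
    """
    same = [label for label, parity in candidates if parity == target_parity]
    diff = [label for label, parity in candidates if parity != target_parity]

    ordered = []
    for s, d in zip_longest(same, diff):
        if s is not None:
            ordered.append(s)
        if d is not None:
            ordered.append(d)
    return dict(enumerate(ordered))


# Candidates for the top field image Pt0 of FIG. 9, closest first.
candidates = [("Pb1", "bottom"), ("Pt1", "top"), ("Pb2", "bottom"), ("Pt2", "top")]
ref_idx = assign_ref_indices("top", candidates)
# ref_idx == {0: "Pt1", 1: "Pb1", 2: "Pt2", 3: "Pb2"}
```

Although Pb1 is closer in time to Pt0 than Pt1 is, the same-parity image Pt1 receives index “0” because the assignment begins with the same parity.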
When a moving image is encoded by using inter-prediction in the interlacing method, for example, as depicted in FIG. 9, multiple candidate reference images may exist for the image Pt0 of the encoding target block, such as the candidate reference images Pt1, Pb1, Pt2, and Pb2. In an actual encoding process, a suitable candidate reference image is selected from among the candidate reference images. Therefore, a reference index indicating the selected reference image is encoded together with the differential data and the motion vector data. On the other hand, an algorithm may be determined in advance such that a closest image identical in parity is set as the reference image. When an algorithm on an encoding side and an algorithm on a decoding side are this type of algorithm, the reference index is implicitly set to “0” on the decoding side even without any notification of the reference index from the encoding side to the decoding side. Therefore, as the encoding can be performed without including the reference index, the encoding efficiency is improved as compared to the case of including the reference index in the encoding.
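The difference between explicit and implicit signaling of the reference index can be sketched as below. This is a simplified illustration, not the syntax of any actual standard: the dictionary keys and the functions `pack_block` and `unpack_ref_idx` are hypothetical, and the flag `implicit_ref0` stands for an algorithm agreed in advance by the encoding side and the decoding side.

```python
def pack_block(ref_idx, mv, residual, implicit_ref0):
    """Return the elements to transmit for one block.

    When implicit_ref0 is True, both sides agree that the closest image
    identical in parity (index 0) is the reference, so the reference
    index is not transmitted.
    """
    if implicit_ref0:
        if ref_idx != 0:
            raise ValueError("implicit signaling requires reference index 0")
        return {"mv": mv, "residual": residual}
    return {"ref_idx": ref_idx, "mv": mv, "residual": residual}


def unpack_ref_idx(elements):
    """Decoding side: use the transmitted index, or 0 when none was sent."""
    return elements.get("ref_idx", 0)


explicit = pack_block(1, (0, -1), [[0]], implicit_ref0=False)
implicit = pack_block(0, (0, -1), [[0]], implicit_ref0=True)
```

The implicit form carries one element fewer per block, which is the source of the efficiency gain described above; it is only available when the selected reference image has index “0”.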
ITU-T H.264/ISO/IEC MPEG-4 AVC, one of the standards for moving-image encoding systems, provides macro block types referred to as “P8×8ref0” and “P_SKIP”, which perform encoding without including any reference index. Because omitting the reference index increases the compression ratio of the moving image data, it is desirable to select P8×8ref0 or P_SKIP as often as possible when encoding a moving image. Note that ITU-T stands for International Telecommunication Union Telecommunication Standardization Sector, ISO stands for International Organization for Standardization, IEC stands for International Electrotechnical Commission, MPEG-4 stands for Moving Picture Experts Group phase 4, and AVC stands for Advanced Video Coding. For examples, refer to Japanese Laid-Open Patent Publication Nos. 2006-295734, 2009-55542, and 2003-163894.
However, with the conventional encoding technology in the interlacing method, as described below, it is difficult to select a macro block type configured to perform encoding without including the reference index, and the encoding efficiency is consequently low. In most cases of still scenes, the image of the encoding target block is the same as the closest image identical in parity, in both the top field and the bottom field. Therefore, the closest image identical in parity is likely to be selected as the reference image. When the closest image identical in parity is selected, the reference index is “0”, and P8×8ref0 or P_SKIP mentioned above therefore tends to be selected.
On the other hand, in most cases of scenes with motion, the image of the encoding target block differs from the closest image identical in parity, in both the top field and the bottom field. This tendency is conspicuous when a Group Of Pictures (GOP) structure is an I picture structure (not depicted in FIG. 10) or, as depicted in FIG. 10, an IBBP structure in which two B pictures are sandwiched by two P pictures. In motion-compensated prediction, an image of the same picture type as the image of the encoding target block is employed as the reference image.
Therefore, when an image 11 of the encoding target block is an image Pb0 of the P picture in the bottom field, an image of the P picture closest to the image Pb0 in the bottom field is an image 12 denoted by Pb1. As the image Pb1 is apart from the image Pb0 by a time corresponding to a six-field period, a picture is likely to change greatly while making a transition from the image Pb1 to the image Pb0.
Meanwhile, a closest image 13 of the P picture in the top field, which is denoted by Pt0, is apart from the image Pb0 by only a time corresponding to a one-field period. Therefore, a change of a picture in this case is smaller than in the case mentioned above, where the image Pb1 is apart from the image Pb0 by the time corresponding to a six-field period. Although the interval is not the same as in the example of the image in the bottom field, even when the image of the encoding target block is an image in the top field, the closest image differing in parity is apart from the image of the encoding target block by a shorter time than the closest image identical in parity is. Therefore, in the case of scenes with motion, selecting the closest image differing in parity as the reference image is highly likely to improve the accuracy of the prediction, and the closest image differing in parity is thus likely to be selected as the reference image. In this case, because the reference index is “1”, P8×8ref0 or P_SKIP mentioned above is not selected.
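The field intervals discussed above can be checked with a short calculation. The sketch below rests on assumptions that are not stated explicitly in the text: the top field of each frame precedes its bottom field by one field period, only temporally preceding P-picture fields serve as candidates, and a P picture occurs every three frames (the IBBP structure). The names `field_time` and `closest_reference_distances` are hypothetical.

```python
def field_time(frame, parity):
    """Time of a field in field periods, assuming top field first."""
    return 2 * frame + (0 if parity == "top" else 1)


def closest_reference_distances(target_frame, target_parity, p_interval=3):
    """Return (same_parity_distance, different_parity_distance) in field
    periods for a P-picture field, considering only earlier P-picture fields.

    p_interval is the number of frames between P pictures (3 for IBBP).
    """
    other = "bottom" if target_parity == "top" else "top"
    t = field_time(target_frame, target_parity)

    # Closest same-parity field: the same field of the previous P picture.
    same = t - field_time(target_frame - p_interval, target_parity)

    # Closest different-parity field: the other field of the same frame if
    # it precedes the target in time, otherwise that of the previous P picture.
    candidates = [field_time(target_frame, other),
                  field_time(target_frame - p_interval, other)]
    diff = min(t - c for c in candidates if c < t)
    return same, diff


bottom_case = closest_reference_distances(3, "bottom")  # (6, 1)
top_case = closest_reference_distances(3, "top")        # (6, 5)
```

Under these assumptions the bottom-field target is six field periods from its closest same-parity image and one field period from its closest different-parity image, matching the description; for a top-field target the figures are six and five, so the different-parity image is still the closer one.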
The problem described above arises not only between a scene without motion and a scene with motion, but also, in the same manner, between an area without motion (a quiescence area) and an area with motion (a moving area) within one picture. That is, as depicted in FIG. 11, in a quiescence area 21, the closest image identical in parity is likely to be selected as the reference image, and in a moving area 22, the closest image differing in parity is likely to be selected as the reference image. Therefore, in the moving area 22, P8×8ref0 or P_SKIP mentioned above is not selected. In this manner, in the conventional encoding technology, the macro block type configured to perform encoding without including the reference index is rarely selected, and the encoding efficiency therefore decreases.