The present technology relates to an encoding apparatus, an encoding method, a decoding apparatus, and a decoding method, and more specifically to an encoding apparatus, an encoding method, a decoding apparatus, and a decoding method, which are configured to reduce the amount of information concerning information specifying a reference picture.
In recent years, apparatuses that handle image information digitally have become increasingly prevalent, both for distributing information from broadcast stations and the like and for receiving information in general households. For efficient transmission and accumulation of the information, these apparatuses comply with Moving Picture Experts Group (MPEG) or a similar scheme that compresses the image information by exploiting redundancy specific to the image information, using an orthogonal transform such as a discrete cosine transform (DCT) together with motion compensation.
Particularly, MPEG-2 (International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) 13818-2) is defined as a general-purpose image coding scheme, and is a standard that covers both interlaced scanned images and progressive scanned images, as well as both standard-definition images and high-definition images. MPEG-2 is currently used in a wide range of applications for both professional and consumer uses. The MPEG-2 scheme allows a high compression ratio and satisfactory image quality by, for example, allocating a code amount (bit rate) of 4 to 8 Mbps to a standard-resolution interlaced scanned image having 720×480 pixels or 18 to 22 Mbps to a high-resolution interlaced scanned image having 1920×1088 pixels.
MPEG-2, which is mainly used for high-quality coding suitable for broadcasting, does not support coding schemes with a lower code amount (bit rate), or a higher compression ratio, than MPEG-1. With the widespread use of mobile terminals, the demand for such coding schemes is expected to increase, and the standardization of the MPEG-4 coding scheme has been initiated accordingly. The MPEG-4 image coding scheme standard, ISO/IEC 14496-2, was accepted as an international standard in December 1998.
In addition, standardization of H.26L (International Telecommunication Union Telecommunication Standardization Sector (ITU-T) Q6/16 Video Coding Experts Group (VCEG)), initially for the purpose of image coding for videoconferences, has progressed in recent years. In general, H.26L provides a higher coding efficiency than existing coding schemes such as MPEG-2 and MPEG-4, although it involves a larger amount of computation for encoding and decoding.
Additionally, a standard based on H.26L that incorporates functionality not supported by H.26L and provides a higher coding efficiency was developed as part of the MPEG-4 standardization activity under the name Joint Model of Enhanced-Compression Video Coding. This standard was internationally standardized in March 2003 under the names H.264 and MPEG-4 Part 10 (Advanced Video Coding (AVC)).
In addition, standardization of an extension of the standard described above, named Fidelity Range Extension (FRExt), was completed in February 2005. FRExt includes coding tools for business use, such as RGB, 4:2:2, and 4:4:4, as well as the 8×8 DCT and quantization matrices defined by MPEG-2. Accordingly, AVC can be used as a coding scheme that reproduces well even the film noise included in movies, and has come to be used in a wide range of applications such as Blu-ray Disc (registered trademark).
However, there has recently been an increasing demand for a further increase in the compression ratio used in coding, such as a demand for compression of images having 4000×2000 pixels, which is four times the number of pixels of high-definition images, or a demand for distribution of high-definition images in an environment with limited transmission capacity, such as the Internet. To this end, the VCEG under ITU-T is continuing to study enhancement of coding efficiency.
In the High Efficiency Video Coding (HEVC) scheme, a sequence parameter set (SPS) includes a short-term reference picture set (hereinafter referred to as an “RPS”), which is used by a decoding apparatus to identify reference picture specification information specifying a reference picture (see, for example, Benjamin Bross, Woo-Jin Han, Jens-Rainer Ohm, Gary J. Sullivan, Thomas Wiegand, “High efficiency video coding (HEVC) text specification draft 7”, JCTVC-I1003_d4, 2012.4.27-5.7).
FIG. 1 is a diagram illustrating an example of the syntax of an RPS.
As given in the second line in FIG. 1, an RPS includes inter_ref_pic_set_prediction_flag. The inter_ref_pic_set_prediction_flag is reference information indicating whether the reference picture specification information specifying a reference picture of a preceding picture is to be used as the reference picture specification information specifying a reference picture of the picture being encoded. Here, the preceding picture is a picture that precedes the picture being encoded in encoding order within a group of pictures (GOP).
The inter_ref_pic_set_prediction_flag is equal to 1 if the reference picture specification information specifying the reference picture of the preceding picture is to be used as the reference picture specification information specifying the reference picture of the picture being encoded. The inter_ref_pic_set_prediction_flag is equal to 0 if the reference picture specification information specifying the reference picture of the preceding picture is not to be used as the reference picture specification information specifying the reference picture of the picture being encoded.
As given in the third and fourth lines in FIG. 1, if the inter_ref_pic_set_prediction_flag is equal to 1, the RPS includes delta_idx_minus1, which is preceding picture specification information specifying a preceding picture. Specifically, the delta_idx_minus1 is the value obtained by subtracting the coding number of the preceding picture from the coding number (coding order) of the picture being encoded, and then subtracting 1 from the result. The term “coding number”, as used herein, refers to a number assigned to each of the pictures in the GOP in encoding order, starting from the smallest value.
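The arithmetic described above can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of the HEVC syntax; the sketch only assumes that coding numbers are integers assigned in encoding order, as the text describes.

```python
def encode_delta_idx_minus1(current_coding_number: int,
                            preceding_coding_number: int) -> int:
    """Return delta_idx_minus1 for a given preceding picture.

    Follows the description in the text: subtract the coding number of the
    preceding picture from that of the picture being encoded, then subtract 1.
    """
    delta = current_coding_number - preceding_coding_number
    if delta < 1:
        # A preceding picture must come earlier in encoding order.
        raise ValueError("preceding picture must precede the current picture")
    return delta - 1


def decode_preceding_coding_number(current_coding_number: int,
                                   delta_idx_minus1: int) -> int:
    """Invert the mapping and recover the coding number of the preceding picture."""
    return current_coding_number - (delta_idx_minus1 + 1)
```

Under these assumptions, a picture with coding number 5 that refers to the picture with coding number 2 would carry a delta_idx_minus1 of 2, and a decoder applying the inverse mapping would recover the coding number 2.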
Further, as given in the thirteenth to twenty-third lines in FIG. 1, if the inter_ref_pic_set_prediction_flag is equal to 0, the RPS includes reference picture specification information.
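The conditional structure of the RPS described above might be sketched in Python as follows, from the point of view of a decoder. This is an illustration only: the `Rps` class, the `resolve_ref_info` function, and the representation of reference picture specification information as a tuple of integers are all assumptions made for the example, and following a chain of preceding pictures is likewise an assumption rather than something stated in the text.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical stand-in for reference picture specification information.
RefInfo = Tuple[int, ...]


@dataclass
class Rps:
    inter_ref_pic_set_prediction_flag: int
    delta_idx_minus1: Optional[int] = None  # present only when the flag is 1
    ref_info: Optional[RefInfo] = None      # present only when the flag is 0


def resolve_ref_info(rps_by_coding_number: Dict[int, Rps],
                     coding_number: int) -> RefInfo:
    """Return the reference picture specification information for a picture."""
    rps = rps_by_coding_number[coding_number]
    if rps.inter_ref_pic_set_prediction_flag == 1:
        if rps.delta_idx_minus1 is None:
            raise ValueError("delta_idx_minus1 is required when the flag is 1")
        # Locate the preceding picture and reuse its information.
        preceding = coding_number - (rps.delta_idx_minus1 + 1)
        return resolve_ref_info(rps_by_coding_number, preceding)
    if rps.ref_info is None:
        raise ValueError("explicit information is required when the flag is 0")
    return rps.ref_info
```

In this sketch, an RPS with the flag equal to 1 carries only delta_idx_minus1, while an RPS with the flag equal to 0 carries the information explicitly.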
FIG. 2 is a diagram illustrating an example of inter_ref_pic_set_prediction_flag and delta_idx_minus1.
In the example in FIG. 2, the reference picture specification information for the picture being encoded that is assigned the coding number N is identical to the reference picture specification information for the preceding picture that is assigned the coding number N−1 and immediately precedes the picture being encoded in encoding order.
In this case, the inter_ref_pic_set_prediction_flag is set to 1, which indicates that the reference picture specification information for the preceding picture is to be used as the reference picture specification information for the picture being encoded. Further, the delta_idx_minus1 is set to 0: subtracting the coding number N−1 of the preceding picture from the coding number N of the picture being encoded gives 1, and subtracting 1 from that value gives 0.
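The FIG. 2 case can be reproduced with a short encoder-side sketch in Python. The function `choose_rps_fields` and the tuple used to represent reference picture specification information are hypothetical; the sketch only assumes that an encoder compares the information of the picture being encoded with that of one candidate preceding picture.

```python
from typing import Optional, Tuple


def choose_rps_fields(current_number: int,
                      current_info: Tuple[int, ...],
                      preceding_number: int,
                      preceding_info: Tuple[int, ...]) -> Tuple[int, Optional[int]]:
    """Return (inter_ref_pic_set_prediction_flag, delta_idx_minus1)."""
    if current_info == preceding_info:
        # Identical information: signal reuse and point at the preceding picture.
        return 1, (current_number - preceding_number) - 1
    # Different information: it would have to be written explicitly instead.
    return 0, None


# FIG. 2 case with an arbitrary coding number N chosen for illustration.
N = 10
info = (-1, -2)  # placeholder contents, identical for both pictures
flag, delta_idx_minus1 = choose_rps_fields(N, info, N - 1, info)
```

With identical information for the pictures numbered N and N−1, the sketch yields a flag of 1 and a delta_idx_minus1 of 0, matching the values given for FIG. 2.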