In recent years, image information has come to be handled as digital data. To transmit and store that information efficiently, devices conforming to systems such as MPEG (Moving Picture Experts Group), which compress images by exploiting the redundancy unique to image information through an orthogonal transform such as a discrete cosine transform and through motion compensation, have come into wide use both for information delivery by broadcasting stations and the like and for information reception in ordinary households.
Particularly, the MPEG2 (ISO (International Organization for Standardization)/IEC (International Electrotechnical Commission) 13818-2) system is defined as a general-purpose image coding system. It is a standard covering both interlaced scanning images and sequential scanning images, as well as both standard resolution images and high definition images, and is currently used in a broad range of professional and consumer applications. With the MPEG2 compression system, a high compression rate and good image quality can be realized by allocating, for example, a code amount (bit rate) of 4 to 8 Mbps to an interlaced scanning image of a standard resolution of 720×480 pixels and a code amount (bit rate) of 18 to 22 Mbps to an interlaced scanning image of high definition of 1920×1088 pixels.
MPEG2 mainly targets high image quality coding suitable for broadcasting and does not support a code amount (bit rate) lower than that of MPEG1, in other words, coding at a higher compression rate. With the spread of mobile terminals, demand for such a coding system is expected to grow, and the MPEG4 coding system has been standardized in response. Its image coding specification was approved as the international standard ISO/IEC 14496-2 in December 1998.
Furthermore, in recent years, H.26L (ITU-T (International Telecommunication Union Telecommunication Standardization Sector) Q6/16 VCEG (Video Coding Experts Group)), originally intended for image coding for television conferences, has been standardized. H.26L is known to require a larger amount of calculation for its coding and decoding processes than conventional coding systems such as MPEG2 and MPEG4, but to realize higher coding efficiency. In addition, as part of the MPEG4 activities, a standard that builds on H.26L and realizes still higher coding efficiency by introducing functions not supported by H.26L has been developed as the Joint Model of Enhanced-Compression Video Coding.
This work became an international standard in March 2003 under the names H.264 and MPEG-4 Part 10 (Advanced Video Coding; hereinafter referred to as AVC).
However, there is concern that a macroblock size of 16 pixels×16 pixels is not optimal for a large picture frame such as UHD (Ultra High Definition: 4000 pixels×2000 pixels), which is a target of next-generation coding systems.
Thus, to improve coding efficiency beyond that of the AVC, standardization of a coding system called HEVC (High Efficiency Video Coding) is being advanced by the JCTVC (Joint Collaborative Team on Video Coding), a joint standardization organization of the ITU-T and the ISO/IEC (for example, see Non-Patent Document 1).
In this HEVC coding system, a coding unit (CU) is defined as a processing unit like a macroblock in the AVC. Unlike the macroblock of the AVC, the CU is not fixed to the size of 16×16 pixels but is designated in image compression information in each sequence.
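As a rough illustration of a CU size that is designated per sequence rather than fixed at 16×16, the following sketch derives the set of permitted CU sizes from two sequence-level values. The function name and both parameters (a log2 minimum size and a log2 difference between the maximum and minimum size) are illustrative assumptions and are not syntax taken from this document.

```python
from typing import List


def allowed_cu_sizes(log2_min_cu_size: int, log2_diff_max_min: int) -> List[int]:
    """Return the permitted square CU edge lengths, largest first.

    Both arguments are hypothetical sequence-level values: the log2 of the
    smallest CU edge, and how many times that edge may be doubled to reach
    the largest CU edge.
    """
    if log2_min_cu_size < 0 or log2_diff_max_min < 0:
        raise ValueError("log2 values must be non-negative")
    log2_max_cu_size = log2_min_cu_size + log2_diff_max_min
    # Each level halves the edge length of the level above it.
    return [1 << n for n in range(log2_max_cu_size, log2_min_cu_size - 1, -1)]


# A sequence that signals a minimum edge of 8 (log2 = 3) and three doublings.
print(allowed_cu_sizes(3, 3))
```

With these example values the largest CU has an edge of 64 pixels, so a single unit covers sixteen times the area of a 16×16 macroblock.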
Incidentally, in order to improve the coding of a motion vector using the median prediction defined in the AVC, a method has been considered in which not only a “Spatial Predictor” but also a “Temporal Predictor” and a “Spatio-Temporal Predictor” can be candidates for predicted motion vectors.
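The median prediction referred to above can be sketched as a component-wise median of three neighboring motion vectors, with only the difference from that predictor being coded. This is a minimal sketch: the choice of neighbors (left, top, and top-right blocks), the function names, and the omission of special cases such as unavailable neighbors are assumptions made for illustration.

```python
from typing import Tuple

MotionVector = Tuple[int, int]  # (horizontal, vertical) components


def median3(a: int, b: int, c: int) -> int:
    """Median of three integers without sorting."""
    return max(min(a, b), min(max(a, b), c))


def median_mv_predictor(mv_left: MotionVector,
                        mv_top: MotionVector,
                        mv_top_right: MotionVector) -> MotionVector:
    """Component-wise median of the three neighboring motion vectors."""
    return (median3(mv_left[0], mv_top[0], mv_top_right[0]),
            median3(mv_left[1], mv_top[1], mv_top_right[1]))


def mv_difference(mv: MotionVector, predictor: MotionVector) -> MotionVector:
    """Difference that would be coded instead of the motion vector itself."""
    return (mv[0] - predictor[0], mv[1] - predictor[1])


pmv = median_mv_predictor((4, -2), (6, 0), (-1, 3))
mvd = mv_difference((5, 1), pmv)
print(pmv, mvd)
```

Because the median is taken per component, the predictor need not equal any single neighboring vector.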
In addition, as one coding system for motion information, a technique called motion partition merging, in which Merge_Flag and Merge_Left_Flag are transmitted, has been proposed.
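How a decoder might use the two flags can be sketched as follows. The semantics here are an assumption, since the text above states only that the flags are transmitted: Merge_Flag is taken to mean that the current region reuses the motion information of a neighboring region, and Merge_Left_Flag is taken to select the left neighbor rather than the top one. The `MotionInfo` type and the `resolve_motion_info` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MotionInfo:
    mv: Tuple[int, int]  # motion vector (horizontal, vertical)
    ref_idx: int         # reference index


def resolve_motion_info(merge_flag: bool,
                        merge_left_flag: bool,
                        left: MotionInfo,
                        top: MotionInfo,
                        explicit: MotionInfo) -> MotionInfo:
    """Pick the motion information for the current region.

    Assumed semantics: when merge_flag is set, the region inherits motion
    information from a neighbor and merge_left_flag chooses left over top.
    Otherwise the explicitly transmitted motion information is used.
    """
    if not merge_flag:
        return explicit
    return left if merge_left_flag else top


left = MotionInfo(mv=(2, 0), ref_idx=0)
top = MotionInfo(mv=(-1, 3), ref_idx=1)
own = MotionInfo(mv=(5, 5), ref_idx=0)
print(resolve_motion_info(True, False, left, top, own))
```

Under these assumed semantics, a merged region costs at most two flags instead of a full motion vector and reference index.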
However, these techniques address only processing within the same viewpoint. In the case of multi-viewpoint coding, a vector cannot be predicted across viewpoints, and there is concern that the coding efficiency decreases.
Thus, various proposals have been made for TMVP (temporal motion vector prediction) in merging at the time of multi-viewpoint coding (for example, see Non-Patent Document 2).
In the invention disclosed in Non-Patent Document 2, when the reference picture type of the reference picture (reference image) indicated by reference index 0 of the current block is Short-term and the reference picture type of the collocated block is Long-term, a reference index other than “0” that indicates a reference picture whose reference picture type is Long-term is selected from the list of reference images.
In addition, when the reference picture type of the reference picture indicated by reference index 0 of the current block is Long-term and the reference picture type of the collocated block is Short-term, a reference index other than “0” that indicates a reference picture whose reference picture type is Short-term is selected from the list of reference images.
Accordingly, before coding of the CU (Coding Unit) level is performed, one reference index having a picture type different from the picture type of the reference index 0 needs to be found.
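The selection rule described above, and the search it requires, can be sketched as follows. This is a sketch under stated assumptions rather than the method of Non-Patent Document 2: the function name and the string constants are hypothetical, returning index 0 when the two types already match is an assumption (the text describes only the mismatch cases), and taking the lowest matching index is also an assumption because the text does not say which index is chosen when several qualify.

```python
from typing import Optional, Sequence

SHORT_TERM = "short-term"
LONG_TERM = "long-term"


def select_tmvp_ref_idx(ref_list_types: Sequence[str],
                        collocated_type: str) -> Optional[int]:
    """Choose a reference index whose picture type matches the collocated block.

    ref_list_types[i] is the reference picture type of the picture indicated
    by reference index i in the list of reference images.
    """
    if not ref_list_types:
        return None
    # Assumption: index 0 is kept when its type already matches.
    if ref_list_types[0] == collocated_type:
        return 0
    # Mismatch: search the indices other than 0 for a picture of the
    # collocated block's type (lowest matching index, by assumption).
    for idx in range(1, len(ref_list_types)):
        if ref_list_types[idx] == collocated_type:
            return idx
    return None  # no reference picture of the required type exists


ref_types = [SHORT_TERM, SHORT_TERM, LONG_TERM]
print(select_tmvp_ref_idx(ref_types, LONG_TERM))
```

The loop over the reference list is the search that must be completed before CU-level coding can start.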