Images and moving images are examples of information that is transmitted in a communication system or recorded in a storage device. In the related art, technologies for encoding an image (hereinafter, including a moving image) in order to transmit and store it are known.
As moving image coding schemes, H.264/MPEG-4 AVC and its successor codec, High-Efficiency Video Coding (HEVC), are known (NPL 1).
In the moving image coding schemes, normally, a predicted image is generated based on a local decoded image obtained by encoding/decoding an input image, and a prediction residual (also referred to as “a difference image” or “a residual image”) obtained by subtracting the predicted image from the input image (an original image) is encoded. Further, examples of a predicted image generation method include an inter-frame prediction (inter-prediction) and an intra-frame prediction (intra-prediction).
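The relationship between the input image, the predicted image, and the prediction residual can be illustrated with a minimal sketch. The function names and the 2x2 sample values below are hypothetical and chosen only for illustration; an actual codec would additionally transform, quantize, and entropy-code the residual.

```python
def prediction_residual(original, predicted):
    """Return the sample-wise difference (original - predicted)."""
    return [[o - p for o, p in zip(orow, prow)]
            for orow, prow in zip(original, predicted)]


def reconstruct(predicted, residual):
    """Decoder side: add the residual back onto the predicted image."""
    return [[p + r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(predicted, residual)]


# Hypothetical 2x2 luma blocks (illustrative values only).
original = [[52, 55], [61, 59]]
predicted = [[50, 54], [60, 60]]

residual = prediction_residual(original, predicted)
restored = reconstruct(predicted, residual)
```

Because the residual is not quantized in this sketch, adding it back to the predicted image restores the original block exactly.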
In the intra-prediction, a predicted image of a frame is sequentially generated, based on the local decoded image in the same frame.
In the inter-prediction, a predicted image is generated by inter-frame motion compensation. In most cases, information regarding the motion compensation (a motion compensation parameter) is not directly encoded, in order to reduce the code amount. Instead, in the inter-prediction, the motion compensation parameter is estimated based on the decoding situation around the target block.
For example, in HEVC, a list of motion compensation parameter candidates (merge candidates) is generated for a prediction unit in merge mode, and motion compensation of the predicted image is performed by using the merge candidate that is selected from the list by an index. The list of merge candidates includes spatial candidates, which are derived based on the motion information of adjacent regions. During the derivation of the spatial candidates, the adjacent regions are selected from the regions located to the upper left, the upper right, and the lower left of the prediction unit that is the decoding target.
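The merge-mode selection described above can be sketched as follows. This is a simplified illustration, not the HEVC derivation process itself: the `MotionInfo` type, the `build_merge_candidates` function, the neighbor order, and the limit of five candidates are assumptions made for the example, and temporal candidates and the standard's exact pruning rules are omitted.

```python
from typing import NamedTuple, Optional, Sequence


class MotionInfo(NamedTuple):
    """Hypothetical motion compensation parameter: vector plus reference index."""
    mv: tuple      # (dx, dy)
    ref_idx: int


def build_merge_candidates(neighbors: Sequence[Optional[MotionInfo]],
                           max_candidates: int = 5) -> list:
    """Collect available, non-duplicate motion info from adjacent regions."""
    candidates = []
    for info in neighbors:
        if info is None:          # region unavailable or not inter-predicted
            continue
        if info in candidates:    # skip duplicates of earlier candidates
            continue
        candidates.append(info)
        if len(candidates) == max_candidates:
            break
    return candidates


# Motion info of hypothetical adjacent regions (None = unavailable).
neighbors = [MotionInfo((4, 0), 0), None,
             MotionInfo((4, 0), 0), MotionInfo((-2, 1), 1)]

merge_list = build_merge_candidates(neighbors)
merge_idx = 1                     # index that would be signaled in the stream
selected = merge_list[merge_idx]  # reused as-is for motion compensation
```

Only the index is transmitted; the decoder rebuilds the same list from already decoded neighbors and looks up the selected candidate.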
Meanwhile, in a prediction unit that is not in merge mode, motion compensation is performed by generating a list of motion compensation parameter candidates (prediction motion vector candidates) and deriving the motion compensation parameter from a difference motion vector and the candidate that is selected from the list by an index.
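For the non-merge case, the derivation amounts to adding the difference motion vector to the selected prediction motion vector. The sketch below assumes a hypothetical function name and illustrative vector values; it does not reproduce how the candidate list itself is constructed.

```python
def derive_motion_vector(mvp_candidates, mvp_idx, mvd):
    """Return the selected prediction motion vector plus the difference vector."""
    pred_x, pred_y = mvp_candidates[mvp_idx]
    diff_x, diff_y = mvd
    return (pred_x + diff_x, pred_y + diff_y)


# Hypothetical prediction motion vector candidates, index, and difference.
mvp_candidates = [(4, 0), (-2, 1)]
mv = derive_motion_vector(mvp_candidates, mvp_idx=1, mvd=(3, -1))
```

Here both the index and the difference motion vector would be encoded, in contrast to merge mode, where the index alone suffices.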
Further, in recent years, a hierarchical encoding technology for hierarchically encoding an image according to a required data rate has been proposed.
Examples of the hierarchical encoding method include H.264/AVC Annex G Scalable Video Coding (SVC), which is a standard of ISO/IEC and ITU-T.
The SVC supports spatial scalability, temporal scalability, and SNR scalability. For example, in a case of the spatial scalability, an image obtained by down-sampling an original image to a desired resolution is encoded as a lower layer in the H.264/AVC. Next, the inter-layer prediction is performed in an upper layer in order to remove redundancy between layers.
Examples of the inter-layer prediction include motion information prediction, which predicts information regarding the motion prediction from information of the lower layer at the same time, and texture prediction, which performs prediction from an image obtained by up-sampling a decoded image of the lower layer at the same time (NPL 2). In the motion information prediction, motion information is encoded with the motion information of a reference layer used as an estimation value.
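Both kinds of inter-layer prediction can be illustrated with a minimal sketch under stated assumptions: a 2:1 spatial ratio between layers, 2x2 averaging for down-sampling, nearest-neighbor up-sampling, and a lower layer that is decoded without loss. These filters and all function names are hypothetical simplifications; SVC specifies its own resampling filters.

```python
def downsample_2x(img):
    """Average each 2x2 block (integer division) to form the lower layer."""
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) // 4
             for x in range(0, len(img[0]), 2)]
            for y in range(0, len(img), 2)]


def upsample_2x(img):
    """Nearest-neighbor up-sampling: repeat every sample twice in x and y."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def scale_base_layer_mv(mv, ratio=2):
    """Scale a lower-layer motion vector to serve as an upper-layer estimate."""
    return (mv[0] * ratio, mv[1] * ratio)


original = [[10, 12, 20, 22],
            [14, 16, 24, 26],
            [30, 30, 40, 44],
            [30, 30, 40, 44]]

base = downsample_2x(original)        # lower layer (assumed decoded losslessly)
texture_pred = upsample_2x(base)      # inter-layer texture prediction
residual = [[o - p for o, p in zip(orow, prow)]
            for orow, prow in zip(original, texture_pred)]
mv_estimate = scale_base_layer_mv((3, -2))
```

The upper layer then needs to encode only the residual relative to the up-sampled lower layer, and only the difference between its actual motion vector and the scaled estimate.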