In image coding techniques for multiple-viewpoint images, disparity prediction coding, which reduces the amount of information by predicting the disparity between images when the multiple-viewpoint images are coded, and decoding methods corresponding to such coding methods have been proposed. A vector indicating the disparity between viewpoint images is referred to as a disparity vector. A disparity vector is a 2-dimensional vector that has a component in the horizontal direction (x component) and a component in the vertical direction (y component), and is calculated for each block, which is a region obtained by splitting one image. Multiple-viewpoint images are generally acquired using cameras disposed at the respective viewpoints. In coding of multiple-viewpoint images, the viewpoint images are coded as different layers among a plurality of layers. A coding method for a moving image formed of a plurality of layers is generally referred to as scalable coding or hierarchical coding. In scalable coding, high coding efficiency is realized by executing prediction between layers. A layer that serves as a reference and is itself coded without prediction between layers is referred to as a base layer, and the other layers are referred to as enhancement layers. In a case where the layers are formed from viewpoint images, scalable coding is referred to as view scalable coding. In this case, the base layer is also referred to as a base view and an enhancement layer is also referred to as a non-base view. Further, in addition to view scalable coding, scalable coding in which the layers are formed from a texture layer (image layer) and a depth layer (distance image layer) is referred to as 3-dimensional scalable coding.
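As an illustration of how a disparity vector may be obtained for each block, the following is a minimal sketch that assumes a simple block-matching search using the sum of absolute differences (SAD). The function names, the search range parameters, and the use of SAD are assumptions made for this example and are not taken from any cited technique; pictures are represented as lists of rows of integer sample values.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the block at (bx, by) in `cur`
    and the block displaced by (dx, dy) in `ref`."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def estimate_disparity(cur, ref, bx, by, size, search_x, search_y=0):
    """Return ((dx, dy), cost) minimising SAD for the block at (bx, by).

    `search_x` and `search_y` are hypothetical search ranges in samples.
    """
    height, width = len(ref), len(ref[0])
    best, best_cost = None, None
    for dy in range(-search_y, search_y + 1):
        for dx in range(-search_x, search_x + 1):
            # Skip candidates whose reference block falls outside the picture.
            if not (0 <= bx + dx and bx + dx + size <= width):
                continue
            if not (0 <= by + dy and by + dy + size <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost
```

For cameras arranged side by side, the vertical search range can often be kept small or zero, which is why `search_y` defaults to 0 in this sketch.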
In addition to view scalable coding, scalable coding includes spatial scalable coding (in which a picture with a low resolution is processed as a base layer and a picture with a high resolution is processed as an enhancement layer), SNR scalable coding (in which a picture with low image quality is processed as a base layer and a picture with high image quality is processed as an enhancement layer), and the like. In scalable coding, for example, a picture of the base layer is used in some cases as a reference picture in coding of a picture of an enhancement layer.
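The use of a base layer picture as a reference picture can be sketched as follows for the spatial scalable case. This is a simplified illustration only: it assumes nearest-neighbour upsampling by an integer factor, whereas practical codecs use interpolation filters, and the function names are hypothetical.

```python
def upsample_nearest(base, factor):
    """Upsample a base layer picture by an integer factor (nearest neighbour)."""
    out_h = len(base) * factor
    out_w = len(base[0]) * factor
    return [[base[y // factor][x // factor] for x in range(out_w)]
            for y in range(out_h)]


def inter_layer_residual(enh, base, factor):
    """Residual of an enhancement layer picture predicted from the
    upsampled base layer picture."""
    ref = upsample_nearest(base, factor)
    return [[e - r for e, r in zip(enh_row, ref_row)]
            for enh_row, ref_row in zip(enh, ref)]
```

Only the residual, which is typically small when the layers are strongly correlated, would then need to be coded for the enhancement layer.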
NPL 1 discloses a technique called viewpoint synthesis prediction, in which a predicted image with high precision is obtained by splitting a prediction unit into small sub-blocks and executing prediction using a disparity vector for each sub-block. NPL 1 also discloses a technique called residual prediction, in which a residual is predicted using an image of a view different from a target view and the predicted residual is added. NPL 1 further discloses a technique for deriving enhancement merge candidates such as inter-view merge candidates.
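The sub-block prediction described above can be sketched as follows. This is not the procedure of NPL 1: it assumes that a disparity vector for each sub-block is already available through a caller-supplied function (`disparity_of`, a hypothetical name), and it clamps displaced sample positions to the reference picture boundary as a simplifying assumption.

```python
def predict_with_subblock_disparity(ref, bx, by, width, height,
                                    sub_w, sub_h, disparity_of):
    """Build a width x height predicted block at (bx, by) from the
    reference view picture `ref`, using one disparity vector per sub-block.

    `disparity_of(i, j)` returns (dx, dy) for the sub-block in column i, row j.
    """
    ref_h, ref_w = len(ref), len(ref[0])
    pred = [[0] * width for _ in range(height)]
    for sy in range(0, height, sub_h):
        for sx in range(0, width, sub_w):
            dx, dy = disparity_of(sx // sub_w, sy // sub_h)
            for y in range(sy, min(sy + sub_h, height)):
                for x in range(sx, min(sx + sub_w, width)):
                    # Clamp so displaced samples stay inside the reference picture.
                    rx = min(max(bx + x + dx, 0), ref_w - 1)
                    ry = min(max(by + y + dy, 0), ref_h - 1)
                    pred[y][x] = ref[ry][rx]
    return pred
```

Using a separate vector for each sub-block allows the predicted image to follow disparity changes inside the prediction unit, for example at an object boundary, which a single vector for the whole unit cannot represent.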