One standard for encoding a plurality of viewpoint images, such as 3D (three-dimensional) images, is, for example, the MVC (Multiview Video Coding) standard, which is obtained by extending the AVC (Advanced Video Coding) (H.264/AVC) standard.
According to the MVC standard, an encoding target image is a color image whose pixel values correspond to light from a subject, and each of a plurality of viewpoint color images is encoded by referring, where necessary, to other viewpoint color images in addition to the viewpoint color image itself.
That is, according to the MVC standard, one viewpoint color image among a plurality of viewpoint color images is a base view image, and the other viewpoint color images are dependent view images.
Further, the base view color image is encoded by referring only to the base view image itself, and the dependent view color image is encoded by referring, where necessary, to other view images in addition to the dependent view image itself.
That is, the dependent view color image is subjected to disparity prediction, which generates a predicted image by referring to other view color images where necessary, and is encoded using this predicted image.
Hereinafter, a given viewpoint #1 color image is assumed to be the base view image, and another viewpoint #2 color image is assumed to be a dependent view image.
According to the MVC standard, when the viewpoint #2 color image is subjected to disparity prediction referring to the viewpoint #1 color image and is encoded (subjected to predictive encoding) using a predicted image obtained by this disparity prediction, a disparity vector is detected. This disparity vector represents the disparity, with respect to the viewpoint #1 color image, of an encoding target of the viewpoint #2 color image, that is, for example, a target block such as a macroblock.
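As a rough illustration of how a disparity vector of a target block might be detected, the following minimal sketch performs an exhaustive block-matching search that minimizes the sum of absolute differences (SAD). The search strategy, the SAD cost and the names `sad`, `get_block` and `detect_disparity_vector` are assumptions made for illustration only; the MVC standard does not prescribe how an encoder detects the vector.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def get_block(image, x, y, size):
    """Extract the size x size block whose top-left pixel is (x, y)."""
    return [row[x:x + size] for row in image[y:y + size]]


def detect_disparity_vector(target_img, ref_img, x, y, size, search_range):
    """Return the (dx, dy) shift into ref_img with the smallest SAD.

    target_img: viewpoint #2 (dependent view) picture, list of pixel rows
    ref_img:    viewpoint #1 (base view) picture, same dimensions
    (hypothetical data layout chosen for this sketch)
    """
    height, width = len(ref_img), len(ref_img[0])
    target = get_block(target_img, x, y, size)
    best_vector, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = x + dx, y + dy
            # Skip candidates that fall outside the reference picture.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(target, get_block(ref_img, rx, ry, size))
            if best_cost is None or cost < best_cost:
                best_cost, best_vector = cost, (dx, dy)
    return best_vector
```

For two views captured by horizontally aligned cameras, the detected vector would typically be dominated by its horizontal component.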
Further, according to the MVC standard, a predicted motion vector obtained by predicting a disparity vector of a target block is calculated, and a residual vector which is a difference between the disparity vector and the predicted motion vector is encoded.
According to the MVC standard, the bit rate of the residual vector tends to be higher when the residual vector is larger. Therefore, when the magnitude of the residual vector is smaller, that is, when the prediction precision of the predicted motion vector is better (the predicted motion vector is more similar to the disparity vector), encoding efficiency can be improved.
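The relationship between the size of the residual vector and its bit cost can be sketched as follows. The sketch assumes that each component of the residual vector is coded with a signed Exp-Golomb code, as is done for motion vector differences in the CAVLC entropy coding mode of H.264/AVC; with CABAC the exact cost differs, so the numbers are only an illustrative cost model. The function names are hypothetical.

```python
def residual_vector(disparity, predicted):
    """Residual vector = disparity vector - predicted motion vector."""
    return (disparity[0] - predicted[0], disparity[1] - predicted[1])


def signed_exp_golomb_bits(value):
    """Length in bits of the signed Exp-Golomb codeword for an integer."""
    # Map the signed value to an unsigned code number:
    # positive v -> 2v - 1, zero or negative v -> -2v.
    code_num = 2 * value - 1 if value > 0 else -2 * value
    # An unsigned Exp-Golomb codeword has 2 * floor(log2(code_num + 1)) + 1 bits.
    return 2 * (code_num + 1).bit_length() - 1


def residual_bits(residual):
    """Total bits for both components of a residual vector (assumed model)."""
    return sum(signed_exp_golomb_bits(c) for c in residual)


disparity = (12, 0)
good = residual_vector(disparity, (11, 0))  # accurate predictor
poor = residual_vector(disparity, (2, 0))   # inaccurate predictor
print(good, residual_bits(good))  # (1, 0) 4
print(poor, residual_bits(poor))  # (10, 0) 10
```

Under this assumed model, the accurate predictor leaves a residual that costs 4 bits, while the inaccurate one costs 10 bits for the same disparity vector.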
In recent years, an encoding standard such as the MPEG3DV standard has been defined that adopts, as a plurality of viewpoint images, not only each viewpoint color image but also a disparity information image (depth image) whose pixel values are disparity information related to the disparity of each pixel of each viewpoint color image, and that encodes each viewpoint color image and each viewpoint disparity information image.
According to the MPEG3DV standard, each viewpoint color image and each viewpoint disparity information image are, in principle, encoded in the same way as in the MVC standard.
According to the MVC standard, a predicted motion vector (of a disparity vector) of a target block is calculated from the disparity vectors of blocks surrounding the target block in a color image. However, when the target block is at a boundary portion of a foreground object, the surrounding blocks of the target block include a foreground block, a background block and a block of a portion in which occlusion occurs, and therefore the prediction precision of the predicted motion vector of the target block calculated from these blocks deteriorates in some cases.
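The boundary problem can be illustrated with a small numerical sketch. It assumes that the predicted motion vector is the component-wise median of the vectors of the left, above and above-right neighboring blocks, which is the usual motion vector prediction rule in H.264/AVC; the vector values and the name `median_predictor` are hypothetical.

```python
def median_predictor(left, above, above_right):
    """Component-wise median of three neighboring disparity vectors."""
    neighbours = (left, above, above_right)
    return tuple(sorted(component)[1] for component in zip(*neighbours))


# Target block lies on a foreground object with a large disparity.
target_disparity = (20, 0)

# Two neighbors belong to the background, one to the foreground.
left = (4, 0)          # background block
above = (5, 0)         # background block
above_right = (19, 0)  # foreground block

predicted = median_predictor(left, above, above_right)
residual = (target_disparity[0] - predicted[0],
            target_disparity[1] - predicted[1])
print(predicted)  # (5, 0)
print(residual)   # (15, 0)
```

In this assumed case the median follows the background neighbors, so the residual vector is large even though one neighbor carries a disparity vector close to that of the target block.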
Hence, a method has been proposed in which one of the disparity vectors of the surrounding blocks of the target block in a color image is selected, based on a disparity information image, as the predicted motion vector of the target block (see, for example, Non-Patent Document 1).
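A selection of this kind might be sketched as follows. The selection criterion used here, namely choosing the surrounding block whose mean value in the disparity information image is closest to that of the target block, is an assumption made for illustration and is not taken from Non-Patent Document 1; the name `select_predictor` and the sample values are likewise hypothetical.

```python
def select_predictor(target_depth, neighbours):
    """Pick the disparity vector of the neighbor most similar in depth.

    target_depth: mean pixel value of the target block in the
                  disparity information image (assumed measure)
    neighbours:   list of (mean_depth, disparity_vector) pairs for
                  the surrounding blocks
    """
    closest = min(neighbours, key=lambda n: abs(n[0] - target_depth))
    return closest[1]


neighbours = [
    (40, (4, 0)),    # background block
    (45, (5, 0)),    # background block
    (195, (19, 0)),  # foreground block
]
print(select_predictor(200, neighbours))  # (19, 0)
```

With these sample values the foreground neighbor is selected, so the predicted motion vector stays close to the disparity vector of a foreground target block.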