In general, an image or a video is one kind of information that is transmitted in a communication system or recorded in a storage device. In the related art, a technology of coding an image (including a video in the following descriptions) in order to transmit or store the image is known.
As video coding methods, AVC (H.264/MPEG-4 Advanced Video Coding) and its successor, High Efficiency Video Coding (HEVC), are known (NPL 1).
In such a video coding method, a predicted image is generally generated based on a locally-decoded image obtained by coding and then decoding an input image. A prediction residual obtained by subtracting the generated predicted image from the input image (original image) is coded. Examples of the generation method of the predicted image include inter-frame prediction (inter-prediction) and intra-frame prediction (intra-prediction).
In the intra-prediction, predicted images in a picture are sequentially generated based on a locally-decoded image in the same picture.
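The relation between the original image, the predicted image, and the prediction residual can be sketched as follows. This is only an illustration: the function names `prediction_residual` and `reconstruct` and the sample values are hypothetical, and the transform and quantization that an actual codec applies to the residual are omitted, so the reconstruction here is lossless.

```python
def prediction_residual(original, predicted):
    """Per-pixel residual: original image minus predicted image."""
    return [[o - p for o, p in zip(o_row, p_row)]
            for o_row, p_row in zip(original, predicted)]


def reconstruct(predicted, residual):
    """Decoder side: predicted image plus decoded residual."""
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(predicted, residual)]


# Hypothetical 2x2 luma samples.
original = [[52, 55], [61, 59]]
predicted = [[50, 54], [60, 60]]

residual = prediction_residual(original, predicted)
decoded = reconstruct(predicted, residual)
```

The better the predicted image matches the original image, the smaller the residual values are, and the fewer bits are needed to code them.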
In the inter-prediction, a predicted image is generated by motion compensation between pictures. A decoded picture which is used when a predicted image is generated is referred to as a reference picture in the inter-prediction.
A technology is also known in which a plurality of mutually associated videos are divided into layers (hierarchy levels) and coded, so that coding data is generated from the plurality of videos. This technology may be referred to as a hierarchy coding technology. Coding data generated by using the hierarchy coding technology may also be referred to as hierarchy coding data.
As a representative hierarchy coding technology, Scalable HEVC (SHVC), which is based on HEVC, is known (NPL 2).
In the SHVC, spatial scalability, temporal scalability, and SNR scalability are supported. For example, in a case of the spatial scalability, a plurality of videos which have different resolutions are divided into layers, and coding is performed so as to generate hierarchy coding data. For example, an image obtained by down-sampling an original image to a desired resolution is coded as a lower layer. Then, the original image is coded as a higher layer with inter-layer prediction applied in order to remove redundancy between the layers.
As another representative hierarchy coding technology, Multi View HEVC (MV-HEVC), which is based on HEVC, is known. In the MV-HEVC, view scalability is supported. In the view scalability, a plurality of videos which respectively correspond to different viewpoints (views) are divided into layers, and coding is performed so as to generate hierarchy coding data. For example, a video corresponding to a viewpoint which is used as a base (base view) is coded as a lower layer. Then, a video corresponding to a different viewpoint is coded as a higher layer with the inter-layer prediction applied.
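The two-layer structure of the spatial scalability can be illustrated by the following minimal sketch. It assumes a fixed 2:1 resolution ratio, a 2x2 averaging filter for down-sampling, and nearest-neighbor up-sampling for the inter-layer prediction. These filters and the names `downsample_2x` and `upsample_2x` are assumptions chosen for brevity and are not the resampling filters defined in the SHVC.

```python
def downsample_2x(img):
    """Lower layer: 2x2 box average with rounding (assumed filter)."""
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1] + 2) // 4
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]


def upsample_2x(img):
    """Inter-layer prediction: nearest-neighbor enlargement (assumed filter)."""
    h, w = len(img), len(img[0])
    return [[img[y // 2][x // 2] for x in range(2 * w)] for y in range(2 * h)]


# Hypothetical 4x4 original image (higher-layer input).
original = [
    [10, 12, 20, 22],
    [14, 16, 24, 26],
    [30, 32, 40, 42],
    [34, 36, 44, 46],
]

base = downsample_2x(original)        # coded as the lower layer
prediction = upsample_2x(base)        # predicted image for the higher layer
residual = [[o - p for o, p in zip(o_row, p_row)]
            for o_row, p_row in zip(original, prediction)]
restored = [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, residual)]
```

Only the small residual needs to be coded for the higher layer, because most of the picture content is already carried by the lower layer.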
The inter-layer prediction in the SHVC or the MV-HEVC includes inter-layer image prediction and inter-layer motion prediction. In the inter-layer image prediction, a predicted image is generated by using a decoded image of a lower layer. In the inter-layer motion prediction, a prediction value of motion information is derived by using the motion information of a lower layer. A picture used for prediction in the inter-layer prediction is referred to as an inter-layer reference picture. A layer including the inter-layer reference picture is referred to as a reference layer. In the following descriptions, a reference picture used in the inter-prediction and a reference picture used in the inter-layer prediction are collectively referred to simply as a reference picture.
The inter-layer image prediction includes reference pixel position deriving processing and scale deriving processing. In the reference pixel position deriving processing, a pixel position on a lower layer which corresponds to the position of a prediction target pixel on a higher layer is derived. In the scale deriving processing, a scale corresponding to the magnification of the enlargement processing applied to a picture of a lower layer is derived.
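The two deriving processes can be sketched in fixed-point arithmetic as follows. This is a simplified illustration and not the normative derivation of the SHVC, which works with sub-sample accuracy and additional offset and phase terms. The 16-bit fractional precision, the rounding to the nearest integer pixel, and the names `derive_scale` and `derive_ref_position` are assumptions.

```python
SHIFT = 16  # assumed number of fractional bits for the scale


def derive_scale(ref_size, cur_size):
    """Scale = lower-layer size / higher-layer size, as rounded fixed point."""
    return ((ref_size << SHIFT) + (cur_size >> 1)) // cur_size


def derive_ref_position(pos, scale):
    """Lower-layer pixel position corresponding to a higher-layer position."""
    return (pos * scale + (1 << (SHIFT - 1))) >> SHIFT


# Hypothetical picture widths: lower layer 960, higher layer 1920.
scale_x = derive_scale(960, 1920)
x_ref_100 = derive_ref_position(100, scale_x)
x_ref_101 = derive_ref_position(101, scale_x)
```

With these values the scale corresponds to 0.5, so the higher-layer position 100 maps to the lower-layer position 50. The same two functions would be applied independently in the vertical direction.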
In the SHVC or the MV-HEVC, any of the inter-prediction, the intra-prediction, and the inter-layer image prediction can be used for generating a predicted image.
As one application of the SHVC or the MV-HEVC, an image application which takes a region of interest (interest region) into account is provided. For example, an image reproduction terminal generally reproduces an image of the entire region at relatively low resolution. In a case where a viewer of the image reproduction terminal designates a portion of the displayed image as an interest region, the designated interest region is displayed at high resolution in the reproduction terminal.
Such an image application considering an interest region can be realized by using hierarchy coding data. The hierarchy coding data is obtained by coding an image which has relatively low resolution over the entire region as coding data of a lower layer, and coding an image which has high resolution in the interest region as coding data of a higher layer. That is, in a case where the entire region is reproduced, only the coding data of the lower layer is decoded and reproduced. In a case where an image which has high resolution in the interest region is reproduced, the coding data of the higher layer is transmitted in addition to the coding data of the lower layer. Thus, the application can be realized in a transmission band which is narrower than that in a case where coding data for a low-resolution image and coding data for a high-resolution image are both transmitted. At this time, by extracting only the coding data corresponding to a region which includes the interest region from the higher layer and the lower layer and transmitting the extracted coding data, the required transmission band can be reduced further.
In such an image application considering an interest region, in a case where coding data which includes the interest region and has a higher layer and a lower layer is generated by extraction, the positional relation between a pixel of the higher layer and a pixel of the lower layer changes. As a result, there is a problem in that prediction accuracy is degraded in a case where a pixel value of the higher layer is predicted based on a pixel value of the lower layer.
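The change in the positional relation can be illustrated numerically. The sketch below assumes that a decoder derives the reference position only from the sizes of the pictures it receives. All picture sizes, the crop positions, and the name `naive_ref_x` are hypothetical, and the lower-layer crop is assumed not to line up exactly with the higher-layer crop (for example, because extraction is performed in units of coding regions).

```python
from fractions import Fraction


def naive_ref_x(x_high, high_width, low_width):
    """Reference position derived from picture widths alone."""
    return Fraction(low_width, high_width) * x_high


# Before extraction: higher layer 1920 wide, lower layer 960 wide.
x_high_full = 400
ref_before = naive_ref_x(x_high_full, 1920, 960)

# After extraction (hypothetical crops):
#   higher layer keeps columns 300..899 (width 600),
#   lower layer keeps columns 128..447 (width 320).
high_crop_x0, high_crop_w = 300, 600
low_crop_x0, low_crop_w = 128, 320

x_high_cropped = x_high_full - high_crop_x0
ref_in_crop = naive_ref_x(x_high_cropped, high_crop_w, low_crop_w)
ref_after = low_crop_x0 + ref_in_crop   # expressed in original lower-layer coordinates

error = abs(ref_after - ref_before)
```

In this example the same higher-layer pixel refers to lower-layer position 200 before extraction but to about 181.3 after extraction, so the predicted image is taken from the wrong place and the scale changes from 1/2 to 8/15.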
NPL 3 discloses a method in which additional information indicating a position of an alternative picture on the lower layer is transmitted, and a reference pixel position or a scale is calculated by using the additional information. According to this method, even in a case where partial data corresponding to an interest region is extracted from hierarchy coding data, the reference pixel positions (corresponding reference positions) before and after the extraction are equal to each other, or the scales before and after the extraction are equal to each other.
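The effect of such additional information can be pictured with the following sketch. It does not reproduce the syntax or the derivation of NPL 3. It only assumes that the original picture widths and the horizontal positions of the extracted regions are made available to the decoder, and the parameter names (`high_offset`, `low_offset`, and so on) are hypothetical.

```python
from fractions import Fraction


def ref_x_with_offsets(x_high, high_offset, low_offset,
                       orig_high_width, orig_low_width):
    """Return (scale, reference position inside the received lower-layer picture)."""
    scale = Fraction(orig_low_width, orig_high_width)
    x_low_original = scale * (x_high + high_offset)  # position in the original lower layer
    return scale, x_low_original - low_offset


# Before extraction: full pictures, so both offsets are zero.
scale_before, ref_before = ref_x_with_offsets(400, 0, 0, 1920, 960)

# After extraction (hypothetical crops): the higher layer starts at column 300
# and the lower layer at column 128 of the respective original pictures.
scale_after, ref_local = ref_x_with_offsets(400 - 300, 300, 128, 1920, 960)
ref_after = ref_local + 128  # back in original lower-layer coordinates
```

Because the scale is computed from the original sizes and the offsets compensate for the crops, the scale and the corresponding reference position are the same before and after the extraction in this example.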