In the video coding scheme based on the method described in Non Patent Literature (NPL) 1, each frame of digitized video is split into coding tree units (CTUs), and each CTU is coded in raster scan order. Each CTU is split into coding units (CUs) in a quadtree structure, and each CU is coded. Each CU is split into prediction units (PUs) and predicted. The prediction error of each CU is split into transform units (TUs) in a quadtree structure and frequency-transformed.
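As an illustration only, the raster-scan traversal of CTUs and the quadtree split of one CTU into CUs can be sketched as follows. The function names, the `min_size` limit, and the `should_split` callback are assumptions made for this sketch; NPL 1 does not prescribe how an encoder decides whether to split, and practical encoders typically base that decision on a cost measure.

```python
def ctu_origins(frame_width, frame_height, ctu_size):
    """Yield the top-left corner of each CTU in raster scan order."""
    for y in range(0, frame_height, ctu_size):
        for x in range(0, frame_width, ctu_size):
            yield (x, y)


def split_quadtree(x, y, size, min_size, should_split):
    """Return the leaf blocks (x, y, size) of a quadtree rooted at (x, y).

    should_split is a hypothetical callback standing in for the
    encoder's split decision.
    """
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        # Visit quadrants: top-left, top-right, bottom-left, bottom-right.
        for dy in (0, half):
            for dx in (0, half):
                leaves.extend(
                    split_quadtree(x + dx, y + dy, half, min_size, should_split)
                )
        return leaves
    return [(x, y, size)]
```

For example, a 64x64 CTU that is split once yields four 32x32 CUs, and the same routine could be reused for the TU quadtree of a CU.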
A CU is a unit of coding in intra prediction/inter-frame prediction. Intra prediction and inter-frame prediction are described below.
Intra prediction (intra-frame prediction) is a prediction that generates a prediction signal from a reconstructed image of a frame to be coded. NPL 1 defines, for example, 33 types of angular intra prediction depicted in FIG. 14. In angular intra prediction, reconstructed pixels around a block to be coded are extrapolated in any of the 33 directions depicted in FIG. 14, to generate an intra prediction signal.
In addition to angular intra prediction, DC prediction and planar prediction are specified as intra prediction. In DC prediction, a mean value of a reference image is used as the prediction values of all pixels in a TU to be predicted. In planar prediction, a prediction image is generated by linear interpolation from pixels in a reference image.
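A minimal sketch of DC prediction and planar prediction for an n x n block is given below. It assumes that `top` and `left` are lists holding the n reconstructed reference pixels above and to the left of the block, followed by one further pixel (the top-right and bottom-left reference pixel respectively), and that n is a power of two. The planar weighting is one common form of bilinear interpolation. Any smoothing of the reference pixels or of the block edges is omitted, so this is not a complete implementation of the method of NPL 1.

```python
def dc_predict(top, left, n):
    """Fill the block with the rounded mean of the n top and n left references."""
    total = sum(top[:n]) + sum(left[:n])
    dc = (total + n) // (2 * n)
    return [[dc] * n for _ in range(n)]


def planar_predict(top, left, n):
    """Interpolate linearly between the left/top references and the
    top-right (top[n]) and bottom-left (left[n]) references."""
    shift = n.bit_length()  # equals log2(n) + 1 when n is a power of two
    top_right = top[n]
    bottom_left = left[n]
    pred = []
    for y in range(n):
        row = []
        for x in range(n):
            horizontal = (n - 1 - x) * left[y] + (x + 1) * top_right
            vertical = (n - 1 - y) * top[x] + (y + 1) * bottom_left
            row.append((horizontal + vertical + n) >> shift)
        pred.append(row)
    return pred
```

With a flat neighborhood (all reference pixels equal), both functions return a block filled with that same value.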
Inter-frame prediction is a prediction based on an image of a reconstructed frame (reference picture) different in display time from a frame to be coded. Inter-frame prediction is also referred to as inter prediction. In inter prediction, an inter prediction signal is generated based on a reconstructed image block of a reference picture (using pixel interpolation if necessary).
Referring next to FIG. 15, the configuration and operation of a general video coding device that inputs each CU of each frame of digitized video as an input image and outputs a bitstream will be described.
A video coding device 100A depicted in FIG. 15 includes a frequency transformer 101, a quantizer 102, an entropy encoder 103, an inverse frequency transformer/inverse quantizer 104, a buffer 105, an intra predictor 1060, an inter predictor 107, and a switch 110.
The intra predictor 1060 and the inter predictor 107 each generate a prediction signal for the input image signal of the CU. The intra predictor 1060 generates the prediction signal based on intra prediction. The inter predictor 107 generates the prediction signal based on inter prediction.
A prediction image supplied from the intra predictor 1060 or the inter predictor 107 via the switch 110 is subtracted from an image input to the video coding device 100A so that the input image becomes a prediction error image, and then the prediction error image is supplied to the frequency transformer 101.
The frequency transformer 101 frequency-transforms the prediction error image obtained by subtracting the prediction signal from the input image signal.
The quantizer 102 quantizes the frequency-transformed prediction error image (coefficient image). The entropy encoder 103 entropy-codes prediction parameters and the coefficient image, and outputs a bitstream.
The inverse frequency transformer/inverse quantizer 104 inverse-quantizes the coefficient image. The inverse frequency transformer/inverse quantizer 104 further inverse-frequency-transforms the inverse-quantized coefficient image. The prediction signal is added to the reconstructed prediction error image obtained by the inverse frequency transform, and the result is supplied to the buffer 105. The buffer 105 stores the reconstructed image.
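The signal path through the frequency transformer 101, the quantizer 102, and the inverse frequency transformer/inverse quantizer 104 can be sketched as below. This is a toy under stated assumptions: a one-dimensional block, a floating-point orthonormal DCT-II as the frequency transform, a uniform scalar quantizer with step `qstep`, and 8-bit pixels. All function names are hypothetical, and the transform and quantizer of NPL 1 differ in detail.

```python
import math


def dct(block):
    """Orthonormal DCT-II of a 1-D list of samples."""
    n = len(block)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(
            v * math.cos(math.pi * (i + 0.5) * k / n)
            for i, v in enumerate(block)
        ))
    return out


def idct(coeffs):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt((1 if k == 0 else 2) / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out


def encode_block(samples, prediction, qstep):
    """Subtract the prediction, transform, and quantize to integer levels."""
    residual = [s - p for s, p in zip(samples, prediction)]
    return [round(c / qstep) for c in dct(residual)]


def reconstruct_block(levels, prediction, qstep):
    """Inverse-quantize, inverse-transform, add the prediction, clip to 8 bits."""
    residual = idct([level * qstep for level in levels])
    return [min(255, max(0, round(p + r))) for p, r in zip(prediction, residual)]
```

Because `reconstruct_block` uses only the quantized levels and the prediction, the same routine yields the same reconstructed image in the coding device (for the buffer 105) and in a decoding device, which keeps their reference images in step.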
The bitstream output from the video coding device is transmitted to a video decoding device. The video decoding device performs a decoding process to reconstruct the video image. FIG. 16 is a block diagram depicting an example of the structure of a general video decoding device that decodes the bitstream output from the general video coding device to obtain decoded video. The following describes the structure and operation of the general video decoding device with reference to FIG. 16.
A video decoding device 200A depicted in FIG. 16 includes an entropy decoder 203, an inverse frequency transformer/inverse quantizer 204, a buffer 205, an intra predictor 2060, an inter predictor 207, and a switch 210.
The entropy decoder 203 entropy-decodes the input bitstream. The entropy decoder 203 supplies the quantized coefficient image to the inverse frequency transformer/inverse quantizer 204, and the prediction parameters to the switch 210.
The inverse frequency transformer/inverse quantizer 204 inverse-quantizes the input quantized coefficient image, and outputs the result as the coefficient image. The inverse frequency transformer/inverse quantizer 204 further converts the coefficient image from the frequency domain to the spatial domain, and outputs the result as the prediction error image. The prediction error image is added to the prediction image supplied from the switch 210 to form a decoded image. The decoded image is output from the video decoding device 200A as an output image, and is also supplied to the buffer 205 and the intra predictor 2060.
The buffer 205 stores previously decoded images as reference images. The intra predictor 2060 predicts the decoded image based on a reconstructed image previously decoded at the same position. The intra predictor 2060 thus generates the prediction image. The inter predictor 207 generates the prediction image based on a reference image supplied from the buffer 205.
The following describes the luma component (luminance signal: luma signal) and color difference component (color difference signal: chroma signal) of an image.
Each CTU is made up of a coding tree block (CTB) of a luma component and CTBs of chroma components corresponding to the luma component. In High Efficiency Video Coding (HEVC), the 4:2:0, 4:2:2, and 4:4:4 formats depicted in FIG. 17 are each specified as a combination of resolutions of the luma component and chroma components. In FIG. 17, N denotes the number of pixels. As depicted in FIG. 17, in 4:2:0, the number of pixels of each of the U component and V component of the chroma signal is ½ of the number of pixels of the luma component Y in both the horizontal direction and the vertical direction. In 4:2:2, the number of pixels of each of the U component and V component of the chroma signal is ½ of the number of pixels of the luma component Y in the horizontal direction, and the same as the number of pixels of the luma component Y in the vertical direction. In 4:4:4, the number of pixels of each of the U component and V component of the chroma signal is the same as the number of pixels of the luma component Y in both the horizontal direction and the vertical direction. In HEVC, the prediction mode (prediction direction) of intra prediction for the U component and the prediction mode of intra prediction for the V component are the same.
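The relationship between the luma resolution and the chroma resolution in the three formats can be expressed as a small lookup. The function name and the string keys are illustrative choices for this sketch, and the luma dimensions are assumed to be divisible by the subsampling factors.

```python
def chroma_plane_size(luma_width, luma_height, chroma_format):
    """Return (width, height) of each of the U and V planes."""
    # (horizontal divisor, vertical divisor) relative to the luma plane.
    divisors = {
        "4:2:0": (2, 2),
        "4:2:2": (2, 1),
        "4:4:4": (1, 1),
    }
    sub_w, sub_h = divisors[chroma_format]
    return luma_width // sub_w, luma_height // sub_h
```

For a 1920x1080 luma plane this gives 960x540 chroma planes in 4:2:0, 960x1080 in 4:2:2, and 1920x1080 in 4:4:4.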
In HEVC, the video coding device can signal whether or not the prediction mode of the chroma components is the same as the prediction mode of the top left luma PU in the CU. Thus, the video coding device can prediction-code the chroma components based on the intra prediction mode of the luma component. In the case where the prediction mode of the luma component and the prediction mode of the chroma components are not the same, the video coding device may apply a predetermined prediction mode to the chroma components.
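The selection of the chroma prediction mode can be sketched as follows. The sketch assumes a simplified signaling model with one flag and, when the flag is off, one explicitly chosen predetermined mode that is applied to the chroma components. The parameter names and the string mode labels are hypothetical, and the actual HEVC syntax encodes this choice differently.

```python
def derive_chroma_mode(top_left_luma_mode, same_as_luma, predetermined_mode=None):
    """Return the intra prediction mode shared by the U and V components.

    top_left_luma_mode: mode of the top left luma PU in the CU.
    same_as_luma: signaled flag (hypothetical name).
    predetermined_mode: mode applied to chroma when the flag is off.
    """
    if same_as_luma:
        return top_left_luma_mode
    if predetermined_mode is None:
        raise ValueError("a predetermined mode is required when the flag is off")
    return predetermined_mode
```

Because the U component and the V component share one mode, a single return value covers both chroma components.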