A video coding system based on a system described in Non Patent Literature (NPL) 1 divides each frame of digitized video into coding tree units (CTUs), and each CTU is encoded in raster scan order. Each CTU is split into coding units (CUs) in a quadtree structure and encoded. Each CU is split into prediction units (PUs) and predicted. Further, a prediction error of each CU is split into transform units (TUs) in a quadtree structure and transformed.
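The division of a frame into CTUs visited in raster scan order can be illustrated with a minimal Python sketch. The function name ctu_origins and the treatment of partial CTUs at the right and bottom frame edges (counted as full grid positions) are assumptions made for illustration and are not taken from NPL 1.

```python
def ctu_origins(frame_width, frame_height, ctu_size=64):
    """Return the top-left (x, y) of every CTU in raster scan order.

    Hypothetical helper: a CTU that only partly overlaps the frame at the
    right or bottom edge still gets its own grid position (assumption).
    """
    cols = -(-frame_width // ctu_size)   # ceiling division
    rows = -(-frame_height // ctu_size)
    # Raster scan: left to right within a row, rows from top to bottom.
    return [(c * ctu_size, r * ctu_size)
            for r in range(rows)
            for c in range(cols)]
```

For example, a 352×288 luma frame with a CTU size of 64 yields a grid of 6 columns by 5 rows under these assumptions.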
The CU is a coding unit of intra prediction/inter-frame prediction. Intra prediction and inter-frame prediction will be described below.
Intra prediction is prediction from a reconstructed image of a frame to be encoded. In the system described in NPL 1, a reconstructed pixel around a block to be encoded is extrapolated to generate an intra prediction signal. Hereinafter, a CU using intra prediction is referred to as an intra CU. Note that the value of pred_mode_flag syntax of the intra CU is 1 in NPL 1.
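As a rough illustration of extrapolating reconstructed pixels around a block, the following Python sketch copies the reconstructed row directly above the block downward (vertical extrapolation). This is only one simple mode chosen for illustration; NPL 1 defines many prediction modes that are not reproduced here, and the function name and array layout (a list of pixel rows) are assumptions.

```python
def intra_predict_vertical(recon, x, y, size):
    """Predict a size x size block at (x, y) from the row above it.

    recon is a list of rows of already reconstructed pixels of the current
    frame (assumed layout). The block must not touch the top frame edge.
    """
    if y < 1:
        raise ValueError("no reconstructed row above the block")
    above = recon[y - 1][x:x + size]
    # Every row of the prediction repeats the reconstructed pixels above.
    return [list(above) for _ in range(size)]
```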
Further, an intra CU that does not use intra prediction is referred to as an I_PCM (Intra Pulse Code Modulation) CU. In the I_PCM CU, an image of the CU is transmitted intact instead of transmitting a prediction error of the CU. Note that the value of pcm_flag syntax of the I_PCM CU is 1 in NPL 1.
Inter-frame prediction is prediction based on an image of a reconstructed frame (reference picture) different in display time from a frame to be encoded. Hereinafter, inter-frame prediction is also referred to as inter prediction. FIG. 10 is an explanatory diagram depicting an example of inter-frame prediction. A motion vector MV=(mvx, mvy) indicates the amount of translation of a reconstructed image block of a reference picture relative to a block to be encoded. In inter prediction, an inter prediction signal is generated based on a reconstructed image block of a reference picture (using pixel interpolation if necessary). Hereinafter, a CU using inter prediction is referred to as an inter CU. Note that the value of pred_mode_flag syntax of the inter CU is 0 in NPL 1.
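The role of the motion vector can be illustrated with a minimal Python sketch of integer-pel motion compensation. It assumes that the reference block is located at the position of the block to be encoded shifted by (mvx, mvy), that coordinates falling outside the reference picture are clamped to the nearest edge pixel, and that no pixel interpolation is needed. The function name and array layout are likewise assumptions.

```python
def inter_predict(ref, x, y, width, height, mv):
    """Build an inter prediction block from a reference picture.

    ref is a list of pixel rows of the reference picture (assumed layout).
    mv = (mvx, mvy) is an integer-pel motion vector; fractional vectors,
    which would need pixel interpolation, are not handled in this sketch.
    """
    mvx, mvy = mv
    ref_h, ref_w = len(ref), len(ref[0])
    pred = []
    for j in range(height):
        # Clamp to the picture area (assumed boundary handling).
        ry = min(max(y + j + mvy, 0), ref_h - 1)
        row = []
        for i in range(width):
            rx = min(max(x + i + mvx, 0), ref_w - 1)
            row.append(ref[ry][rx])
        pred.append(row)
    return pred
```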
An inter CU for which neither motion vector difference information nor a CU prediction error is transmitted is referred to as a Skip CU. In NPL 1, the value of skip_flag syntax of the Skip CU is 1.
A frame encoded with only the intra CUs mentioned above is called an I frame (or an I picture). A frame encoded with inter CUs as well as intra CUs is called a P frame (or a P picture). A frame encoded with inter CUs that use not just one reference picture but two reference pictures simultaneously for inter prediction of a block is called a B frame (or a B picture).
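The CU types named above and their syntax values can be summarized in a short Python sketch. The function name classify_cu and the order of the checks (skip_flag first, then pred_mode_flag, then pcm_flag) are assumptions for illustration; only the flag values themselves come from the description above.

```python
def classify_cu(pred_mode_flag, pcm_flag=0, skip_flag=0):
    """Map the syntax values described above to a CU type name."""
    if skip_flag == 1:
        # Skip CU: an inter CU without motion vector difference
        # information and without a CU prediction error.
        return "Skip CU"
    if pred_mode_flag == 1:
        # Intra CU; pcm_flag distinguishes the I_PCM CU.
        return "I_PCM CU" if pcm_flag == 1 else "intra CU"
    return "inter CU"
```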
Intra prediction and inter-frame prediction are as described above.
Referring to FIG. 11, the configuration and operation of a typical video encoding device that receives each CU of each frame of digitized video as an input image and outputs a bitstream will be described.
The video encoding device depicted in FIG. 11 includes a transformer/quantizer 102, an entropy encoder 103, an inverse transformer/inverse quantizer 104, a buffer 105, a predictor 106, a PCM encoder 107, a PCM decoder 108, a multiplexed data selector 109, a multiplexer 110, a switch 121, and a switch 122.
As depicted in FIG. 12, a frame is made up of LCUs (Largest Coding Units), which correspond to the CTUs described above. An LCU is made up of CUs (Coding Units). FIG. 12 is an explanatory diagram depicting an example of CTU splitting of a frame t and an example of CU splitting of CTU8 in the frame t when the spatial resolution of the frame is CIF (Common Intermediate Format) and the CTU size is 64. The quadtree structure of CTU8 can be represented by the following cu_split_flag values: cu_split_flag=1 at CUDepth=0, indicating that the 64×64 region is split; three cu_split_flag=0 at CUDepth=1, indicating that the first three 32×32 CUs (CU0, CU1, and CU2) are not split; cu_split_flag=1 at CUDepth=1, indicating that the last 32×32 CU is split; three cu_split_flag=0 at CUDepth=2, indicating that the first three 16×16 CUs (CU3, CU4, and CU5) are not split; cu_split_flag=1 at CUDepth=2, indicating that the last 16×16 CU is split; and four cu_split_flag=0 at CUDepth=3, indicating that none of the 8×8 CUs (CU6, CU7, CU8, and CU9) is split.
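The quadtree representation of CTU8 can be traced with a minimal Python sketch that consumes cu_split_flag values depth-first and lists the resulting CUs. The function name, the depth-first ordering of the flags in the input list, and the visiting order of the four sub-blocks (upper left, upper right, lower left, lower right) are assumptions for illustration. Following the description above, a flag is also read for each 8×8 CU.

```python
def parse_cu_quadtree(flags, ctu_size=64, min_cu_size=8):
    """Return the CUs of one CTU as (x, y, size) tuples.

    flags holds cu_split_flag values in depth-first order (assumption).
    Coordinates are relative to the top-left corner of the CTU.
    """
    it = iter(flags)
    cus = []

    def visit(x, y, size):
        if next(it) == 1:
            if size <= min_cu_size:
                raise ValueError("smallest CU cannot be split")
            half = size // 2
            # Upper left, upper right, lower left, lower right.
            for dy in (0, half):
                for dx in (0, half):
                    visit(x + dx, y + dy, half)
        else:
            cus.append((x, y, size))

    visit(0, 0, ctu_size)
    if next(it, None) is not None:
        raise ValueError("unused cu_split_flag values")
    return cus


# Flags for CTU8 as described: depth 0, then depth 1, depth 2, depth 3.
CTU8_FLAGS = [1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0]
```

Calling parse_cu_quadtree(CTU8_FLAGS) yields ten CUs: three of size 32, three of size 16, and four of size 8, matching CU0 through CU9 in the description.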
The video encoding device depicted in FIG. 11 encodes LCUs in raster scan order, and encodes CUs constituting each LCU in z-scan order. The size of a CU is any one of 64×64, 32×32, 16×16, and 8×8. The smallest CU is referred to as a smallest coding unit (SCU).
The transformer/quantizer 102 frequency-transforms an image from which a prediction signal has been subtracted (prediction error image) to obtain a frequency transform coefficient of the prediction error image.
The transformer/quantizer 102 further quantizes the frequency transform coefficient with a predetermined quantization step size Qs. Hereinafter, the quantized frequency transform coefficient is referred to as a coefficient quantization value or a quantization level value.
The entropy encoder 103 entropy-encodes a prediction parameter and the quantization level value. The prediction parameter is information on the prediction type (intra prediction or inter prediction) of the CU mentioned above and on the PUs (Prediction Units) included in the CU.
The inverse transformer/inverse quantizer 104 inverse-quantizes the quantization level value with the quantization step size Qs. The inverse transformer/inverse quantizer 104 further inverse-frequency-transforms the frequency transform coefficient obtained by the inverse quantization. The prediction signal is added to a reconstructed prediction error image obtained by the inverse transform, and the resulting reconstructed image is supplied to the switch 122.
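The path from prediction error to reconstructed image can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: a one-dimensional orthonormal DCT stands in for the frequency transform (NPL 1 uses two-dimensional integer transforms), quantization is uniform rounding with a step size qs, and all function names are hypothetical.

```python
import math


def _scale(k, n):
    # Orthonormal DCT scaling factor.
    return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)


def dct(x):
    """1-D orthonormal DCT-II (stand-in for the frequency transform)."""
    n = len(x)
    return [_scale(k, n) * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i in range(n))
            for k in range(n)]


def idct(c):
    """Inverse of dct() above (1-D orthonormal DCT-III)."""
    n = len(c)
    return [sum(_scale(k, n) * c[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for k in range(n))
            for i in range(n)]


def quantize(coeffs, qs):
    # Quantization level value: coefficient divided by Qs, rounded to nearest.
    return [int(round(c / qs)) for c in coeffs]


def dequantize(levels, qs):
    return [level * qs for level in levels]


def encode_and_reconstruct(block, pred, qs):
    """Return (quantization level values, reconstructed samples)."""
    residual = [b - p for b, p in zip(block, pred)]        # prediction error
    levels = quantize(dct(residual), qs)                   # transformer/quantizer
    recon_residual = idct(dequantize(levels, qs))          # inverse path
    recon = [p + r for p, r in zip(pred, recon_residual)]  # add prediction signal
    return levels, recon
```

Because the decoder sees only the quantization level values, the encoder reconstructs from the same values so that later predictions are based on the image the decoder will also have.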
The multiplexed data selector 109 monitors the amount of input data of the entropy encoder 103 corresponding to a CU to be encoded. When the entropy encoder 103 can entropy-encode the input data within the processing time of the CU, the multiplexed data selector 109 selects output data of the entropy encoder 103, and supplies the output data to the multiplexer 110 through the switch 121. The multiplexed data selector 109 further selects output data of the inverse transformer/inverse quantizer 104, and supplies the output data to the buffer 105 through the switch 122.
When the entropy encoder 103 cannot entropy-encode the input data within the processing time of the CU, the multiplexed data selector 109 selects output data of the PCM encoder 107, and supplies the output data to the multiplexer 110 through the switch 121. The multiplexed data selector 109 further selects output data obtained by the PCM decoder 108 that PCM-decodes the output data of the PCM encoder 107, and supplies the output data to the buffer 105 through the switch 122.
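The selection between the entropy-coded path and the PCM path can be summarized in a minimal Python sketch. The decision criterion used here (comparing the amount of entropy encoder input data with a fixed per-CU limit max_input_per_cu), the 8-bit sample assumption in the PCM helpers, and all names are hypothetical and serve only to illustrate the two cases described above.

```python
def pcm_encode(samples):
    """PCM encoder sketch: samples are passed through intact (8-bit assumed)."""
    return bytes(samples)


def pcm_decode(data):
    """PCM decoder sketch: recovers exactly the samples that were encoded."""
    return list(data)


def select_cu_output(input_amount, max_input_per_cu,
                     entropy_output, transform_recon, cu_samples):
    """Return (data for the multiplexer, reconstructed image for the buffer).

    input_amount is the monitored amount of entropy encoder input data for
    the CU; max_input_per_cu is an assumed limit on what can be
    entropy-encoded within the processing time of the CU.
    """
    if input_amount <= max_input_per_cu:
        # Entropy encoding finishes in time: use the normal path.
        return entropy_output, transform_recon
    # Otherwise fall back to PCM; the buffer gets the PCM-decoded image.
    pcm_data = pcm_encode(cu_samples)
    return pcm_data, pcm_decode(pcm_data)
```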
The buffer 105 stores a reconstructed image supplied through the switch 122. A reconstructed image for one frame is referred to as a reconstructed picture.
The multiplexer 110 multiplexes output data of the entropy encoder 103 and the PCM encoder 107, and outputs the multiplexed output data.
Based on the above-mentioned operation, the multiplexer 110 in the video encoding device generates a bitstream.