In general, after digitizing an externally input moving image signal, a video encoding device executes an encoding process that conforms to a predetermined video coding scheme to generate coded data, i.e. a bitstream.
One such predetermined coding scheme is ISO/IEC 14496-10 Advanced Video Coding (AVC), described in Non Patent Literature (NPL) 1. The Joint Model scheme is known as a reference model of an AVC encoder; a video encoding device based on it is hereinafter called a general video encoding device.
Referring to FIG. 8, the operation of the general video encoding device that receives each frame of digitized video as input and outputs a bitstream is described below.
The video encoding device shown in FIG. 8 includes a transformer/quantizer 102, an entropy encoder 103, an inverse transformer/inverse quantizer 104, a picture buffer 105, a decoded picture buffer 106, a quantizer/inverse quantizer 107, an adaptive linear interpolator 108, an inter-frame predictor (inter predictor) 110, an intra predictor 111, an encoding controller 112, a switch 121, and a switch 122.
The general video encoding device divides each frame into blocks of 16×16 pixel size called macro blocks (MBs), and further divides each MB into blocks of 4×4 pixel size to set the 4×4 blocks as the minimum unit of encoding.
FIG. 9 is an explanatory diagram showing an example of block division in the case where the frame has a spatial resolution of QCIF (Quarter Common Intermediate Format).
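The block division described above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, the QCIF luma resolution is taken to be 176×144 pixels, and the 4×4 blocks are listed in plain raster order, which is not necessarily the order in which an encoder processes them.

```python
MB_SIZE = 16     # macroblock edge length in pixels
BLOCK_SIZE = 4   # edge length of the minimum unit of encoding


def macroblock_grid(width, height):
    """Return (columns, rows) of 16x16 MBs covering a frame."""
    if width % MB_SIZE or height % MB_SIZE:
        raise ValueError("this sketch assumes dimensions that are multiples of 16")
    return width // MB_SIZE, height // MB_SIZE


def blocks_in_macroblock():
    """Yield the (x, y) offset of each 4x4 block inside one MB, in raster order."""
    for y in range(0, MB_SIZE, BLOCK_SIZE):
        for x in range(0, MB_SIZE, BLOCK_SIZE):
            yield x, y


cols, rows = macroblock_grid(176, 144)      # QCIF luma frame
print(cols, rows, cols * rows)              # 11 9 99
print(len(list(blocks_in_macroblock())))    # 16
```

Under these assumptions a QCIF frame holds 11×9 = 99 MBs, and each MB holds sixteen 4×4 blocks.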
The following describes the operation of each unit shown in FIG. 8.
The prediction signal supplied from the intra predictor 111 or the inter-frame predictor 110 is subtracted from each MB of the block-divided input video, and the result is input to the transformer/quantizer 102. The prediction signal is either an intra prediction signal or an inter-frame prediction signal. An MB from which the prediction signal has been subtracted is called a prediction error image block below.
The intra predictor 111 generates the intra prediction signal using a reconstructed image that is stored in the picture buffer 105 and has the same display time as the current frame. An MB encoded using the intra prediction signal is called an intra MB below.
The inter-frame predictor 110 generates the inter-frame prediction signal using a reference image that is stored in the decoded picture buffer 106 and differs in display time from the current frame. An MB encoded using the inter-frame prediction signal is called an inter MB below.
A frame encoded using only intra MBs is called an I frame. A frame encoded using inter MBs in addition to intra MBs is called a P frame. A frame that includes inter MBs whose inter-frame prediction signal is generated from two reference images simultaneously, rather than from a single reference image, is called a B frame.
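The naming rule for I, P, and B frames can be written as a small function. This is a sketch under assumptions: the representation of an MB as a `(kind, num_refs)` tuple and the function name `frame_type` are hypothetical and serve only to restate the definitions given above.

```python
def frame_type(macroblocks):
    """Classify a frame from its MBs.

    macroblocks: iterable of (kind, num_refs) tuples, where kind is
    "intra" or "inter" and num_refs is the number of reference images
    used for inter-frame prediction (0 for intra MBs).
    """
    has_inter = False
    for kind, num_refs in macroblocks:
        if kind == "inter":
            if num_refs == 2:
                return "B"   # at least one MB predicted from two references
            has_inter = True
    return "P" if has_inter else "I"


print(frame_type([("intra", 0), ("intra", 0)]))                 # I
print(frame_type([("intra", 0), ("inter", 1)]))                 # P
print(frame_type([("intra", 0), ("inter", 1), ("inter", 2)]))   # B
```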
The encoding controller 112 compares the intra prediction signal and the inter-frame prediction signal with the input MB stored in an MB buffer, selects the prediction signal that yields the smaller energy of the prediction error image block, and controls the switch 122 accordingly. Information associated with the selected prediction signal (the intra prediction mode, the intra prediction direction, and information associated with inter-frame prediction) is supplied to the entropy encoder 103.
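The selection performed by the encoding controller 112 can be illustrated as follows. This is a minimal sketch, not the device's actual decision logic: it assumes that the energy of a prediction error image block is measured as the sum of squared differences, and all function names are hypothetical. A 2×2 block is used only to keep the example short.

```python
def prediction_error(block, prediction):
    """Subtract the prediction signal from the input block, sample by sample."""
    return [[b - p for b, p in zip(brow, prow)]
            for brow, prow in zip(block, prediction)]


def energy(error):
    """Assumed energy measure: sum of squared prediction errors."""
    return sum(e * e for row in error for e in row)


def select_prediction(block, intra_pred, inter_pred):
    """Return the name of the lower-energy prediction and its error block."""
    errors = {
        "intra": prediction_error(block, intra_pred),
        "inter": prediction_error(block, inter_pred),
    }
    best = min(errors, key=lambda name: energy(errors[name]))
    return best, errors[best]


block = [[10, 12], [11, 13]]
intra = [[10, 10], [10, 10]]   # error energy 0 + 4 + 1 + 9 = 14
inter = [[10, 12], [11, 12]]   # error energy 1
print(select_prediction(block, intra, inter))   # ('inter', [[0, 0], [0, 1]])
```

The returned name corresponds to the position of the switch 122, and the returned error block corresponds to the prediction error image block passed on for transform and quantization.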
Based on the input MB or the prediction error image block, the encoding controller 112 also selects a base block size of the integer DCT (Discrete Cosine Transform) suitable for frequency transform of the prediction error image block. In the general video encoding device, the integer DCT means a frequency transform using a base obtained by approximating the DCT base with integer values. Three base block sizes are available: 16×16, 8×8, and 4×4. The flatter the pixel values of the input MB or the prediction error image block, the larger the base block size that is selected. Information on the selected integer DCT base size is supplied to the entropy encoder 103. Hereafter, the information associated with the selected prediction signal, the information on the selected integer DCT base size, and a quantization parameter to be described later are called auxiliary information.
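As an illustration of an integer-approximated DCT base, the sketch below applies the 4×4 core transform matrix used by AVC, in which the scaling factors are left to the quantization stage. The size-selection function is purely hypothetical: the flatness measure (difference between the largest and smallest sample) and both thresholds are assumptions chosen for the example, not values taken from the device described here.

```python
# 4x4 integer core transform matrix of AVC (scaling is folded into quantization).
CF = [[1,  1,  1,  1],
      [2,  1, -1, -2],
      [1, -1, -1,  1],
      [1, -2,  2, -1]]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_4x4(block):
    """Compute CF * block * CF^T for a 4x4 block of integers."""
    return matmul(matmul(CF, block), transpose(CF))


def choose_base_size(block, flat_threshold=4, smooth_threshold=16):
    """Hypothetical rule: the flatter the block, the larger the base size."""
    spread = max(max(row) for row in block) - min(min(row) for row in block)
    if spread <= flat_threshold:
        return 16
    if spread <= smooth_threshold:
        return 8
    return 4


flat = [[3] * 4 for _ in range(4)]
print(forward_4x4(flat)[0])     # [48, 0, 0, 0]
print(choose_base_size(flat))   # 16
```

For a perfectly flat block, all of the energy lands in the single top-left coefficient, which is why flat regions are well served by a large base block size.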
The inverse transformer/inverse quantizer 104 inverse-quantizes a transform/quantization value with a quantization step width Qs. It further performs inverse frequency transform of the frequency transform coefficient obtained by the inverse quantization. The prediction signal (the intra prediction signal or the inter-frame prediction signal) is added to the reconstructed prediction error image obtained by the inverse frequency transform, and the result is supplied to the picture buffer 105 through the switch 121. The operation of the quantizer/inverse quantizer 107 and the adaptive linear interpolator 108 will be described later.
The reconstructed image block, obtained by adding the prediction signal to the reconstructed prediction error image block, is stored in the picture buffer 105 until all the MBs contained in the current frame are encoded. A picture composed of the reconstructed image in the picture buffer 105 is called a reconstructed image picture below.
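The reconstruction path can be sketched as three steps: scale by Qs, apply the inverse frequency transform, and add the prediction signal. The code below is an assumption-laden illustration: the function names are hypothetical, the inverse transform is passed in as a parameter (an identity stand-in is used in the example so that the arithmetic is easy to follow), and clipping to the 8-bit range 0 to 255 is an added assumption that the text does not state.

```python
def inverse_quantize(levels, qs):
    """Scale each transform/quantization value by the step width Qs."""
    return [[level * qs for level in row] for row in levels]


def reconstruct_block(levels, qs, prediction, inverse_transform, max_value=255):
    """Rebuild an image block from quantized values and a prediction signal."""
    coeffs = inverse_quantize(levels, qs)
    residual = inverse_transform(coeffs)   # reconstructed prediction error
    return [[min(max_value, max(0, round(p + r)))   # assumed 8-bit clipping
             for p, r in zip(prow, rrow)]
            for prow, rrow in zip(prediction, residual)]


def identity(coeffs):
    """Stand-in for the inverse frequency transform, for illustration only."""
    return coeffs


levels = [[1, 0], [0, -2]]
prediction = [[100, 100], [100, 5]]
print(reconstruct_block(levels, 4, prediction, identity))
# [[104, 100], [100, 0]]
```

In the last sample the sum 5 + (-8) is negative, so it is clipped to 0 under the 8-bit assumption.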
The entropy encoder 103 entropy-encodes the auxiliary information and the quantization index, and outputs the results as a bit string, i.e. a bitstream.
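The text does not say which entropy code the entropy encoder 103 applies. Purely as an illustration of turning integer values into a bit string, the sketch below uses Exp-Golomb codes, one of the variable-length codes that AVC uses for syntax elements. The function names and the sample values are hypothetical.

```python
def ue(code_num):
    """Unsigned Exp-Golomb code of a non-negative integer, as a bit string."""
    if code_num < 0:
        raise ValueError("code_num must be non-negative")
    bits = bin(code_num + 1)[2:]            # binary form of code_num + 1
    return "0" * (len(bits) - 1) + bits     # zero prefix, then the binary form


def se(value):
    """Signed Exp-Golomb code: positive v maps to 2v - 1, otherwise to -2v."""
    code_num = 2 * value - 1 if value > 0 else -2 * value
    return ue(code_num)


def encode(values):
    """Concatenate the signed codes of a sequence of integers."""
    return "".join(se(v) for v in values)


print(ue(0), ue(1), ue(2), ue(3))   # 1 010 011 00100
print(encode([0, 1, -1, 2]))        # 101001100100
```

Small magnitudes receive short codewords, so a sequence dominated by zeros and small values yields a short bit string.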