The present invention relates to an apparatus for and a method of variable bit rate video coding and, more particularly, to those in which the generated code bit count is controlled in real time at a variable bit rate.
MPEG-2 (ISO/IEC-13818-2), for instance, is well known in the art as a method of high efficiency video coding. In this method, the picture is divided into a plurality of blocks each constituted by a group of a plurality of pixels, and a spatial domain signal is converted to a frequency domain signal by executing discrete cosine transform (DCT) on each block. Each frequency component obtained by the discrete cosine transform, called a DCT coefficient, is quantized with a quantization step size, which is predetermined for each coefficient position (i.e., frequency). The quantized data is variable length coded by assigning variable length codes to the quantized DCT coefficients. The coded data thus obtained is provided as a bit stream.
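The transform and quantization steps described above can be illustrated by the following minimal sketch. It assumes an 8×8 block and an orthonormal DCT written in its direct, unoptimized form; the step size table `STEPS` is purely illustrative and is not the default quantization matrix of the standard, and the names `dct_2d` and `quantize` are hypothetical.

```python
import math

N = 8  # block size; MPEG-2 applies the DCT to 8x8 pixel blocks


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct form, for clarity only)."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out


def quantize(coeffs, steps):
    """Divide each coefficient by the step size of its position and round."""
    return [[int(round(coeffs[u][v] / steps[u][v])) for v in range(N)]
            for u in range(N)]


# Illustrative step sizes that grow with frequency (an assumption, NOT the
# MPEG-2 default matrix).
STEPS = [[16 + 2 * (u + v) for v in range(N)] for u in range(N)]

flat = [[100] * N for _ in range(N)]   # a flat block carries only a DC component
levels = quantize(dct_2d(flat), STEPS)
```

For the flat block in this sketch, only the coefficient at position (0, 0) survives quantization, which is the situation in which variable length coding is cheapest.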
FIG. 8 is a view for describing an image coding method conforming to the MPEG-2 (ISO/IEC-13818-2). Referring to the Figure, the method is implemented by a system comprising a subtracter 501, a discrete cosine transform unit 502, a quantizer 503, a variable length coder 504, a bit rate controller 505, an inverse quantizer 506, an inverse discrete cosine transform unit 507, an adder 508, a frame memory 509, a motion compensated inter-frame predictor 510 and a selector 511.
Input picture images are classified into I pictures to be intra-frame coded, P pictures to be inter-frame prediction coded by using only forward prediction, and B pictures, for which both forward and backward prediction are used. The input image is divided into a plurality of macro-blocks each of 16×16 pixels, and is coded in units of macro-blocks.
I pictures are intra-frame coded without the inter-frame prediction. In this case, the selector 511 selects a signal value of “0” as the prediction value, so that the subtracter 501 outputs the input image signal value unchanged. The discrete cosine transform unit 502 executes discrete cosine transform on the output signal value from the subtracter 501, and thus outputs DCT coefficients. The quantizer 503 quantizes the DCT coefficients with a predetermined quantization step size.
The quantizer 503 outputs a quantized DCT coefficient as quantized coefficient to the variable length coder 504. The variable length coder 504 encodes the quantized coefficient with variable length codes. The variable length coded quantized coefficient is outputted together with other data as a bit stream. The inverse quantizer 506 inversely quantizes the quantized coefficient output of the quantizer 503. The inverse discrete cosine transform unit 507 executes inverse discrete cosine transform on the inversely quantized coefficient output of the inverse quantizer 506, thus reconstructing the original input image signal.
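The local decoding path through the inverse quantizer 506 and the inverse discrete cosine transform unit 507 can be sketched as follows. For brevity the sketch assumes a one-dimensional orthonormal DCT and a single step size for all coefficients; the function names are hypothetical. It also shows that the reconstruction is only approximate, since quantization discards information.

```python
import math


def _scale(k, n):
    # Normalization factor of the orthonormal DCT-II.
    return math.sqrt((1.0 if k == 0 else 2.0) / n)


def dct_1d(x):
    n = len(x)
    return [_scale(k, n) * sum(x[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                               for i in range(n))
            for k in range(n)]


def idct_1d(coeffs):
    n = len(coeffs)
    return [sum(_scale(k, n) * coeffs[k] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                for k in range(n))
            for i in range(n)]


def local_decode(samples, step):
    """Quantize the DCT coefficients, then rebuild the samples as a decoder would."""
    levels = [int(round(c / step)) for c in dct_1d(samples)]   # quantizer 503
    dequantized = [level * step for level in levels]           # inverse quantizer 506
    return idct_1d(dequantized)                                # inverse DCT unit 507


samples = [52, 55, 61, 66, 70, 61, 64, 73]
reconstructed = local_decode(samples, step=4)
max_error = max(abs(a - b) for a, b in zip(samples, reconstructed))
```

Because the transform in this sketch is orthonormal and each coefficient is off by at most half a step, `max_error` cannot exceed `sqrt(len(samples)) * step / 2`. A larger step size therefore permits a larger reconstruction error.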
The reconstructed image signal obtained through the inverse quantizer 506 and the inverse discrete cosine transform unit 507 is stored through the adder 508 in the frame memory 509, and is used as a reference image in subsequently executed inter-frame prediction. The motion compensated inter-frame predictor 510 executes inter-frame prediction on the P and B pictures. The predictor 510 first determines a motion vector for each macro-block of the input image by comparing the input image signal with the reference image stored in the frame memory 509, i.e., by executing motion estimation. Then, the predictor 510 determines a prediction mode for each macro-block based on the result of the motion estimation. In this process, it is determined, for each macro-block of the P and B pictures, whether or not it is better to set an intra-frame mode for the macro-block coding process.
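The motion estimation mentioned above is not specified further in this description. As one common possibility, the following sketch performs a full search block matching with the sum of absolute differences (SAD) as the matching cost. The search range, the cost measure and the function names are all assumptions made for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block of the current picture
    and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def estimate_motion(cur, ref, bx, by, size=16, search=4):
    """Return ((dx, dy), cost) of the best match within +/- search pixels.

    (bx, by) is the top-left corner of the macro-block, which is assumed to
    lie entirely inside the picture.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference picture.
            if by + dy < 0 or by + dy + size > height:
                continue
            if bx + dx < 0 or bx + dx + size > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    cost, dx, dy = best
    return (dx, dy), cost
```

The returned cost could also serve as an input to the prediction mode decision, for instance by selecting the intra-frame mode when even the best match is poor. That use is likewise an assumption, not something stated above.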
When the intra-frame mode is selected as the macro-block prediction mode, the macro-block coding process on P and B pictures is the same as the above coding process on I pictures; that is, the intra-frame coding process is executed without motion compensated inter-frame prediction based on the determined motion vector.
When it is determined that it is better not to set the intra-frame mode, that is, when the non-intra-frame mode is selected as the macro-block prediction mode, the macro-block coding process on P and B pictures is executed with inter-frame prediction by using the image signal stored in the frame memory 509 as the reference image. The motion compensated inter-frame predictor 510 generates a predicted image signal corresponding to the input image signal on the basis of the previously determined motion vector.
The predicted image signal thus generated is provided to the subtracter 501 and also to the adder 508. The predicted image signal supplied to the subtracter 501 will be described first. Receiving the predicted image signal, the subtracter 501 computes the differential signal between the input image signal and the predicted image signal. The subsequent coding process on the differential signal is the same as the coding process on I pictures. That is, the discrete cosine transform unit 502 generates DCT coefficients from the differential signal, and the quantizer 503 quantizes the DCT coefficients.
The quantized coefficients of the differential signal outputted from the quantizer 503 are provided to the variable length coder 504 for providing a bit stream, and are also provided to the inverse quantizer 506. Through the inverse quantizer 506 and the inverse discrete cosine transform unit 507, a reconstructing process is executed on the coded differential signal. This reconstructing process is the same as the reconstructing process on I pictures as described above. The adder 508 then adds the reconstructed differential signal and the predicted image signal previously provided thereto, and thus a reconstructed image signal is generated. The reconstructed image signal generated in the adder 508 is stored in the frame memory 509 to be used as a reference image in a subsequent coding process by inter-frame prediction.
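The signal flow through the subtracter 501, the quantizer 503, the inverse quantizer 506 and the adder 508 can be summarized by the following sketch. It is a simplification under stated assumptions: the DCT and its inverse are omitted so that the arithmetic stays visible, samples are held in a flat list, and the function name `code_block` is hypothetical.

```python
def code_block(samples, prediction, step):
    """Code one block against a prediction and rebuild it as a decoder would.

    For intra-frame coding the prediction is all zeros, which corresponds to
    the selector 511 choosing the value "0".
    """
    residual = [s - p for s, p in zip(samples, prediction)]       # subtracter 501
    levels = [int(round(r / step)) for r in residual]             # quantizer 503 (DCT omitted)
    rebuilt_residual = [level * step for level in levels]         # inverse quantizer 506
    reconstructed = [p + r for p, r in zip(prediction, rebuilt_residual)]  # adder 508
    return levels, reconstructed


# Inter-frame case: only the small difference from the prediction is coded.
levels, reference = code_block([101, 109, 90, 105], [98, 100, 95, 105], step=4)

# Intra-frame case: the prediction is zero, so the samples themselves are coded.
intra_levels, intra_reference = code_block([100, 108, 92, 104], [0, 0, 0, 0], step=4)
```

In this sketch the inter-frame levels are small numbers, while the intra-frame levels are as large as the samples divided by the step size. This is the reduction of time domain redundancy that inter-frame prediction provides. The list `reference` is what would be stored in the frame memory 509.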
As shown above, in a method represented by MPEG-2 (ISO/IEC-13818-2), for instance, an efficient coding process can be executed on the input video signal, since the redundancy in the spatial domain is reduced by the DCT and the redundancy in the time domain is reduced by the inter-frame prediction. In this method, the bit rate controller 505, which is provided between the variable length coder 504 and the quantizer 503, obtains the generated bit count from the variable length coder 504, then determines the quantization step size so as to meet a bit rate restriction, and transmits the determined quantization step size data to the quantizer 503 for controlling the generated bit count.
Prior art examples of the coding process employed in the above method, typically represented by MPEG-2 (ISO/IEC-13818-2), will now be described.
As a first prior art example, the bit rate control method in a test model of MPEG-2 (i.e., Test Model 5, ISO/IEC JTC1/SC29/WG11/N0400, April 1993) is well known as a bit rate control method in a coding method which involves the above quantizing process.
This method adopts a constant bit rate coding method, which seeks to set the bit count generated in the intra-frame coding process and in the coding process by inter-frame prediction to a constant bit count for each of certain time units. Specifically, in this method the bit count is controlled by adjusting the quantization step size set for each macro-block obtained by dividing the picture frame into units of 16×16 pixels.
In this first prior art example, since the constant bit rate control method is adopted, it is necessary to control the bit count per unit time to be constant. The generated bit count is reduced by increasing the quantization step size for scenes that require high bit rate code generation, and is increased by reducing the quantization step size for scenes that do not require such high bit rate code generation.
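The feedback described above can be sketched with a virtual buffer whose fullness drives the quantization scale, in the manner of the macro-block level rate control step of Test Model 5. The sketch is a simplification under stated assumptions: picture level target allocation and adaptive quantization are omitted, and the bit count of a macro-block is replaced by a toy model, `complexity / scale`, which is hypothetical and is used only so that the example is self-contained.

```python
def cbr_quantizer_scales(complexities, picture_target, bit_rate, picture_rate):
    """Return the quantizer scale chosen for each macro-block and the total bits.

    complexities   -- one hypothetical complexity value per macro-block
    picture_target -- bit count allotted to this picture
    """
    reaction = 2.0 * bit_rate / picture_rate   # reaction parameter
    initial_fullness = 10.0 * reaction / 31.0  # makes the first scale equal to 10
    mb_count = len(complexities)
    generated = 0.0
    scales = []
    for j, complexity in enumerate(complexities):
        # Fullness rises when more bits were generated than the share of the
        # picture target that should have been spent so far.
        fullness = initial_fullness + generated - picture_target * j / mb_count
        scale = min(31, max(1, round(fullness * 31.0 / reaction)))
        scales.append(scale)
        generated += complexity / scale        # toy bit model (assumption)
    return scales, generated


busy_scales, busy_bits = cbr_quantizer_scales(
    [64000.0] * 100, picture_target=160000, bit_rate=4_000_000, picture_rate=25)
calm_scales, calm_bits = cbr_quantizer_scales(
    [4000.0] * 100, picture_target=160000, bit_rate=4_000_000, picture_rate=25)
```

Under this toy model the scale climbs over the course of the complex picture and falls over the course of the simple one, regardless of how the resulting image quality compares between the two.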
In this example, however, the quantization step size is set to control the bit count per unit time to be constant independently of input images. Therefore, in a scene requiring many bits, the image quality is degraded, because the generated bits are suppressed by a large quantization step size.
To replace the constant bit rate coding method of the above first prior art example, a variable bit rate coding method has been proposed. In an example of variable bit rate coding method, when an encoder records images on storage media which have a predetermined total recording bit count, the output bit rate is controlled according to the bit count required for each image, thus improving the average image quality while meeting the restriction imposed on the recording capacity of the storage media.
As a second prior art example of coding process, Japanese Patent Laid-Open No. 6-141298 discloses a process which adopts the above variable bit rate control method. In this second prior art example, a provisional coding process is first executed, in which DCT coefficient quantization is executed on the basis of a preliminarily set reference quantization step size for computing the generated bit count, and then a regular coding process is executed for actual coding. This coding process is thus a “two-pass coding” system, in which the regular coding process is executed after completion of the provisional coding process. In other words, re-coding is executed after obtaining the knowledge of the characteristics of the whole contents, and it is thus possible to attain high image quality coding.
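The two-pass idea can be sketched as follows. The allocation rule used here, which distributes the total budget in proportion to the bit counts measured in the provisional pass, is an assumption chosen for illustration and is not necessarily the rule of the cited publication; the function name is hypothetical.

```python
def two_pass_targets(provisional_bits, total_budget):
    """Distribute a total bit budget over the units of a sequence.

    provisional_bits -- bit count of each unit (e.g. each picture or GOP)
                        measured in the provisional pass at the reference
                        quantization step size
    total_budget     -- total recording bit count available
    """
    total = sum(provisional_bits)
    return [total_budget * bits / total for bits in provisional_bits]


# Three units whose provisional bit counts differ by a factor of up to six.
targets = two_pass_targets([100, 300, 600], total_budget=500)
```

The targets sum to the budget, and a unit that generated more bits in the provisional pass receives a proportionally larger share. The regular coding pass can begin only after every unit has been measured, which is why the whole content must be processed twice.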
In this second prior art example, however, the coding of a certain video content requires execution of two processes, i.e., the provisional and regular coding processes, thus doubling the process time. Therefore, real time encoding is impossible.
For real time execution of the variable bit rate coding, a “single pass coding” system, which does not execute any provisional coding process but executes only the single regular coding process, has been considered.
As a third prior art example of coding process, “An Algorithm of MPEG2 Real time Variable Bit Rate Video Coding with Quantizer Step” (Inada et al, Information System Journal 2 D-11-3 (page 3), General Conference of the Institute of Electronics, Information and Communication Engineers in March, 1998) is well known as a “single pass coding” system in the above variable bit rate coding system.
In this third prior art example, an image coding process conforming to MPEG-2 is executed, and after a predetermined period of time the generated bit count is adjusted by setting a quantization step size for each GOP to provide a predetermined average bit rate.
Japanese Patent Laid-Open No. 10-164577 discloses a fourth prior art example of coding process. In this fourth prior art example, the actual generated bit count is used to update the bit count which can be assigned to subsequent several GOPs, and a target bit count or a target quantization step size is determined for each GOP on the basis of the ratio of the coding complexity of the present image to the average coding complexity over past images.
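The kind of per-GOP target computation described above can be sketched as follows. The formulas are assumptions made for illustration and are not taken from the cited publication; the function names, the way the assignable bit count is replenished, and the complexity measure are all hypothetical.

```python
def gop_target_bits(assignable_bits, gops_remaining, complexity, past_complexities):
    """Target bit count for the next GOP.

    The even share of the assignable bits is scaled by the ratio of the
    present coding complexity to the average complexity of past images.
    """
    even_share = assignable_bits / gops_remaining
    average_complexity = sum(past_complexities) / len(past_complexities)
    return even_share * complexity / average_complexity


def update_assignable_bits(assignable_bits, actual_bits, refill_bits):
    """Update the assignable bit count with the bits actually generated."""
    return assignable_bits - actual_bits + refill_bits


target = gop_target_bits(8_000_000, 4, 150.0, [100.0, 100.0, 100.0])
remaining = update_assignable_bits(8_000_000, actual_bits=3_200_000, refill_bits=2_000_000)
```

In this sketch a GOP that is 1.5 times as complex as the past average receives 1.5 times the even share. The target is fixed before any quantization step size is chosen, so the step size follows from the bit budget rather than from the image quality that is wanted.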
However, the above single pass variable bit rate coding systems have a problem that a series of coding processes may result in deterioration of the image quality. This is so because in the single pass coding systems a bit count capable of being assigned or a target bit rate is computed before computation of the target bit count or quantization step size. This means that preference is given to bit rate control in a fixed time period, resulting in insufficient control of the quantization step size in dependence on the image scene. Consequently, the coded image quality may not substantially differ from that in the constant bit rate case.
In addition, in low complexity scenes, in which the image does not require high bit rate code generation, the generated bit count may be less than the average bit rate. In this case, the generated code bit count is sought to be increased by reducing the quantization step size in spite of the fact that sufficient image quality is already obtained. This may result in an excessive code assignment and wasteful code consumption. Consequently, it may become impossible to assign a sufficient bit count to an image which requires more bits in high complexity scenes.