1. Field of the Invention
The present invention relates to an image compression/encoding apparatus and an image compression/encoding method for compressing and encoding a video signal. More particularly, the present invention relates to how to control a quantization width by which the video signal is encoded, how to control a bit generation number, and the like.
2. Description of the Related Art
Known techniques for compressing and encoding a video signal include the international standard format for digitally compressed and encoded data described in ISO/IEC 13818-2 (commonly known as "MPEG2"), which also describes the corresponding decoding method. A typical example of a method for encoding a video signal into this format is described in "Test Model 3" of ISO-IEC/JTC/SC29/WG11 NO328.
According to the MPEG2 encoding, a video signal is processed by estimating and compensating for motion between pictures and by DCT (discrete cosine transform) encoding of the resulting estimation errors. When the transform coefficients of the DCT are quantized, the bit generation number increases or decreases depending upon the quantization width. In order to compress the video signal into a desired amount of data, the quantization width is controlled in accordance with the bit generation number consumed for the encoding process so as to adjust the data amount. "Test Model 3" of ISO-IEC/JTC/SC29/WG11 NO328 describes a method for determining the quantization width for the video signal such that the video signal can be reproduced by inputting the encoded image data to a decoding apparatus at a desired and fixed bit rate.
"Test Model 3" calculates a target bit number from a target bit rate for each GOP (Group Of Pictures) including a plurality of pictures, assigns the target bit number to each of the I, P and B pictures of the GOP, and encodes these pictures. Herein, an I picture is a picture created at the beginning of each GOP; a P picture is a picture which is created for every certain number of pictures; and a B picture is a picture which is created between the I picture and the P picture.
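The per-GOP allocation described above can be sketched as follows. This is only an illustration under stated assumptions, not the exact procedure of "Test Model 3": the function names, the 15-picture GOP pattern, and the relative weights for I, P and B pictures are all hypothetical, and the pictures are listed in a simplified order.

```python
def gop_target_bits(bit_rate, picture_rate, gop_size):
    """Bits available to one GOP when the stream runs at a fixed bit rate."""
    return bit_rate * gop_size / picture_rate


def assign_picture_targets(gop_bits, picture_types, weights):
    """Split gop_bits among the pictures in proportion to per-type weights."""
    total_weight = sum(weights[t] for t in picture_types)
    return [gop_bits * weights[t] / total_weight for t in picture_types]


# Hypothetical example values (assumptions, not taken from the source).
gop = list("IBBPBBPBBPBBPBB")              # 1 I, 4 P and 10 B pictures
weights = {"I": 8.0, "P": 3.0, "B": 1.0}   # assumed relative bit costs

gop_bits = gop_target_bits(6_000_000, 30, len(gop))
targets = assign_picture_targets(gop_bits, gop, weights)

for picture_type, target in zip(gop, targets):
    print(picture_type, round(target))
```

In this sketch the weights are fixed in advance. If the true proportion of information between the I, P and B pictures differs from the assumed weights, the per-picture targets no longer match what each picture actually needs.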
This method determines the target bit generation number for the current GOP through some adjustment based on the bit generation number consumed for encoding a past GOP, and assigns the target bit generation number to the pictures of the current GOP. Therefore, when the proportion of the information amount between the I, P and B pictures is considerably different from that of the past GOP, the number of bits assigned to the I, P and B pictures may be inappropriate. This proportion often changes two-fold or more even under ordinary conditions. If there is a scene change, even a B picture may require a number of bits which would normally be required by an I picture. Moreover, if there is a scene with a more complicated image, the amount of information itself may increase ten-fold. In such cases, the number of bits assigned to the picture becomes insufficient, whereby the quality of the reproduced image for the picture may deteriorate significantly.
Moreover, since the bit generation number is controlled for each GOP, when a scene change, or the like, occurs near the end of a GOP (i.e., in the last several pictures of the GOP), for example, the number of bits remaining to be assigned will fall far short of what is required. In such a case, the degree of difficulty of encoding the picture sharply increases, whereby the quality of the reproduced image for the picture will deteriorate significantly.
The method calculates the target bit generation number and controls the quantization width of the video signal based on a comparison between the target bit generation number and the actual bit generation number, so that the actual bit generation number is as close to the target bit generation number as possible. In doing so, the method assumes that the number of bits to be consumed for encoding each of the macroblocks included in one picture is constant throughout the picture.
However, the amount of information may vary significantly depending upon the relative position of a video image in a picture with respect to the background of the picture, whereby the bit assignment in that picture cannot be performed satisfactorily. For example, when a complicated pattern, or the like, exists in macroblocks in the latter part of the picture, most of the bits are assigned and consumed in the simple first part of the picture, thereby resulting in an unexpectedly large bit generation number in the latter part. Conversely, when the first part has a complicated pattern, the bit generation number is suppressed more than necessary in the first part, and more bits than necessary are assigned to the latter part. Thus, no bits may be available in the part of the picture where they are needed.
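The effect of assuming a constant bit cost per macroblock can be illustrated with a small simulation. It is a sketch under stated assumptions rather than the controller of "Test Model 3": the feedback rule (a virtual buffer fullness mapped linearly to a quantization width between 1 and 31), the model that a macroblock generates `complexity / width` bits, and every name and number below are hypothetical.

```python
def encode_picture(complexities, target_bits, reaction, initial_fullness):
    """Simulate macroblock-level feedback that budgets bits uniformly."""
    n = len(complexities)
    used = 0.0
    widths, bits = [], []
    for j, complexity in enumerate(complexities):
        # Excess of bits spent so far over a uniform per-macroblock budget.
        fullness = initial_fullness + used - target_bits * j / n
        # Map the fullness to a quantization width, clamped to 1..31.
        width = min(31.0, max(1.0, 31.0 * fullness / reaction))
        # Assumed model: generated bits are inversely proportional to width.
        spent = complexity / width
        widths.append(width)
        bits.append(spent)
        used += spent
    return widths, bits


# Simple first half, complicated second half (hypothetical complexities).
late_complex = [2_000.0] * 50 + [20_000.0] * 50
target_bits = 100_000.0
reaction = 20_000.0

widths, bits = encode_picture(
    late_complex, target_bits, reaction, reaction * 10 / 31
)

first_half = sum(widths[:50]) / 50
second_half = sum(widths[50:]) / 50
print("mean width, first half :", round(first_half, 2))
print("mean width, second half:", round(second_half, 2))
print("total bits vs. target  :", round(sum(bits)), "/", round(target_bits))
```

Under these assumptions the controller lowers the quantization width in the simple first half because the uniform budget appears underused there. When the complicated macroblocks arrive, it is forced to a much coarser width in the second half, and the picture still exceeds its target.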
Moreover, although this method is a control method which adjusts the actual bit generation number to the target bit generation number, it does not consider the fullness of the VBV buffer (virtual buffer) virtually provided in the decoding apparatus. Therefore, it is necessary to modify the bit distribution for the pictures by using a constraint from the VBV buffer. This has led to further deterioration in the quality of the reproduced image.
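The VBV constraint mentioned above can be pictured with a simplified buffer model. This is a minimal sketch under stated assumptions, not the normative VBV definition of ISO/IEC 13818-2: the buffer is refilled by a fixed amount per picture period, each picture is removed in one step, overflow is simply capped, and all names and numbers are hypothetical.

```python
def vbv_trace(picture_bits, bit_rate, picture_rate, buffer_size, initial_fullness):
    """Track decoder buffer fullness; report the first picture that underflows.

    Returns (trace, underflow_index); underflow_index is None when every
    picture was fully in the buffer at the moment it had to be decoded.
    """
    bits_per_period = bit_rate / picture_rate
    fullness = initial_fullness
    trace = []
    for index, bits in enumerate(picture_bits):
        if bits > fullness:
            # The picture has not completely arrived by its decode time.
            return trace, index
        fullness -= bits
        # Simplification: fullness is capped at the buffer size.
        fullness = min(buffer_size, fullness + bits_per_period)
        trace.append(fullness)
    return trace, None


# Hypothetical stream: 6 Mbit/s at 30 pictures/s, 1.8 Mbit buffer.
ok_trace, ok_underflow = vbv_trace(
    [800_000, 100_000, 100_000, 300_000], 6_000_000, 30, 1_800_000, 1_000_000
)
# A scene change makes the second picture as expensive as an I picture.
bad_trace, bad_underflow = vbv_trace(
    [800_000, 900_000], 6_000_000, 30, 1_800_000, 1_000_000
)
print(ok_trace, ok_underflow)
print(bad_trace, bad_underflow)
```

A rate controller that only compares target and actual bit counts has no view of this fullness, so a per-picture target that looks acceptable in isolation may still have to be reduced afterwards to avoid an underflow.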
In order to prevent the quality of the reproduced image from becoming unstable and deteriorating owing to such a variation in the degree of difficulty of encoding a video signal, it is necessary to keep the quantization width for the video signal as constant as possible. However, when the quantization width is fixed, the bit generation number simply increases in accordance with the degree of difficulty of encoding the images, whereby it is difficult to encode a video signal at the target bit rate.
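The trade-off can be made concrete with a short calculation. It is a sketch only: it assumes that the bits generated for a picture are inversely proportional to the quantization width (`complexity / width`), and the function name and all values are hypothetical.

```python
def bit_rate_at_fixed_width(complexities, width, picture_rate):
    """Average bit rate when every picture uses the same quantization width."""
    bits = [c / width for c in complexities]  # assumed bits-per-picture model
    return sum(bits) * picture_rate / len(bits)


easy_scene = [2_000_000.0] * 30    # hypothetical per-picture complexities
hard_scene = [20_000_000.0] * 30   # a ten-fold more complicated scene

rate_easy = bit_rate_at_fixed_width(easy_scene, 10.0, 30)
rate_hard = bit_rate_at_fixed_width(hard_scene, 10.0, 30)
print(round(rate_easy), round(rate_hard))
```

With the width held fixed, the resulting bit rate follows the scene complexity directly, so a ten-fold increase in complexity yields a ten-fold increase in bit rate under this model.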