Motion compensation is known to have a high coding efficiency and is often used in combination with orthogonal transforms, such as the discrete cosine transform (DCT), for encoding full-motion video. For example, CCITT H.261 ("Codec for Audiovisual Services at n×384 kbit/s," Rec. H.261, pgs. 120-128, 1988), CCITT Rec. G.702 ("Transmission of Component-Coded Digital Television Signals for Contribution-Quality Applications at Third Hierarchical Level of CCITT Recommendation G.702," CMTT 303-E, pgs. 1-18, Oct. 17, 1989), and the ISO Draft International Standard (DIS) 41172 ("Coding for Moving Pictures and Associated Audio," ISO/IEC JTC 1/SC 29 N 071, pgs. 2-A-18, Dec. 6, 1991) all employ a hybrid coding method in which the DCT is applied to motion-compensated differential signals. A DCT interframe coder representative of prior-art coders, as described in "Visual Telephony as an ISDN Application," Ming L. Liou, IEEE Communications Magazine, pgs. 30-38, February 1990, is shown in FIG. 1.
Video coders used in standards such as the CCITT H.261 and the ISO DIS 41172 are inherently variable-rate coders because the parameters in the coder are coded using variable length coding (VLC) 200. As shown in FIG. 1, the variable length coded signals are buffered at buffer 300. The buffer occupancy (the quantity of generated bits held in the buffer) is fed back via lead 301 to the rate controller 150. The rate controller 150 adjusts the quantization step size in quantizer 11 to avoid buffer overflow or underflow while maintaining a constant rate at the output 302 of the buffer 300.
A control method that maintains a constant coding rate by adjusting the quantizer step size based upon buffer occupancy is commonly used in the video coders referenced in various standards. For example, the Reference Model 8 ("Description of Ref. Model 8 (RM8)," CCITT SGXV Working Party XV/4, Document 525, pgs. 28-30, Jun. 9, 1989), as used in recommended operations in CCITT H.261, and the Simulation Model 3 (SM3) for the ISO DIS 41172 both employ this method of rate control. Under this rate control method, the quantizer step size is calculated by dividing the buffer occupancy by a predetermined value obtained empirically through experiments.
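The division described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure of RM8 or SM3: the function name, the integer division, and the clamping range of 1 to 31 are assumptions introduced here, and the default divisor of 2000 simply mirrors the SM3 figure cited later in this section.

```python
def step_size_from_buffer(occupancy_bits, divisor=2000, q_min=1, q_max=31):
    """Map buffer occupancy (in bits) to a quantizer step size.

    Hypothetical helper: the name, the clamp range [q_min, q_max], and the
    use of integer division are assumptions made for illustration only.
    """
    if occupancy_bits < 0:
        raise ValueError("buffer occupancy cannot be negative")
    # Core rule from the text: step size = occupancy / predetermined value.
    q = occupancy_bits // divisor
    # Assumed clamp so that the step size stays within a usable range.
    return max(q_min, min(q_max, q))


if __name__ == "__main__":
    for occupancy in (0, 3999, 20000, 100000):
        print(occupancy, step_size_from_buffer(occupancy))
```

Note that image content never enters the calculation: two slices with very different texture receive the same step size whenever the buffer occupancy is the same.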
The quantizer step size is transmitted as part of the output bit stream, appearing at output 302, several times every frame (for each group of blocks (GOB) in the H.261 method and for each slice in the ISO DIS 41172 method), and is used to control the coding rate. In the industry, a slice refers to one or more adjacent horizontal rows of macroblocks within a frame of video, and a GOB refers to several rows and columns of macroblocks within a frame of video. After each slice or GOB is processed, the buffer occupancy can be determined from the number of bits actually generated to encode the slice or GOB. The number of target bits is pre-assigned according to the interval between intracoded pictures and the type of picture, which can be inferred from the type of method used to encode the picture (such as intraframe coding, interframe prediction coding, and interframe interpolation prediction coding).
The aforementioned rate control method, which adjusts quantization step size based on buffer occupancy, has the following shortcomings:
1) Adjustment of the quantizer step size is based solely on the buffer occupancy and does not take the content (texture) of the image into account. As a result, the distribution of visual distortion over the image is not uniform and viewer perception of picture quality is detrimentally affected.
2) Transition from one quantizer step size to the next quantizer step size is not necessarily smooth because quantizer step size is directly controlled by the buffer occupancy calculated for the previous slice (or GOB). Thus, the number of bits generated to code a slice or GOB governs the quantization of the next slice or GOB.
3) Although both the ISO DIS 41172 and CCITT H.261 standards allow updating quantizer step sizes more frequently, if necessary, by using additional overhead information, the aforementioned rate control strategy does not satisfactorily handle boundary changes in image content occurring midway through a slice (or GOB). For example, a slice which includes the boundary between a clear blue sky (a flat, low texture image) and a flower garden (a high resolution, high texture image) cannot be accommodated with the aforementioned rate control method, because the method depends only on the buffer occupancy and does not take image content into account.
4) In the Simulation Model 3 (SM3), the quantizer step size is calculated by dividing the buffer occupancy by 2000. This value, however, is empirical; it was obtained through experiments and does not guarantee that the total bits generated will be sufficiently close to the number of target bits to prevent buffer overflow and underflow.
5) Lastly, in the Simulation Model 3 (SM3), which employs intraframe coding, an intracoded frame generates a large number of bits, so data in the first slice of the next frame is coded with a large quantizer step size regardless of the image content, causing pronounced coding distortion in the upper part of the image. If a slice contains both high-texture and low-texture patterns, the coding distortion in that slice is even more pronounced under this method. Since slices in the lower part of the image are coded after adjustments have been made in the quantizer step size, they appear to have much less distortion than slices in the upper part of the image. Images having such boundaries and such a difference in distortion between their upper and lower parts are perceived by viewers to have a worse overall picture quality than an evenly distorted image.
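The top-to-bottom imbalance described in shortcoming 5) can be illustrated with a toy simulation of the buffer feedback loop. This is a sketch under stated assumptions, not a model of SM3 itself: the function name, the rate model (bits per slice taken as a complexity value divided by the step size), the constant drain per slice, and the clamp range of 1 to 31 are all hypothetical. Only the division of buffer occupancy by 2000 comes from the text.

```python
def simulate_frame(start_occupancy, slice_complexities, drain_per_slice,
                   divisor=2000, q_min=1, q_max=31):
    """Return (step sizes per slice, final buffer occupancy) for one frame.

    Hypothetical toy model: each slice produces complexity // q bits, and
    the channel removes drain_per_slice bits from the buffer per slice.
    """
    occupancy = start_occupancy
    steps = []
    for complexity in slice_complexities:
        # Step size depends only on buffer occupancy, never on content.
        q = max(q_min, min(q_max, occupancy // divisor))
        steps.append(q)
        # Assumed rate model: a coarser step size yields fewer bits.
        bits = complexity // q
        # Buffer fills with the new bits and drains at a constant rate.
        occupancy = max(0, occupancy + bits - drain_per_slice)
    return steps, occupancy


if __name__ == "__main__":
    # Buffer left nearly full by a preceding intracoded frame (assumed value),
    # followed by six slices of identical content.
    steps, final_occupancy = simulate_frame(
        start_occupancy=40000,
        slice_complexities=[30000] * 6,
        drain_per_slice=6000,
    )
    print("step size per slice, top to bottom:", steps)
    print("final buffer occupancy:", final_occupancy)
```

Although every slice in this example has the same content, the step size is largest for the first slice and falls steadily toward the bottom of the frame, which corresponds to the uneven distortion between the upper and lower parts of the image described above.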