The present invention relates to data compression, and more particularly to a bit rate control mechanism for digital image and video data compression that estimates the number of bits required to represent a digital image or a video at a particular quality in compressed form or alternatively estimates the quality achievable for a digital image or a video when compressed to a given number of bits, which estimates are used to control the number of bits generated by a video compression system.
Visual information may be represented by digital pictures using a finite amount of digital data for still images, and by a finite data rate for time-varying images. Such data in its uncompressed form contains a considerable amount of superfluous information. Image compression techniques attempt to reduce the superfluous information by minimizing the statistical and subjective redundancies present in digital pictures. Pulse code modulation, predictive coding, transform coding, interpolative/extrapolative coding and motion compensation are some of the tools used in image compression techniques.
A digital video/image compression technique may be either lossy or lossless. Lossy compression techniques introduce an irreversible amount of distortion into the picture data. In these techniques a trade-off is made between the amount of distortion added to the original picture input and the number of bits the compressed picture occupies. A rate controller in a video/image compression system controls the number of bits generated by altering the amount of distortion added to the original input by the compression system. In other words, a rate controller in a video/image encoder controls the number of bits needed to represent the compressed image by changing the quality of the decompressed image.
Transform coding techniques take a block of samples as the input, transform this block into a number of transform coefficients, quantize the transform coefficients, and encode the quantized transform coefficients with variable or fixed length codes. The input to the transform coding system may be either the original picture elements (pixels), such as in JPEG and intra-MPEG, or the temporal differential pixels, such as in inter-MPEG. An adaptive still image coding technique using a transform coder with a rate controller is shown in FIG. 1. An input image block is transformed by a discrete cosine transform (DCT) function, quantized and variable length coded (VLC). The rate controller observes R(n-1), the number of bits generated by the previous block, and selects a quantizer scale factor Q(n) for the current block. A still image coding scheme, such as JPEG, may be used on a motion picture, as shown in the simplified block diagram of FIG. 2. In these schemes the rate controller observes R(n-1), the number of bits generated by the previous frame (field), and selects a quantizer scale factor Q(n) for the current frame (field). A simplified block diagram of an MPEG encoder is shown in FIG. 3, where R(n-1) is the typical number of bits generated in the previous macroblock. For JPEG, Q(n) is referred to as a factor or quality factor, and for MPEG it is referred to as mquant.
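The transform, quantize and encode chain described above may be illustrated by the following minimal Python sketch. It is not the coder of FIGS. 1-3: the 8x8 block size, the single uniform step size `base_step`, the synthetic test block, and the function `bit_proxy` (a crude stand-in for a real variable length coder) are all assumptions made for illustration. The sketch only shows the direction of the dependence, namely that a larger quantizer scale factor yields no more bits under this proxy.

```python
import math

def dct_1d(v):
    """Orthonormal DCT-II of a sequence of samples."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        out.append(s * math.sqrt((1.0 if k == 0 else 2.0) / n))
    return out

def dct_2d(block):
    """Separable 2-D DCT: transform the rows, then the columns."""
    n = len(block)
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d([rows[i][j] for i in range(n)]) for j in range(n)]
    # cols[j][i] holds vertical frequency i, horizontal frequency j.
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def quantize(coeffs, q_scale, base_step=16.0):
    """Uniform quantizer; q_scale plays the role of Q(n).
    base_step is a hypothetical flat quantizer matrix entry."""
    step = base_step * q_scale
    return [[round(c / step) for c in row] for row in coeffs]

def bit_proxy(levels):
    """Hypothetical bit count, NOT a real VLC: 1 bit per zero level,
    2*bit_length + 1 bits per nonzero level."""
    return sum(1 if l == 0 else 2 * abs(l).bit_length() + 1
               for row in levels for l in row)

# Synthetic 8x8 block of pixel values (illustrative data only).
block = [[(i * 16 + j * 7 + ((i * j) % 5) * 9) % 256 for j in range(8)]
         for i in range(8)]
coeffs = dct_2d(block)
for q in (1, 2, 4, 8):
    print("Q =", q, "R (proxy bits) =", bit_proxy(quantize(coeffs, q)))
```

A real coder would use a per-coefficient quantizer matrix and an entropy coder such as run-length plus Huffman coding; the proxy is used here only so that the example stays self-contained.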
In all of the schemes shown in FIGS. 1-3 Q(n) is used to scale the step sizes of the quantizers of transform coefficients (quantizer matrices). Increasing Q(n) reduces R(n) and vice versa. Q(n) is selected so that R(n), the number of bits generated with this quantizer scale factor Q(n), is close to the targeted rate for the block, frame or field. Q(n) also is an indication of the quality of the decoded block, frame or field. To perform efficiently, a rate control algorithm requires a good estimate of the rate-quality relationship for the input data, i.e., R(n) vs. Q(n). A good rate controller selects a Q(n) that results in the targeted R(n). The targeted R(n) for a block, frame or field could vary with n. For example, it might take into account the visual characteristics of the block in question and whether the coding is variable bit rate (VBR) or constant bit rate (CBR). A good rate controller also tries to keep Q(n) smooth over n so that the resulting quality of the decoded picture is smooth as well.
Given actual R(n-1), the actual bits generated for the preceding block number n-1, Chen et al., as described in "Scene Adaptive Coder" from IEEE Trans. Communications March 1984, compute Q(n) in the following manner. A buffer status B(n-1) after coding block n-1 is recursively computed using
$$B(n-1) = B(n-2) + R(n-1) - R$$
where R is the average coding rate in bits per block. From the buffer status B(n-1) the quality factor Q(n) is computed through
$$Q(n) = (1-\gamma)\,\phi(B(n-1)/B) + \gamma\,Q(n-1)$$
where $\phi(\cdot)$ is an empirically determined normalization factor versus buffer status curve and B is the rate buffer size in bits. This produces a smoothly varying Q(n) depending on $\gamma$, which is taken to be less than unity.
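The buffer recursion and the smoothed quality factor update described above may be sketched as follows. The two update functions follow the stated relations term by term. Everything else is an assumption for illustration: the function and parameter names, the curve `phi` (a clamped linear ramp standing in for the empirically determined curve, which is not given here), and the open-loop driver `simulate`, which feeds in a fixed list of per-block bit counts rather than re-encoding with each new quality factor.

```python
def update_buffer(b_prev, r_actual, r_avg):
    """B(n-1) = B(n-2) + R(n-1) - R, with R the average rate in bits per block."""
    return b_prev + r_actual - r_avg

def next_q(b, buffer_size, q_prev, gamma, phi):
    """Q(n) = (1 - gamma) * phi(B(n-1) / B) + gamma * Q(n-1)."""
    return (1.0 - gamma) * phi(b / buffer_size) + gamma * q_prev

def phi(fullness):
    """HYPOTHETICAL curve: a linear ramp from 1 to 31 over a clamped
    buffer fullness in [0, 1]. The actual curve is empirical."""
    return 1.0 + 30.0 * min(max(fullness, 0.0), 1.0)

def simulate(bits_per_block, r_avg, buffer_size, gamma, q0=1.0, b0=0.0):
    """Open-loop illustration: the bit counts are given in advance and do
    not react to Q(n), unlike in a real encoder."""
    b, q = b0, q0
    history = []
    for r in bits_per_block:
        b = update_buffer(b, r, r_avg)
        q = next_q(b, buffer_size, q, gamma, phi)
        history.append(q)
    return history

# Illustrative numbers only: a burst of expensive blocks fills the buffer.
for q in simulate([500, 900, 900, 900, 300, 300], r_avg=500,
                  buffer_size=4000, gamma=0.5):
    print(q)
```

With gamma equal to zero the quality factor tracks the curve directly; values of gamma closer to unity weight the previous quality factor more heavily and so smooth the sequence.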
Alternatively, the Test Model Editing Committee, International Organization for Standardization, Test Model 3 (Draft), December 1992 computes Q(n) in a similar way as follows. First the virtual buffer status B(n-1) is computed as above. Then Q(n) is computed through the linear relation
$$Q(n) = K_R\,B(n-1)$$
where $K_R$ is a constant that depends on the targeted average bit rate. This Q(n) may be further scaled based on the visual complexity of the block being coded.
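The linear relation between the virtual buffer status and the quantizer scale factor may be sketched as follows. The function names, the parameter `complexity_scale` (a placeholder for the optional scaling by visual complexity, whose form is not specified here), and the numeric values are assumptions for illustration and are not taken from the Test Model document.

```python
def update_virtual_buffer(b_prev, r_actual, r_avg):
    """Virtual buffer status: previous status plus actual bits minus average bits."""
    return b_prev + r_actual - r_avg

def linear_q(b, k_r, complexity_scale=1.0):
    """Q(n) = K_R * B(n-1), optionally scaled by a HYPOTHETICAL
    visual-complexity factor (1.0 means no scaling)."""
    return k_r * b * complexity_scale

# Illustrative numbers only.
b = 0.0
for r in (700, 900, 400):
    b = update_virtual_buffer(b, r, r_avg=500)
    print("B =", b, "Q =", linear_q(b, k_r=0.5))
```

Because the relation has no memory of the previous quantizer scale factor, any jump in the buffer status appears directly in the next quantizer scale factor.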
Using these techniques Q(n) could change rapidly, and there is no estimate of the quality achievable for a particular block, frame or field with a given number of bits. What is desired is a rate control mechanism that estimates the quality achievable for a digital image or video when compressed to a given number of bits or alternatively estimates the number of bits required to represent a digital image or video at a particular quality in a compressed form.