Standards bodies such as the Moving Picture Experts Group (MPEG) and the Joint Photographic Experts Group (JPEG) specify general methodologies and syntax for generating standard-compliant files and bit streams. Generally, such bodies do not define the specific algorithm needed to produce a valid bit stream, affording encoder designers great flexibility in developing and implementing their own algorithms in areas such as image pre-processing, motion estimation, coding mode decisions, scalability, and rate control. This flexibility fosters the development and implementation of different algorithms, thereby resulting in product differentiation in the marketplace. However, a common goal of encoder designers is to minimize subjective distortion for a prescribed bit rate and operating delay constraint.
In the area of bit-rate control, MPEG and JPEG likewise do not define a specific algorithm for controlling the bit rate of an encoder. It is the task of the encoder designer to devise a rate control process that controls the bit rate such that the decoder input buffer neither overflows nor underflows. A fixed-rate channel is assumed to carry bits at a constant rate to an input buffer within the decoder. At regular intervals determined by the picture rate, the decoder instantaneously removes all the bits for the next picture from its input buffer. If there are too few bits in the input buffer, i.e., not all the bits for the next picture have been received, then the input buffer underflows, resulting in an underflow error. Similarly, if there are too many bits in the input buffer, i.e., the capacity of the input buffer is exceeded between picture starts, then the input buffer overflows, resulting in an overflow error. Thus, it is the task of the encoder to monitor the number of bits it generates so as to prevent the overflow and underflow conditions.
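The buffer behavior described above can be illustrated with a minimal sketch. The function name, its parameters, and the choice to clamp the buffer after an error are assumptions made for illustration only; they are not taken from the MPEG or JPEG specifications, which define their own buffer models.

```python
def simulate_decoder_buffer(picture_bits, channel_bits_per_picture,
                            buffer_size, initial_fullness):
    """Simplified decoder input buffer model (illustrative assumption).

    picture_bits: coded size of each picture, in bits, in decode order.
    channel_bits_per_picture: bits delivered by the fixed-rate channel
        during one picture interval.
    Returns a list of (picture_index, "underflow" | "overflow") events.
    """
    events = []
    fullness = initial_fullness
    for index, bits in enumerate(picture_bits):
        # At the picture start the decoder instantaneously removes
        # all bits of the next picture.
        if bits > fullness:
            # Not all bits of this picture have arrived yet.
            events.append((index, "underflow"))
            fullness = 0  # simplification: drain whatever is there
        else:
            fullness -= bits
        # Until the next picture start the channel keeps delivering bits.
        fullness += channel_bits_per_picture
        if fullness > buffer_size:
            events.append((index, "overflow"))
            fullness = buffer_size  # simplification: excess bits are lost
    return events


if __name__ == "__main__":
    # One oversized picture drains more bits than have arrived.
    print(simulate_decoder_buffer([100, 500], 100, 1000, 200))
```

An encoder-side rate control process could run a model of this kind alongside encoding and adjust its output before either event occurs.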
One common method for bit-rate control in MPEG and JPEG encoders, which employ the Discrete Cosine Transform (DCT), involves modifying the quantization step. However, it is well known that modifying the quantization step affects the distortion of the input video image. Distortion of the lower-frequency DCT coefficients causes “blockiness,” while distortion of the higher-frequency DCT coefficients causes blurriness. It is well known that the Human Visual System (HVS) tolerates greater distortion in higher-frequency DCT components than in lower-frequency components. This is because, generally speaking, most image content is in the low-frequency range, owing to the high correlation between adjacent pixels. Unfortunately, known MPEG and JPEG encoders that attempt to control bit rate by modifying the quantization step do not distribute the distortion between low- and high-frequency coefficients in a way that is optimal for the HVS. For example, uniform quantizers spread distortion uniformly across low- and high-frequency components. This is not optimal for the HVS, which tolerates more distortion among high-frequency components than among low-frequency components. By contrast, quantization matrices cause more distortion among high-frequency components than among low-frequency components, which better suits the HVS. However, quantization matrices operate on a per-coefficient basis (i.e., as a point process) and therefore provide only a rough HVS optimization.
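The difference between a uniform quantizer and a quantization matrix can be sketched as follows. The frequency-weighted step sizes produced by `weighted_steps` are a hypothetical choice made for illustration and are not the tables of any standard; all function names are likewise assumptions.

```python
def uniform_steps(step, n=8):
    """Same quantization step for every DCT coefficient."""
    return [[step] * n for _ in range(n)]


def weighted_steps(base, n=8):
    """Hypothetical matrix: the step grows with frequency index u + v."""
    return [[base * (1 + u + v) for v in range(n)] for u in range(n)]


def quantize(coeffs, steps):
    """Map each coefficient to an integer level, one coefficient at a time."""
    return [[round(c / s) for c, s in zip(c_row, s_row)]
            for c_row, s_row in zip(coeffs, steps)]


def dequantize(levels, steps):
    """Reconstruct coefficient values from quantized levels."""
    return [[q * s for q, s in zip(q_row, s_row)]
            for q_row, s_row in zip(levels, steps)]


def abs_error(coeffs, steps):
    """Absolute reconstruction error per coefficient."""
    recon = dequantize(quantize(coeffs, steps), steps)
    return [[abs(c - r) for c, r in zip(c_row, r_row)]
            for c_row, r_row in zip(coeffs, recon)]


if __name__ == "__main__":
    block = [[20] * 8 for _ in range(8)]  # toy block, every coefficient 20
    err_u = abs_error(block, uniform_steps(16))
    err_w = abs_error(block, weighted_steps(16))
    print("uniform :", err_u[0][0], err_u[7][7])
    print("weighted:", err_w[0][0], err_w[7][7])
```

With the uniform steps, the error is the same at the lowest and highest frequencies; with the weighted steps, the error at the highest frequency is larger. Each coefficient is still treated independently, which is the per-coefficient limitation noted above.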
In MPEG and JPEG processing, DCT coefficients are ordered in a “ZigZag” scan and numbered 0-63 in ascending order. Both uniform quantizers and quantization matrices attempt to create sequences of successive zeroes at the end of the scan, since the longer the zero sequence at the end of the “ZigZag” scan order, the fewer variable length coding bits are needed for coding the block. However, neither uniform quantizers nor quantization matrices ensure the creation of sequences of successive zeroes in a deterministic way.
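The scan order and the trailing zero run discussed above can be sketched as follows. The function names are assumptions for illustration; the ordering rule traverses the anti-diagonals of the block in alternating directions, which yields the conventional zigzag sequence for an 8x8 block.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zigzag scan order."""
    def key(pos):
        r, c = pos
        s = r + c  # index of the anti-diagonal
        # Odd diagonals run top-right to bottom-left (row ascending),
        # even diagonals run bottom-left to top-right (row descending).
        return (s, r if s % 2 else -r)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def zigzag_scan(block):
    """Flatten a square block of quantized levels into scan order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def trailing_zero_run(scan):
    """Count the successive zeroes at the end of the scan."""
    count = 0
    for value in reversed(scan):
        if value != 0:
            break
        count += 1
    return count


if __name__ == "__main__":
    levels = [[0] * 8 for _ in range(8)]
    levels[0][0], levels[0][1], levels[1][0] = 12, -3, 2
    print(trailing_zero_run(zigzag_scan(levels)))  # long zero run
    levels[7][7] = 1  # one surviving high-frequency level
    print(trailing_zero_run(zigzag_scan(levels)))  # zero run is gone
```

The second case shows why the zero run is not deterministic under quantization alone: a single non-zero level late in the scan removes it entirely.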
Another method for controlling the bit rate involves discarding high-frequency DCT coefficients and transmitting only low-frequency DCT coefficients. This method is applied during rate control only when the output bit rate is higher than the target bit rate. Doing so produces visible artifacts, such as a strong “blurriness effect,” in the decoded video image, which human viewers generally find unacceptable. Avoiding this type of artifact requires that some blocks within a picture be coded more accurately than others. In particular, blocks with less activity require fewer bits than blocks with high activity.
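A minimal sketch of this coefficient-discarding approach is given below. It assumes the block has already been arranged in scan order as a list of 64 quantized levels; the function names and the `keep` parameter are hypothetical and chosen only for illustration.

```python
def truncate_high_coefficients(scan, keep):
    """Keep the first `keep` coefficients of the scan and zero the rest."""
    if not 0 <= keep <= len(scan):
        raise ValueError("keep must lie between 0 and the scan length")
    return list(scan[:keep]) + [0] * (len(scan) - keep)


def rate_controlled_truncation(scan, output_bit_rate, target_bit_rate, keep):
    """Discard high-frequency coefficients only when over the target rate."""
    if output_bit_rate > target_bit_rate:
        return truncate_high_coefficients(scan, keep)
    return list(scan)  # at or under target: leave the block untouched


if __name__ == "__main__":
    scan = list(range(1, 65))  # toy scan with no zero levels at all
    over = rate_controlled_truncation(scan, 5_000_000, 4_000_000, keep=10)
    print(over[:12])
```

Because the same `keep` value is applied regardless of block content in this sketch, high-activity blocks lose detail indiscriminately, which is consistent with the blurriness described above.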