The present invention relates to the compression of digital video data, and more particularly to a method and apparatus for adaptively compressing blocks of video image data according to a local coding level. The method and apparatus of the present invention are applicable to any video compression system where the quality level of reconstructed video is periodically adjusted in order to maintain a constant compression ratio.
Data compression is required when a video signal exceeds the data capacity of a communication channel or when it is desired to increase the number of services available on a channel. In such cases, the combined data rate of all services on a channel must be matched to the throughput limit of the channel. This can be done by establishing a coding level that is observed and used by each video encoder to adjust the quality level of the reconstructed video. Since the data rate increases as the image quality is raised and decreases as the image quality is reduced, it is possible to match the data rate with the capacity of the channel by periodically adjusting the coding level as a function of the observed data rate. Short term fluctuations in the instantaneous data rate can be evened out by buffering the compressed signals before and after transmission. In a system where multiple video services use the same channel, the coding level can be shared by observing the combined data rate of all services. Alternatively, each service can maintain an independent coding level by observing its own data rate only.
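The feedback described above can be illustrated with a minimal sketch. It is not taken from any particular encoder: the function names, the integer level range of 0 to 15, the convention that a higher coding level means higher quality and a higher data rate, and the toy rate model are all assumptions made for illustration only.

```python
def adjust_coding_level(level, observed_rate, channel_rate,
                        step=1, min_level=0, max_level=15):
    """Move the coding level one step toward the channel capacity.

    Assumed convention: a higher level means higher quality and a
    higher data rate, so the level is lowered when the observed rate
    exceeds the channel rate and raised when it falls below it.
    """
    if observed_rate > channel_rate:
        level -= step
    elif observed_rate < channel_rate:
        level += step
    # Keep the level inside the assumed valid range.
    return max(min_level, min(max_level, level))


def simulate(frame_complexities, channel_rate, level=8):
    """Run the periodic adjustment over a sequence of frames.

    The rate model (complexity * (level + 1)) is a hypothetical
    stand-in for a real encoder's output rate.
    """
    levels = []
    for complexity in frame_complexities:
        observed_rate = complexity * (level + 1)
        level = adjust_coding_level(level, observed_rate, channel_rate)
        levels.append(level)
    return levels
```

Under these assumptions, a shared coding level for several services would pass the sum of their observed rates as `observed_rate`, whereas an independent coding level would pass only the rate of the service in question.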
Various digital compression systems are known in the art for reducing the amount of data needed to adequately represent a sequence of video images. An example of such a system is provided in Paik, "DigiCipher--All Digital, Channel Compatible, HDTV Broadcast System," IEEE Transactions on Broadcasting, Vol. 36, No. 4, December 1990, incorporated herein by reference. In the system described in the aforementioned article, a highly efficient compression algorithm based on discrete cosine transform (DCT) coding is used. Motion compensation is also provided to further enhance the image compression.
The use of such systems makes it possible to achieve compression ratios in excess of 100:1. Most of the video compression algorithms used in these systems take advantage of statistical properties of the image. Occasionally, certain sequences of images will be encountered where these statistical properties do not apply. In such cases, a constant compression ratio cannot be maintained without visibly impairing the resulting image. In general, the variation in picture quality increases as compression systems become more powerful and more sophisticated. Usually, it is only the average compression ratio that is improved by such systems.
For a wide range of compression algorithms, the most effective way to control the video quality as a function of the coding level is to vary the precision of the quantizers that are applied to the video data. For example, if the coding level specifies that the data rate is to be reduced, then the coarseness of the quantizers may be increased. Similarly, if the data rate is to be increased, then finer quantizers may be used. Other methods of varying the picture quality in response to the established coding level include changing one or more of the video frame rate, spatial sampling rate, and image block size. In the case where differential coding is used (e.g., in connection with motion compensation), the refresh rate may also be varied.
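The relationship between the coding level and quantizer precision can be sketched as follows. The linear mapping from level to step size, the level range of 0 to 15, and the function names are hypothetical choices made for illustration; an actual system may use any monotonic mapping.

```python
def quantizer_step(coding_level, max_level=15, finest_step=1):
    """Map a coding level to a quantizer step size.

    Assumed convention: the highest level selects the finest
    quantizer, and each lower level makes the step one unit coarser.
    """
    return finest_step * (max_level - coding_level + 1)


def quantize(value, step):
    """Return the quantizer index for a sample value."""
    return round(value / step)


def dequantize(index, step):
    """Reconstruct the sample value from its quantizer index."""
    return index * step


def reconstruct(value, coding_level):
    """Quantize and reconstruct a value at the given coding level."""
    step = quantizer_step(coding_level)
    return dequantize(quantize(value, step), step)
```

In this sketch, a lower coding level yields a larger step, fewer distinct quantizer indices to transmit, and a larger reconstruction error, which is the trade between data rate and quality described above.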
Such systems are effective in matching the data rate of the compressed video to the data rate of the communications channel. However, these systems do not evenly distribute the available data capacity throughout the image. Typically, the more complex or detailed regions will consume the most bits and the less complex regions will consume the least. Such compensation is generally easy to implement in a transform coding system. In particular, most frequently used video transforms produce transform coefficients with amplitudes that are representative of the energy in a particular band of the frequency spectrum. Therefore, errors can be introduced into selected frequency bands by coarsely quantizing the corresponding transform coefficients. In general, the high frequency coefficients are quantized more coarsely than the low frequency coefficients. This technique is used in the system described in the aforementioned article to Paik.
Ideally, the quantizer precision would be optimized for each frequency band represented in a block of transform coefficients corresponding to a region of an image. In the case where all of the artifacts in an image area are attributable to coarse quantization of transform coefficients, such optimization can be done by experimentally or theoretically determining the ideal allocation of weighting factors for each transform coefficient. As the coding level is varied, the weighting factors can be increased or decreased proportionately. Alternatively, more sophisticated schemes can be used in order to better maintain an optimum distribution of errors as the coding level is varied. The intent is to set the coding level so as to achieve the desired compression ratio while, at the same time, maintaining an ideal distribution of errors.
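The proportional scaling of per-coefficient weighting factors can be illustrated with a short sketch. The weight table below, in which the weight grows with the sum of the horizontal and vertical frequency indices, is a hypothetical example and not the allocation used in any cited system; the function names and the single scale parameter standing in for the coding level are likewise assumptions.

```python
def base_weights(n=8):
    """Build a hypothetical n-by-n table of weighting factors.

    The weight grows with the horizontal index u and vertical index v,
    so higher frequency coefficients are quantized more coarsely.
    """
    return [[1 + u + v for u in range(n)] for v in range(n)]


def scaled_weights(weights, scale):
    """Scale every weighting factor by the same factor.

    The scale stands in for the coding level: a larger scale means
    coarser quantization in every frequency band, in proportion.
    """
    return [[w * scale for w in row] for row in weights]


def quantize_block(coeffs, weights):
    """Divide each transform coefficient by its weight and round.

    Python's round() rounds exact halves to the nearest even integer.
    """
    return [[round(c / w) for c, w in zip(crow, wrow)]
            for crow, wrow in zip(coeffs, weights)]
```

Because every weight is multiplied by the same factor, the relative distribution of quantization error among the frequency bands is preserved while the overall data rate changes, which corresponds to the proportional scheme described above rather than to the more sophisticated alternatives.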
Although the described method of distributing errors throughout the different frequency bands of an image has been found to be generally satisfactory, the present invention overcomes a significant problem with the prior art schemes. In particular, it is sometimes impossible to detect the occurrence of errors when an image area is sufficiently random. The reason is that there is so little structure in such areas that a viewer is not certain how the feature is supposed to appear. Random noise is an extreme example. Features with little structure consume a very high percentage of the available signal transmission bandwidth, leaving little bandwidth for other image areas that have more typical video characteristics. The subjective appearance of these other image areas will often be unacceptable, if the coding level is set to maintain the targeted data rate. This occurs because the high data rate required to transmit data for the random areas requires the use of a much lower data rate for the more typical video areas, if the targeted rate is to be achieved.
It would be advantageous to provide a system for adaptively compressing blocks of video image data wherein the coding level is not adversely affected by the occurrence of unstructured regions within an image area. It would be further advantageous to provide such a system that can be efficiently and economically implemented in a digital video compression system. Such a system should maintain a targeted data rate for the compressed video data, without unacceptably reducing the data rate for any portion of an image area.
The present invention provides a system for adaptively compressing blocks of video image data which enjoys the aforementioned advantages.