1. Field of the Invention
The invention relates to data compression and, in particular, to schemes for improving the efficiency of compression processes that attempt to select optimal quantizers.
2. Background
Storing and transmitting large amounts of data is a perennial problem. For example, the amount of data in a digitized video is tremendous and would choke the storage and transmission capacities of most devices if there weren't practical ways to throw away redundant and unimportant information. This is the business of data compression; one simply doesn't bother to store or transmit the information about the data stream that can be predicted fairly accurately. If prediction is 100% accurate, then the compression scheme is called “lossless.” If it is less than 100% accurate, it is called “lossy.”
Compression of data streams is also used in other environments. For example, images can be compressed in a lossy way. One well-known type of image compression, Joint Photographic Experts Group (JPEG), takes advantage of correlation between neighboring pixels to predict their values and thereby reduce the quantity of data that must be stored or transmitted for a given quality level. There are many other lossy image compression schemes.
Lossy schemes, by definition, cannot generate perfect distortionless representations of their antecedents. In many compression schemes, it is possible to pre-select an allowed maximum distortion level and store or transmit the minimum amount of data (“bit-rate”) required to provide it. Alternatively, the bit-rate can be pre-selected and the distortion minimized for the specified “bit-budget.”
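The two alternatives just described can be stated compactly. The following is a sketch whose symbols are assumptions of this illustration rather than notation from the specification: q denotes a choice of quantizers, R(q) the resulting bit-rate, D(q) the resulting distortion, D_max the pre-selected maximum distortion, and R_budget the pre-selected bit-budget.

```latex
\begin{align*}
\text{(distortion pre-selected)}\quad & \min_{q} \; R(q) \quad \text{subject to} \quad D(q) \le D_{\max} \\
\text{(bit-rate pre-selected)}\quad & \min_{q} \; D(q) \quad \text{subject to} \quad R(q) \le R_{\text{budget}}
\end{align*}
```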
Distortion is a concept that depends on the type of data compression and can involve subjective criteria. For example, the degree of distortion suffered in a given compression/decompression cycle depends, in part, on aspects of the human visual system: humans do not see color at the same level of detail as luminance (the relative lightness and darkness of portions of an image). As a result, an optimal approach to compression distorts a data stream so that the data contributing least to perceived quality are thrown out before the data contributing more. (Obviously, data that contribute nothing to image quality, i.e., redundant data, would be at the top of this list.) This approach to compression defines an “optimization problem” called “optimal bit allocation” for “rate-distortion compression.” Besides the subjective notions of what defines optimality, there are other, more concrete aspects. For example, one could choose to minimize the average distortion, sacrificing some portions of the data stream to enhance others to achieve an overall optimum. Alternatively, one could minimize the maximum distortion over all specified portions of the data stream. What defines optimality is thus a complex and evolving concept, and the term is intended, in the instant specification, to refer to any specific metric.
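The difference between minimizing average distortion and minimizing maximum distortion can be seen in a small numerical sketch. Everything in it is hypothetical: the two portions, their (bits, distortion) operating points, the bit-budget, and the function names are invented for illustration and do not come from any real coder. The search is exhaustive only because the example is tiny.

```python
from itertools import product

# Hypothetical operating points for two portions of a data stream.
# Each entry is (bits, distortion) for one candidate quantizer; the numbers
# are invented for illustration only.
PORTIONS = [
    [(1, 6.0), (2, 5.0), (3, 4.8)],
    [(1, 12.0), (2, 5.5), (3, 1.0)],
]
BIT_BUDGET = 4


def feasible_allocations(portions, budget):
    """Yield every combination of quantizers whose total bits fit the budget."""
    for combo in product(*portions):
        if sum(bits for bits, _ in combo) <= budget:
            yield combo


def average_distortion(combo):
    """Mean distortion over all portions."""
    return sum(d for _, d in combo) / len(combo)


def maximum_distortion(combo):
    """Worst distortion suffered by any single portion."""
    return max(d for _, d in combo)


best_average = min(feasible_allocations(PORTIONS, BIT_BUDGET), key=average_distortion)
best_minimax = min(feasible_allocations(PORTIONS, BIT_BUDGET), key=maximum_distortion)

print("min-average allocation:", best_average)
print("min-maximum allocation:", best_minimax)
```

With these invented numbers the two criteria select different allocations: the average criterion starves the first portion to favor the second, while the maximum criterion spreads the bits evenly.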
It has long been known that video data is a superb candidate for compression with little perceptual distortion. The technical way of saying this is that the raw data has very low entropy; that is, any given portion in time and/or space is predictable with high probability from other portions in time and space. For example, a first frame tends to look a great deal like the next frame, even in a fast-paced video sequence.
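One way to see the predictability described above is to compare the empirical entropy of raw samples with that of the errors left after a simple prediction. The sketch below is illustrative only: the ramp signal, the previous-sample predictor, and the function name `empirical_entropy` are assumptions made here, not part of any particular coder. Note that it measures zeroth-order (symbol frequency) entropy, which ignores the dependence between neighboring samples; the low entropy referred to above becomes visible once that dependence is removed by prediction.

```python
import math
from collections import Counter


def empirical_entropy(symbols):
    """Zeroth-order empirical entropy, in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical "signal": a smooth ramp standing in for slowly changing data.
samples = list(range(64))

# Predict each sample from its predecessor and keep only the prediction error.
residuals = [cur - prev for prev, cur in zip(samples, samples[1:])]

raw_bits = empirical_entropy(samples)
residual_bits = empirical_entropy(residuals)
print(f"raw: {raw_bits:.2f} bits/sample, residual: {residual_bits:.2f} bits/sample")
```

Every raw sample value is distinct, so symbol frequencies alone suggest the signal is expensive to code, yet every prediction error is identical and therefore carries essentially no information.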
A great variety of different schemes have been created for compactly defining video. For example, a video frame can be defined in terms of where each of the segments of the image moved from the previous frame plus a “difference” frame that contains only the details lost by reconstructing the frame from just the motion data. Compression schemes that use this technique are called Motion Compensated Video Coders (MCVC). The combined data stream may be highly compact because much of the change in successive frames of a video can be characterized by gradually shifting fields of color and luminance.
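The idea of coding a frame as motion plus a “difference” frame can be sketched in one dimension. This is a toy under stated assumptions, not an MCVC implementation: the frames are short lists, a single shift stands in for the motion data of one segment, the match criterion is the sum of absolute differences, and the names `predict` and `best_shift` are hypothetical.

```python
def predict(previous, shift):
    """Build a prediction of the current frame by shifting the previous one.

    Positions with no source sample reuse the nearest edge sample.
    """
    last = len(previous) - 1
    return [previous[min(max(i - shift, 0), last)] for i in range(len(previous))]


def best_shift(previous, current, search_range):
    """Return the shift with the smallest sum of absolute differences (SAD)."""
    def sad(shift):
        return sum(abs(c - p) for c, p in zip(current, predict(previous, shift)))
    return min(range(-search_range, search_range + 1), key=sad)


# Hypothetical 1-D "frames": the bright feature moves two samples to the right
# and one sample changes slightly.
previous_frame = [0, 0, 5, 9, 5, 0, 0, 0, 0, 0]
current_frame = [0, 0, 0, 0, 5, 9, 6, 0, 0, 0]

motion = best_shift(previous_frame, current_frame, search_range=3)
prediction = predict(previous_frame, motion)

# The difference frame holds only what the motion data failed to capture.
difference = [c - p for c, p in zip(current_frame, prediction)]

# A decoder holding the previous frame rebuilds the current one from both parts.
reconstructed = [p + d for p, d in zip(prediction, difference)]

print("motion:", motion)
print("difference frame:", difference)
```

Because most of the difference frame is zero, the pair (motion, difference) can be represented far more compactly than the raw frame.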
Many of the different schemes for compressing video data may be used in concert. There are also many methods for compressing audio, still image, and other kinds of data. The optimization problem can be complicated for so-called predictive schemes, in which information is gathered from portions of the signal that are adjacent in either time or space. Inherent in any compression scheme is the substitution of raw data by some symbol that represents that data. For example, when a single datum is converted to digital data, the transmitter (or storage device) must resolve the tradeoff between precision and waste: more precision requires more data. This is a simple illustrative choice between different “quantizers.” Modern data compression problems involve much more complex choices. For example, in MCVCs, which represent a video stream by transmitting the movement of a portion of the video frame, the choice of quantizers may involve selecting the size of the portion that can be effectively represented by motion vectors. The choice of how to divide up one frame into moving portions affects the prediction value for the next successive frame. The result is a complex planning problem: over a temporal or spatial succession of choices of how to translate raw data into symbols, the choice of quantizers for one portion of the data stream affects the distortion (or bit-rate) for other portions of the data stream. That is, the quantizers are dependent.
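The single-datum tradeoff between precision and waste can be made concrete with a uniform scalar quantizer. The sketch below assumes a datum in the range [0.0, 1.0) and midpoint reconstruction; the sample value and the function names `quantize` and `reconstruct` are hypothetical choices for illustration.

```python
def quantize(value, bits):
    """Map a value in [0.0, 1.0) to one of 2**bits integer symbols."""
    levels = 2 ** bits
    return min(int(value * levels), levels - 1)


def reconstruct(symbol, bits):
    """Return the midpoint of the interval that the symbol represents."""
    levels = 2 ** bits
    return (symbol + 0.5) / levels


datum = 0.3141  # hypothetical sample value
for bits in (2, 4, 8):
    symbol = quantize(datum, bits)
    error = abs(datum - reconstruct(symbol, bits))
    # The error can never exceed half of one quantization step.
    print(f"{bits} bits -> symbol {symbol}, error {error:.5f}, bound {0.5 / 2 ** bits:.5f}")
```

Each choice of `bits` is a different quantizer: spending more bits on the datum tightens the bound on the reconstruction error, and spending fewer loosens it.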
In some video compression schemes, the use of a tool called dynamic programming (DP) has been proposed to solve the optimization problem posed by such compression schemes. While DP is robust (i.e., it always works), it is computationally intensive. The result is that either the quality of the optimization must suffer or the cost of computing the optimum must be high. Thus, there is a need in the prior art for ways of addressing the optimization problem posed by rate-distortion compression schemes with lower computational overhead.
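To illustrate how dynamic programming applies when quantizers are dependent, the following is a minimal, generic sketch and not the method of any particular prior-art coder. It assumes that the cost of coding each portion depends only on its own quantizer and on the quantizer chosen for the immediately preceding portion. The cost function `stage_cost`, the weight `LAMBDA`, and all numbers are invented for illustration. An exhaustive search is included only to check the DP result.

```python
from itertools import product

NUM_STAGES = 5          # portions of the data stream, in coding order
QUANTIZERS = (0, 1, 2)  # hypothetical quantizer indices; higher means finer
LAMBDA = 1.0            # hypothetical weight trading bits against distortion


def stage_cost(stage, previous_q, q):
    """Invented cost of coding one portion: distortion + LAMBDA * bits.

    The distortion term depends on the previous portion's quantizer, which is
    what makes the quantizers dependent. Not taken from any real coder.
    """
    bits = 2 + 3 * q
    base = 12.0 / (1 + q) + stage % 2
    penalty = 0.0 if previous_q is None else 2.0 * (2 - previous_q)
    return base + penalty + LAMBDA * bits


def dp_search():
    """Dynamic programming: keep only the cheapest path ending in each quantizer."""
    best = {q: (stage_cost(0, None, q), (q,)) for q in QUANTIZERS}
    for stage in range(1, NUM_STAGES):
        best = {
            q: min(
                ((cost + stage_cost(stage, prev, q), path + (q,))
                 for prev, (cost, path) in best.items()),
                key=lambda item: item[0],
            )
            for q in QUANTIZERS
        }
    return min(best.values(), key=lambda item: item[0])


def exhaustive_search():
    """Reference answer: evaluate every possible sequence of quantizers."""
    def total(path):
        return sum(stage_cost(i, path[i - 1] if i else None, q)
                   for i, q in enumerate(path))
    return min(((total(p), p) for p in product(QUANTIZERS, repeat=NUM_STAGES)),
               key=lambda item: item[0])


dp_cost, dp_path = dp_search()
ref_cost, ref_path = exhaustive_search()
print("DP:        ", dp_cost, dp_path)
print("exhaustive:", ref_cost, ref_path)
```

In this sketch the DP evaluates on the order of NUM_STAGES × len(QUANTIZERS)² transitions, whereas the exhaustive search evaluates len(QUANTIZERS)^NUM_STAGES sequences. Even so, the DP work grows quickly as the number of quantizers per portion, or the number of earlier portions on which each cost depends, increases.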