Improving digital imaging technology allows for increasingly high resolution and color variation in digital images. As image quality increases, however, resulting image data files increase geometrically in size. Image compression technologies strive to reduce the storage required to hold image data and the bandwidth needed to transmit image data.
Image compression technologies seek to balance competing interests. On one hand, it is desirable to compress the size of a data file as much as possible so that the compressed file will consume the least amount of storage or bandwidth. On the other hand, the more a data file is compressed, the more computing resources and time are consumed in compressing the file.
FIG. 1 shows a functional block diagram of a representative encoder 100 and decoder 140 pair used to compress and decompress source data 102, respectively. For the sake of example, the source data 102 includes image or video data. The encoder 100 receives the source data 102. In one embodiment, the encoder 100 first presents the source data 102 to a preprocessor 104. The preprocessor 104 separates the source data 102 into luminance (grayscale) and chrominance (color) components.
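The separation performed by the preprocessor 104 can be illustrated with a minimal sketch. The description does not name a particular color space, so the example assumes the full-range YCbCr conversion used by JFIF (BT.601 weights); the function name `rgb_to_ycbcr` is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Split one 8-bit RGB pixel into a grayscale component (Y) and two
    color components (Cb, Cr). Assumes JFIF full-range BT.601 weights."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


# A neutral gray pixel carries no color information, so both color
# components sit at the midpoint value of 128.
gray = rgb_to_ycbcr(128, 128, 128)
red = rgb_to_ycbcr(255, 0, 0)
```

Separating the components in this way lets later stages treat the grayscale and color information differently, although the description above does not state how the two are handled downstream.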
The output of the preprocessor 104 is presented to a transformer 106 that performs frequency transformation on the output of preprocessor 104. The transformer 106 may perform discrete wavelet transformation (DWT), discrete cosine transformation (DCT), fast Fourier transformation (FFT), or another similar frequency domain transformation on the preprocessed data. Individual data values vary less from neighboring values in transformed, frequency domain data, as compared to the spatial domain data.
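As one illustration of the frequency transformation, the following sketch computes a one-dimensional orthonormal DCT directly from its definition. It is a naive O(n^2) form chosen for readability, not a statement of how the transformer 106 is implemented; the function name `dct_ii` is hypothetical.

```python
import math


def dct_ii(x):
    """Orthonormal 1-D DCT-II of a list of samples (naive O(n^2) form)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        # Scale factors that make the transform orthonormal.
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out


# A flat block of samples collapses to a single nonzero coefficient,
# which is what makes the transformed data easier to aggregate.
flat = dct_ii([10] * 8)
ramp = dct_ii([1, 2, 3, 4, 5, 6, 7, 8])
```

For the flat input, every coefficient after the first is zero up to floating-point rounding, so the eight samples are effectively described by one value.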
Taking advantage of the less variant data values in the frequency domain data, the quantizer 108 identifies and aggregates data values having identical values, replacing a repeating series of identical data values with one instance of the data value combined with an indication of how many times the identical data value repeats. Similarly, the quantizer 108 may replace a series of similar but not identical data values with a single repeated value when the data values fall within a particular tolerance of one another. Aggregating similar but not identical data values is used in lossy compression, where some degradation of the original image is acceptable.
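The two aggregation steps described above can be sketched as follows. The example assumes uniform scalar quantization with a hypothetical `step` parameter as the tolerance mechanism, followed by simple run-length encoding; the description does not specify either choice, and the function names are illustrative only.

```python
def quantize(values, step):
    """Map each value to the index of the nearest multiple of `step`.
    Values within the same step collapse to one index (lossy)."""
    return [round(v / step) for v in values]


def run_length_encode(values):
    """Replace each run of identical values with a (value, count) pair."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]


coefficients = [10.2, 9.8, 10.1, 0.3, -0.2, 0.1]
indices = quantize(coefficients, step=2)   # [5, 5, 5, 0, 0, 0]
runs = run_length_encode(indices)          # [(5, 3), (0, 3)]
```

Six similar but not identical values are reduced to two (value, count) pairs. A larger step widens the tolerance and shortens the output further, at the cost of more degradation.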
The output of the quantizer 108 is presented to an entropy coder 110 that generates the compressed image data 120. Generally, entropy coding compresses data by identifying or predicting the frequency with which data values occur in a data file. Then, instead of representing each data value with a fixed, equal-length value, entropy coding represents more frequently appearing data values with shorter binary representations. By replacing frequently appearing data values with shorter representations instead of fixed, equal-length representations, the resulting compressed data 120 is reduced in size.
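Huffman coding is one well-known entropy coding technique and serves here only as an illustration; the description does not state which technique the entropy coder 110 uses. The function name `huffman_code` is hypothetical.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code in which frequent symbols get shorter bit strings."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Each heap entry: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)
        n2, _, c2 = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]


data = "aaaaaaaabbbc"
code = huffman_code(data)
coded_bits = sum(len(code[s]) for s in data)   # 16 bits
fixed_bits = 2 * len(data)                     # 24 bits at 2 bits per symbol
```

The most frequent symbol receives a one-bit code and the two rarer symbols receive two-bit codes, so the twelve symbols occupy 16 bits instead of the 24 bits needed with a fixed two-bit representation.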
The compressed data 120 generated by the entropy coder 110 is presented to a channel 130. The channel 130 may include data storage and/or data transmission media. A decoder 140 receives or retrieves the compressed data 120 from the channel 130 and decompresses the compressed data 120 through a mirror image of the process applied by the encoder 100. The compressed data 120 is translated by an entropy decoder 142, a dequantizer 144, an inverse transformer 146, and a postprocessor 148 that ultimately presents output data 150, such as image or video data suitable for presentation on a display or other device.
The entropy coder 110 uses a probabilistic context model to determine which values are assigned shorter and longer codes by predicting or determining which data values appear more and less frequently, respectively. The context model includes a plurality of conditioning states used to code the data values. The context model used by the entropy coder 110 may be a static model, developed off-line and stored both with the encoder 100 and the decoder 140. However, because the frequency with which data values occur may vary substantially between different data files, using a universal context model may not result in effective compression for every data file. Alternatively, a context model may be developed for each data file. In that case, the context model used by the entropy coder 110 is stored and transmitted as part of the compressed data 120, so that the context model is available to the entropy decoder 142 to decode the compressed data 120.
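A per-file context model of the kind described above can be sketched by counting, for each conditioning state, how often each data value follows it. The example assumes that a conditioning state is simply the preceding `order` values, which is a simplification for illustration; the function names are hypothetical. The second function reports the ideal code length, that is, the number of bits an entropy coder driven by the model could approach.

```python
import math
from collections import Counter, defaultdict


def build_context_model(symbols, order):
    """For each conditioning state (the previous `order` symbols),
    count how often each symbol follows it."""
    model = defaultdict(Counter)
    for i, s in enumerate(symbols):
        ctx = tuple(symbols[max(0, i - order):i])
        model[ctx][s] += 1
    return model


def ideal_code_length(symbols, model, order):
    """Sum of -log2 p(symbol | conditioning state) over the whole input."""
    bits = 0.0
    for i, s in enumerate(symbols):
        ctx = tuple(symbols[max(0, i - order):i])
        counts = model[ctx]
        bits += -math.log2(counts[s] / sum(counts.values()))
    return bits


data = [0, 1] * 100
bits_order0 = ideal_code_length(data, build_context_model(data, 0), 0)
bits_order1 = ideal_code_length(data, build_context_model(data, 1), 1)
```

For this alternating input, a model with no conditioning sees zeros and ones as equally likely and needs one bit per value, whereas conditioning on the previous value makes every value fully predictable. The figures exclude the cost of storing or transmitting the model itself.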
Compression may be increased by using a higher order context model. A higher order context model includes a larger number of conditioning states for coding the data values, thus allowing for the possibility of higher coding efficiency by coding data values with fewer bits. However, a higher order context model not only includes a larger number of predicted values, but the conditioning states themselves are of a higher order. Thus, the higher the order of the context model, the more storage or bandwidth the context model consumes.
Further, if the order of the context model is too high, the model may actually reduce coding efficiency. With too many conditioning states, the statistics gathered for each state may not converge sufficiently to meaningfully differentiate between data values occurring more and less frequently in the input data. This problem is commonly known as “context dilution” or “model cost,” and reduces the efficiency of the entropy coder.
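The effect can be illustrated with a small sketch. It assumes an adaptive model that learns counts as it codes, starting each conditioning state with one count per binary value; this adaptive scheme is a stand-in chosen for illustration and is not described in the text above. The function name `adaptive_code_length` is hypothetical.

```python
import math
from collections import defaultdict


def adaptive_code_length(bits, order):
    """Ideal code length of a binary sequence when the probabilities for
    each conditioning state are learned on the fly."""
    counts = defaultdict(lambda: [1, 1])   # one initial count for 0 and for 1
    total = 0.0
    for i, b in enumerate(bits):
        ctx = tuple(bits[max(0, i - order):i])
        c = counts[ctx]
        total += -math.log2(c[b] / (c[0] + c[1]))
        c[b] += 1                          # update only after coding the bit
    return total


short_input = [0, 1] * 8                   # only 16 samples
cost_order0 = adaptive_code_length(short_input, 0)
cost_order1 = adaptive_code_length(short_input, 1)
cost_order3 = adaptive_code_length(short_input, 3)
```

On this short input, order 1 is cheapest. Order 3 captures no additional structure but spreads the same sixteen samples over more conditioning states, so each state has seen fewer samples by the time it is used and the total cost rises.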
One solution to address the context dilution problem is context quantization. Context quantization encodes values based on a selected subset of conditioning states representing data values from an area adjacent to the data value being coded. Because of the complexity of finding good conditioning states and the significant overhead of representing the found conditioning states presented by the quantizer, conventional context quantizers are trained offline from a training set of data values. However, as previously described, the frequency with which data values appear in different sets of data will vary. Thus, quantizing a context model on training sets may not consistently provide effective compression.
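The idea of reducing the number of conditioning states can be sketched as follows. The mapping used here, which collapses a tuple of binary neighbor values to the number of ones it contains, is a hand-written and purely hypothetical quantizer; conventional context quantizers are trained rather than written by hand, as noted above. All names in the example are illustrative.

```python
from collections import Counter, defaultdict


def quantize_context(neighbors):
    """Hypothetical quantizer: collapse a tuple of binary neighbor values
    to the number of ones it contains."""
    return sum(neighbors)


def collect_statistics(samples, quantizer=None):
    """Count value occurrences per conditioning state. Without a quantizer,
    every distinct neighbor tuple is its own state."""
    stats = defaultdict(Counter)
    for neighbors, value in samples:
        state = quantizer(neighbors) if quantizer else tuple(neighbors)
        stats[state][value] += 1
    return stats


# Each sample pairs the adjacent values with the value being coded.
samples = [
    ((0, 0, 1), 0),
    ((0, 1, 0), 0),
    ((1, 0, 0), 1),
    ((1, 1, 0), 1),
    ((0, 1, 1), 1),
]
raw = collect_statistics(samples)                          # 5 states
quantized = collect_statistics(samples, quantize_context)  # 2 states
```

With the quantizer, five sparsely populated states become two, so each state accumulates more samples and there are fewer states to store or transmit. Whether the merged states still predict well depends on how closely the chosen mapping matches the data.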
Further complicating matters, a context model generated or quantized for coding a source at one bit rate may not work as well for coding a source at a different bit rate. For example, a context model suitable for coding a source at a high bit rate, where more samples of the source are provided, may pose a pronounced context dilution concern when coding at a low bit rate. Different models may be created for different bit rates, but creation, storage, and/or transmission of different models for a number of different bit rates consumes processing, storage, and bandwidth resources, respectively.