Image data can be compressed (encoded) to reduce the amount of data associated with an image without significantly affecting the fidelity of the image. Image compression standards, such as the JPEG (Joint Photographic Experts Group) compression standard, work well to reduce the amount of image data.
In JPEG encoding, the input image is decomposed into MCUs (minimum coded units), sometimes also referred to as macroblocks. Each MCU includes a number of blocks, each of which is typically an 8×8 array of values. A block can be associated with each of the separate image or color components of the image. For example, an MCU may include a luminance block (e.g., a Y-block) and two chrominance blocks (e.g., a U-block and a V-block).
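As a rough illustration of the decomposition into blocks, the following sketch splits one component plane into 8×8 blocks. The function name `split_into_blocks` is hypothetical, and the sketch assumes the plane dimensions are already multiples of the block size (a practical encoder would pad the image edges first).

```python
def split_into_blocks(plane, block_size=8):
    """Split a 2-D list (one color component) into square blocks.

    Assumption: height and width are multiples of block_size.
    Blocks are returned in raster order (left to right, top to bottom).
    """
    height, width = len(plane), len(plane[0])
    if height % block_size or width % block_size:
        raise ValueError("plane dimensions must be multiples of block_size")
    blocks = []
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            block = [row[left:left + block_size]
                     for row in plane[top:top + block_size]]
            blocks.append(block)
    return blocks
```

For a 16×16 luminance plane this yields four 8×8 blocks; the same routine could be applied to each chrominance plane.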
A discrete cosine transform (DCT) is performed to convert each block into the frequency domain; the resulting values are referred to as DCT coefficients. Typically, most images contain little high frequency information, and so most of the transformed image data is concentrated in the low frequency components. For each 8×8 block, 64 DCT coefficients are produced (one “DC” coefficient and 63 “AC” coefficients). The DCT itself does not reduce the amount of data.
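The transform can be sketched directly from the usual 8×8 DCT-II definition. This is a deliberately slow, unoptimized illustration; the name `dct_2d` is hypothetical, and the level shift (subtracting 128 from each sample) that JPEG applies before the transform is omitted.

```python
import math

N = 8  # block dimension

def dct_2d(block):
    """Forward 8x8 DCT-II with the scaling conventionally used for JPEG.

    Input:  8x8 list of sample values.
    Output: 8x8 list of coefficients; [0][0] is the DC coefficient,
            the other 63 entries are the AC coefficients.
    """
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = 0.25 * c(u) * c(v) * total
    return coeffs
```

A perfectly flat block has no high frequency content, so all of its energy lands in the DC coefficient and every AC coefficient is (numerically) zero. The output still has 64 values, which is why the transform alone saves nothing.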
In quantization, some of the frequency information is in essence discarded, so that fewer bits can be used to describe the image. Consider, for example, that there may be 256 possible levels of coloration (e.g., from lightest to darkest) for a pixel. Therefore, prior to quantization, each level would be identified by a unique combination of eight (8) bits. However, using quantization, the 256 possible levels can be quantized into 16 steps of 16 levels each, each step identified by a unique combination of only four (4) bits.
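The 256-level example above amounts to integer division by a step size of 16. The sketch below uses hypothetical helper names, and reconstructing to the midpoint of each step is one common choice rather than the only one.

```python
STEP = 16  # 256 levels divided into 16 steps of 16 levels each

def quantize(level):
    """Map an 8-bit level (0..255) to a 4-bit step index (0..15)."""
    return level // STEP

def dequantize(index):
    """Reconstruct a representative level: the midpoint of the step."""
    return index * STEP + STEP // 2
```

Every level from 192 through 207 maps to index 12 and is reconstructed as 200, so the information distinguishing those 16 levels is discarded in exchange for halving the bits per value.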
The lower frequency DCT coefficients can be quantized more finely using a relatively large number of bits, while the higher frequency DCT coefficients can be quantized more coarsely using a relatively small number of bits. Thus, lower frequency coefficients might be quantized into 16 steps, each represented using 4 bits as described above, while higher frequency coefficients might be quantized into two steps, each represented by one (1) bit.
The quantization steps applied to the DCT coefficients are arranged in an 8×8 array referred to as a quantization table, such that an entry in the quantization table corresponds to a location in the array of DCT coefficients. The quantization table drives the amount of compression (the “compression ratio”) because it specifies the size of the quantization steps. The larger the quantization steps, the greater the compression ratio, but there will be a commensurate reduction in the quality of the reconstructed (decompressed or decoded) image. Conversely, smaller quantization steps mean that the uncompressed data is more closely represented, thereby increasing the quality of the reconstructed image but reducing the compression ratio.
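The relationship between step size and retained information can be sketched as follows. The table built by `make_table` is purely illustrative (step size growing with the frequency index); it is not one of the example tables published with the JPEG standard, and all function names are hypothetical.

```python
def make_table(scale):
    """Hypothetical 8x8 quantization table: steps grow with u + v."""
    return [[1 + (u + v) * scale for v in range(8)] for u in range(8)]

def quantize_block(coeffs, table):
    """Divide each DCT coefficient by its table entry and round."""
    return [[round(coeffs[u][v] / table[u][v]) for v in range(8)]
            for u in range(8)]

def dequantize_block(levels, table):
    """Approximate the original coefficients from the quantized levels."""
    return [[levels[u][v] * table[u][v] for v in range(8)]
            for u in range(8)]
```

With larger table entries, more coefficients round to zero, which is what lets the later encoding stage shrink the bitstream, at the cost of a larger reconstruction error after `dequantize_block`.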
After quantization, the compression process concludes with entropy encoding (e.g., run-length encoding followed by Huffman encoding) to encode and serialize the quantized data into a bitstream. The size of the bitstream (measured in bits or bytes) varies as a function of the amount of quantization and is also a function of the image data.
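The dependence of bitstream size on both quantization and image content is easiest to see in the run-length stage that feeds the Huffman coder in baseline JPEG. The sketch below is simplified: it reads a quantized block in zigzag order and emits (zero run, value) pairs, but it omits the Huffman coding itself, the 16-zero run limit, and DC differencing. The names `zigzag_order`, `run_length_encode`, and `EOB` are hypothetical.

```python
EOB = (0, 0)  # end-of-block marker: all remaining coefficients are zero

def zigzag_order(n=8):
    """Return (row, col) pairs in zigzag order, low frequencies first."""
    order = []
    for s in range(2 * n - 1):
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        if s % 2 == 0:
            diagonal.reverse()
        order.extend(diagonal)
    return order

def run_length_encode(block):
    """Return the DC value and a list of (zero_run, value) AC pairs."""
    seq = [block[r][c] for r, c in zigzag_order()]
    dc, ac = seq[0], seq[1:]
    pairs = []
    run = 0
    for value in ac:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    if run:  # trailing zeros collapse into a single marker
        pairs.append(EOB)
    return dc, pairs
```

A block whose high frequency coefficients were all quantized to zero collapses to a handful of pairs, whereas a block with many surviving coefficients produces a long list; the output length therefore cannot be known without actually encoding the data.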
A desirable feature of a compression scheme is control of the compression ratio (referred to as “rate control”). Rate control means that a target compression ratio is specified; when the image data are compressed according to the target compression ratio, the length of the resultant bitstream is equal to or less than the target size. With proper rate control, it is possible to efficiently allocate file space for the compressed data or allocate bandwidth to transfer the compressed data, because the amount of compressed data is roughly known in advance. Without rate control, the compressed data may not fit into the allocated file space or may exceed the available transfer bandwidth.
As mentioned above, the compression ratio and the output quality (e.g., the quality of the reconstructed image) are controlled by varying the quantization values. In JPEG encoding, quantization values are selected prior to encoding, and one set of values is applied to the entire image. Unfortunately, for a given amount of input (uncompressed) data and a selected set of quantization values, it is not possible to accurately predict the amount of output (compressed) data. In fact, the size of the output bitstream can vary significantly from image to image, and in worst cases may even be larger than the uncompressed input. This uncertainty in the size of the output bitstream is problematic because, as mentioned above, the amount of compressed data may be too large to fit into the allocated file space or too large to transfer given an allocated transfer bandwidth.
If the amount of compressed data is too large, then a new set of quantization values may be selected and the data compressed again. The process is repeated until the target compression ratio (e.g., the target bitstream or file size) is achieved. Thus, conventional techniques can require multiple iterations, increasing both encoding time and the use of computing resources (power, memory, processor cycles, etc.). The risk of exceeding the target bitstream or file size can be reduced by choosing larger quantization values, but only at the expense of an excessive reduction in the quality of the reconstructed image.
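The conventional re-encoding loop described above can be sketched as follows. This is a toy model, not a JPEG encoder: `encode` is a hypothetical stand-in that applies one quantization step to all samples and then uses `zlib` in place of the real entropy coder, and doubling the step on each pass is only one possible search strategy.

```python
import zlib

def encode(samples, step):
    """Stand-in encoder: quantize every sample with one step, then compress."""
    quantized = bytes(s // step for s in samples)
    return zlib.compress(quantized)

def encode_to_target(samples, target_bytes, max_step=128):
    """Re-encode with coarser quantization until the output fits the target.

    Returns (bitstream, step_used, number_of_passes). Gives up and returns
    the last attempt once max_step is reached.
    """
    step = 1
    passes = 0
    while True:
        passes += 1
        bitstream = encode(samples, step)
        if len(bitstream) <= target_bytes or step >= max_step:
            return bitstream, step, passes
        step *= 2  # coarser quantization, smaller output, lower quality
```

Each pass repeats the full encode, so the cost in time and processor cycles grows with the number of passes, and that number is not known until the loop finishes.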