1. Field of the Invention
This invention relates to computer systems, and more particularly to data compression mechanisms, specifically transform coding mechanisms as applied to discrete signals.
2. Description of the Related Art
In computer science and information theory, data compression is the process of encoding information in fewer bits than an unencoded representation would use, by means of specific encoding schemes. Discrete signals such as discretely sampled audio, video, and images are typically compressed using transform coding. In transform coding, a discrete transform is first performed on the signal to generate a set or vector of transform coefficients. Transform coefficients that are at or near zero may be discarded, and transform coefficients that are small may be coarsely quantized according to one or more quantization factors. Note that quantization may be defined as the representation of a measured value by an integer. This process concentrates meaningful information from the signal in a subset of transform coefficients, which may be referred to as the transformed signal. A lossless entropy encoding scheme or technique may then be applied to the transformed signal to generate a compressed signal, for example a compressed file. Note that transform coding is generally a “lossy” compression scheme, because the discarding and quantization of coefficients cannot be undone. Further note that applications or implementations of transform coders (in hardware, software, or a combination thereof) may sometimes be referred to as “codecs”.
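As an illustration only, the transform and quantization steps described above may be sketched as follows. The sketch assumes a one-dimensional orthonormal DCT (type II) and a single uniform quantization step; the function names `dct_ii` and `quantize`, the sample values, and the step size are hypothetical and are not taken from any particular codec.

```python
import math

def dct_ii(signal):
    """Orthonormal 1-D DCT-II computed directly from its definition (O(n^2))."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(signal))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs

def quantize(coeffs, step):
    """Represent each coefficient by an integer; near-zero values become 0."""
    return [int(round(c / step)) for c in coeffs]

# Hypothetical input: a smooth ramp of eight samples.
signal = [10, 11, 12, 13, 14, 15, 16, 17]
coeffs = dct_ii(signal)
quantized = quantize(coeffs, step=2.0)

# Count the coefficients that survive quantization.
nonzero = sum(1 for q in quantized if q != 0)
print(quantized, nonzero)
```

For a smooth input such as this ramp, most of the energy lands in the first few coefficients, so the remaining coefficients quantize to zero and need not be stored individually.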
Entropy encoding is a coding scheme that assigns codes to symbols so that code lengths match the probabilities of the symbols. Typically, entropy encoders are used to compress data by replacing equal-length codes for symbols with variable-length codes whose lengths are proportional to the negative logarithm of each symbol's probability. Therefore, the most common symbols use the shortest codes.
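The relationship between symbol probability and code length may be illustrated with a short calculation. The sketch below assumes probabilities estimated from symbol counts in the input itself and uses the base-2 logarithm so that lengths are in bits; the helper names and the sample string are hypothetical.

```python
import math
from collections import Counter

def ideal_code_lengths(symbols):
    """Ideal code length in bits for each symbol: -log2(probability)."""
    counts = Counter(symbols)
    total = len(symbols)
    return {s: -math.log2(c / total) for s, c in counts.items()}

def entropy_bits(symbols):
    """Average ideal code length in bits per symbol (Shannon entropy)."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

data = "aaaabbcd"  # 'a' has probability 1/2, 'b' 1/4, 'c' and 'd' 1/8 each
lengths = ideal_code_lengths(data)
print(lengths)             # {'a': 1.0, 'b': 2.0, 'c': 3.0, 'd': 3.0}
print(entropy_bits(data))  # 1.75
```

With four distinct symbols, an equal-length code would need 2 bits per symbol, whereas the ideal variable-length code averages 1.75 bits per symbol for this input.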
There are many discrete transform methods that may be used in transform coders. Discrete transform methods include, but are not limited to, forms of discrete Fourier transforms (DFT), discrete cosine transforms (DCT), discrete Hartley transforms (DHT), rectangular wave transforms, eigenvector-based transforms, and wavelet transforms.
Likewise, there are many entropy encoding techniques that may be used in transform coders. Entropy encoding techniques include, but are not limited to, dictionary-based techniques such as run-length encoding (RLE) and LZW encoding, statistical encoding techniques such as Huffman coding, and arithmetic coding. Note that an entropy encoding scheme in a transform coder may use a combination of two or more of the entropy encoding techniques. For example, RLE may first be applied to a signal, followed by Huffman coding.
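The combination of RLE followed by Huffman coding mentioned above may be sketched as follows. This is a simplified illustration, not the scheme of any particular standard: it assumes each (value, run length) pair is treated as one symbol, it computes only Huffman code lengths rather than the codes themselves, and the input vector is hypothetical.

```python
import heapq
from collections import Counter
from itertools import groupby

def run_length_encode(values):
    """Collapse each run of equal values into a (value, run_length) pair."""
    return [(v, len(list(group))) for v, group in groupby(values)]

def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries: (weight, unique tiebreaker, {symbol: depth so far}).
    heap = [(c, i, {s: 0}) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

# Hypothetical quantized coefficients containing runs of zeros.
quantized = [19, -3, 0, 0, 0, 0, 0, 0, 5, 5, 0, 0]
pairs = run_length_encode(quantized)
lengths = huffman_code_lengths(pairs)
total_bits = sum(lengths[p] for p in pairs)
print(pairs)       # [(19, 1), (-3, 1), (0, 6), (5, 2), (0, 2)]
print(total_bits)  # 12
```

In this sketch RLE reduces twelve coefficients to five symbols before Huffman coding assigns the variable-length codes.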
Commonly used applications of transform coders include, but are not limited to, the JPEG, JPEG 2000, and MPEG compression standards. JPEG is an image compression standard created by the Joint Photographic Experts Group that uses a two-dimensional forward discrete cosine transform (DCT, type II) to generate a transformed signal. RLE is then performed on the transformed signal, after which Huffman coding is performed. The JPEG standard also allows for arithmetic coding to be performed rather than Huffman coding. JPEG 2000 is an image compression standard created by the Joint Photographic Experts Group that uses a wavelet transform rather than a DCT. MPEG is a video/audio (multimedia) compression standard created by the Moving Picture Experts Group, a working group of ISO/IEC.
FIG. 1 illustrates data flow for a conventional transform coder. Inputs to a transform coder 110 may include an uncompressed signal 102 and one or more compression parameters 104. Source 100 of uncompressed signal 102 may be a digitized image (e.g., a digital photograph), a digitized audio stream or audio file, a digital video stream or video segment, etc. Note that uncompressed signal 102 may be a subset or portion of a larger signal, for example a portion of an image, audio stream, or digital video stream. Input compression parameters 104 may include one or more variable parameters that may be used during the compression process. Compression parameters 104 may affect both the compression ratio (the ratio of the size of the output compressed signal 150 to the size of the input uncompressed signal 102) and the quality of the compressed signal 150. Generally, the quality is directly related to the compression ratio as so defined; higher rates of compression that generate smaller compressed signals 150 tend to negatively affect the quality of the compressed signals 150.
A discrete transform is performed on the uncompressed signal 102 using a transform method 122 employed by the transform coder 110 and in accordance with the compression parameters 104 to generate a vector of transform coefficients 124. Zero and near-zero coefficients may be discarded, and small coefficients may be quantized, concentrating meaningful information from the uncompressed signal 102 into a subset of transform coefficients 124. After the discrete transform of signal 102 is performed, the transformed signal 130 may then be passed to an entropy encoder 140 that encodes the transform coefficients 124 to generate compressed signal 150. Compressed signal 150 typically goes to a destination 160, e.g. a compressed file or compressed audio or video stream.
In many applications, given an input uncompressed signal 102, an operator or user of a transform coder 110 may desire or need to know the size of the compressed signal 150. For example, a user may have a large image in an image editing application that the user wants to compress and save. The user may have a necessary or desired target size, and may want the compressed signal 150 to be close to that size. Conventionally, determining the compressed size is performed by actually performing the compression to generate the compressed signal 150. If the compressed signal 150 is too large or too small, the user may then adjust the compression parameters 104 and perform the compression again. Thus, conventionally, determining the size of the compressed signal 150 is performed by trial and error, which is time- and compute-intensive.
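The conventional trial-and-error approach described above may be sketched as a loop that fully compresses the signal once per candidate parameter setting. The sketch is a toy model under stated assumptions: a single quantization step stands in for compression parameters 104, `zlib` stands in for the entropy encoder, and the synthetic signal, the candidate step values, and the function names are all hypothetical.

```python
import math
import zlib

def compress(samples, step):
    """Toy lossy coder: quantize each sample, then entropy-code with zlib."""
    quantized = bytes(int(round(s / step)) & 0xFF for s in samples)
    return zlib.compress(quantized, 9)

def fit_to_target(samples, target_bytes, steps=(1, 2, 4, 8, 16, 32, 64)):
    """Try coarser and coarser steps until the output fits the target size.

    Returns (step, compressed_size, trials); step is None if nothing fits.
    Every trial performs a complete compression, which is the cost at issue.
    """
    trials = 0
    size = None
    for step in steps:
        trials += 1
        size = len(compress(samples, step))
        if size <= target_bytes:
            return step, size, trials
    return None, size, trials

# Hypothetical signal: a slow sine wave plus a small repeating perturbation.
samples = [int(100 + 80 * math.sin(i / 7.0) + (i * 37 % 11)) for i in range(4096)]

step, size, trials = fit_to_target(samples, target_bytes=1500)
if step is not None:
    # Compression ratio as defined above: compressed size / uncompressed size.
    print(step, size, trials, size / len(samples))
else:
    print("no candidate step met the target after", trials, "trials")
```

The number of full compressions grows with the number of parameter settings tried, which is why this approach is time- and compute-intensive for large signals.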
Some techniques have been developed for predicting the compressed size of a signal that do not require the signal to be fully compressed by the transform coder 110. However, these techniques tend to be designed for specific transform coders, particularly for predicting the performance of specific entropy encoders 140.