A computer processes audio or video information as numbers representing that information. The larger the range of the possible values for the numbers, the higher the quality of the information. Compared to a small range, a large range of values more precisely tracks the original audio or video signal and introduces less distortion from the original. On the other hand, the larger the range of values, the higher the bit-rate for the information. Table 1 shows ranges of values for audio and video information of different quality levels, and corresponding bit-rates.
TABLE 1. Ranges of values and bits per value for different quality audio and video information

Information type and quality      Range of values               Bits
Video image, black and white      0 to 1 per pixel              1
Video image, gray scale           0 to 255 per pixel            8
Video image, "true" color         0 to 16,777,215 per pixel     24
Audio sequence, voice quality     0 to 255 per sample           8
Audio sequence, CD quality        0 to 65,535 per sample        16
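The bit counts in Table 1 follow from the size of each range. The following is a minimal sketch of that arithmetic; the function names and the 44,100 samples-per-second figure are illustrative assumptions and do not come from the text above.

```python
def bits_per_value(num_values: int) -> int:
    """Smallest number of bits that can index num_values distinct values."""
    return max(1, (num_values - 1).bit_length())


def raw_bit_rate(bits: int, values_per_second: int) -> int:
    """Uncompressed bit-rate in bits per second for a single stream."""
    return bits * values_per_second


# Ranges from Table 1, expressed as the count of distinct values.
table_1 = {
    "video, black and white": 2,
    "video, gray scale": 256,
    "video, true color": 16_777_216,
    "audio, voice quality": 256,
    "audio, CD quality": 65_536,
}
for name, count in table_1.items():
    print(f"{name}: {bits_per_value(count)} bits per value")

# Assumed sampling rate (not from the text): 44,100 samples per second.
print(raw_bit_rate(bits_per_value(65_536), 44_100), "bits per second per channel")
```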
High quality audio or video information has high bit-rate requirements. Although consumers desire high quality information, computers and computer networks often cannot deliver it.
To strike a balance between quality and bit-rate, audio and video processing techniques use quantization. Quantization maps many values in an analog or digital signal to one value. In an analog signal, quantization assigns a number to points in the signal. In a digital input signal with a range of 256 values, quantization can instead assign one of 64 values to each point in the signal. (Values from 0 to 3 in the input signal are assigned to the quantized value 0, values from 4 to 7 are assigned to the quantized value 1, etc.) To reconstruct an approximation of the original value, the quantized value is multiplied by the quantization factor, which is 4 in this example. (The quantized value 0 reconstructs 0×4=0, the quantized value 1 reconstructs 1×4=4, etc.) In essence, quantization decreases the quality of the signal in order to decrease the bit-rate of the signal. After a value has been quantized, however, the original value cannot always be reconstructed. (If the values from 0 to 3 are assigned to the quantized value 0, for example, on reconstruction it is impossible to determine if the original value was 0, 1, 2, or 3.)
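The 256-to-64 example can be sketched in a few lines. The function names are hypothetical; the step of 4 and the mapping come from the example above.

```python
STEP = 4  # quantization factor: 256 input values mapped onto 64 quantized values


def quantize(value: int, step: int = STEP) -> int:
    """Map an input value to its quantized value (0..3 -> 0, 4..7 -> 1, ...)."""
    return value // step


def reconstruct(quantized: int, step: int = STEP) -> int:
    """Recover an approximation of the input by multiplying by the step."""
    return quantized * step


inputs = list(range(8))
quantized = [quantize(v) for v in inputs]
restored = [reconstruct(q) for q in quantized]
errors = [v - r for v, r in zip(inputs, restored)]
print(quantized)  # [0, 0, 0, 0, 1, 1, 1, 1]
print(restored)   # [0, 0, 0, 0, 4, 4, 4, 4]
print(errors)     # [0, 1, 2, 3, 0, 1, 2, 3]
```

The non-zero entries in `errors` are the information lost to quantization: inputs 0 through 3 all reconstruct to 0.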
When quantizing an input signal, several factors affect the result. For an analog signal, a dynamic range sets the boundaries of the quantization. Suppose the range of an analog signal stretches from negative infinity to infinity, but almost all information is close to zero. The dynamic range of the quantization focuses the quantization on the range of the signal most likely to yield information. For an input signal already in digital form, the dynamic range is bounded by the lowest and highest possible values.
Within the dynamic range, the number of quantization levels determines the precision with which the quantized signal tracks the original signal, which affects the distortion of the quantized signal from the original. For example, if a dynamic range has 256 quantization levels, each point in an input signal is assigned the closest of the corresponding 256 values. Increasing the number of quantization levels in the same dynamic range increases precision and decreases distortion from the original, but increases bit-rate. Quantization threshold, or step size, is a related factor that measures the distance between quantized values.
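To illustrate the trade-off between the number of quantization levels and distortion, the sketch below quantizes a test signal to the nearest of N evenly spaced levels inside a dynamic range and reports the mean squared error. The sine-wave test signal, the range of -1 to 1, and the function names are assumptions chosen for illustration only.

```python
import math


def quantize_to_levels(x: float, lo: float, hi: float, levels: int) -> float:
    """Clamp x to the dynamic range [lo, hi] and return the nearest level."""
    step = (hi - lo) / (levels - 1)  # distance between adjacent levels
    x = min(max(x, lo), hi)
    index = round((x - lo) / step)
    return lo + index * step


def mean_squared_error(signal, lo: float, hi: float, levels: int) -> float:
    """Average squared distance between the signal and its quantized form."""
    total = sum((s - quantize_to_levels(s, lo, hi, levels)) ** 2 for s in signal)
    return total / len(signal)


# Hypothetical test signal: one period of a sine wave, 100 samples.
signal = [math.sin(2 * math.pi * n / 100) for n in range(100)]
for levels in (4, 16, 64, 256):
    bits = (levels - 1).bit_length()
    mse = mean_squared_error(signal, -1.0, 1.0, levels)
    print(f"{levels:3d} levels, {bits} bits per value, MSE = {mse:.2e}")
```

Each fourfold increase in the number of levels costs two more bits per value and shrinks the step size by roughly a factor of four.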
The preceding examples describe uniform, scalar, non-adaptive quantization: each point in the input signal is quantized by the same quantization threshold to produce a single quantized output value. Other quantization techniques include non-uniform quantization, vector quantization, and adaptive quantization techniques. Non-uniform quantization techniques apply different quantization thresholds to different ranges of values in the input signal, which allows greater emphasis to be given to ranges with more information value. Vector quantization techniques produce a single output value representing multiple points in the input signal. Adaptive quantization techniques change dynamic range, the number of quantization levels, and/or quantization thresholds to adapt to changes in the input signal or resource availability in the computer or computer network. For more information about quantization and the factors affecting the results of quantization, see Gibson et al., Digital Compression for Multimedia, “Chapter 4: Quantization,” Morgan Kaufmann Publishers, Inc., pp. 113-138 (1998).
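As a rough sketch of two of these alternatives, the code below shows a non-uniform scalar quantizer and a tiny vector quantizer. The bin boundaries, reconstruction values, and codebook are invented for illustration and are not taken from any particular encoder.

```python
from bisect import bisect_right

# Hypothetical non-uniform bins for inputs 0..255: narrow near zero, wide above.
BOUNDARIES = [4, 8, 16, 32, 64, 128]       # lower edge of bins 1..6
RECONSTRUCTION = [2, 6, 12, 24, 48, 96, 192]  # one representative per bin


def nonuniform_quantize(value: int) -> int:
    """Return the index of the bin that contains value."""
    return bisect_right(BOUNDARIES, value)


# Hypothetical vector quantizer: one index stands for a pair of input points.
CODEBOOK = [(0, 0), (0, 255), (255, 0), (255, 255)]


def vector_quantize(pair) -> int:
    """Return the index of the codebook entry closest to the pair."""
    def squared_distance(i: int) -> int:
        return (pair[0] - CODEBOOK[i][0]) ** 2 + (pair[1] - CODEBOOK[i][1]) ** 2
    return min(range(len(CODEBOOK)), key=squared_distance)


print(nonuniform_quantize(3), nonuniform_quantize(4), nonuniform_quantize(200))  # 0 1 6
print(RECONSTRUCTION[nonuniform_quantize(200)])  # 192
print(vector_quantize((10, 240)))                # 1
```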
Some adaptive quantization techniques vary dynamic range while holding constant the number of quantization levels. These techniques adapt to the input signal to maintain a relatively constant degree of quality, and they produce a relatively constant bit-rate output. One goal of these techniques is to minimize distortion between the input signal and quantized output for the number of quantization levels. Another goal is to optimize entropy, or information value, of the quantized output. The entropy of the quantized output predicts how effectively the quantized output will later be compressed in entropy compression.
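The entropy of a quantized block can be estimated from the frequency of each quantized value. The sketch below uses the standard Shannon entropy formula; the function name and the sample data are illustrative assumptions.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(symbols) -> float:
    """Shannon entropy of the empirical symbol distribution, in bits."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


values = list(range(16))           # hypothetical input block
fine = [v // 1 for v in values]    # 16 distinct quantized values
coarse = [v // 4 for v in values]  # 4 distinct quantized values
print(entropy_bits_per_symbol(fine))    # 4.0
print(entropy_bits_per_symbol(coarse))  # 2.0
```

A coarser quantization produces fewer distinct output values, so the entropy estimate falls, which suggests the output will compress into fewer bits.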
Entropy is a useful measure, but many applications require exact feedback about the actual bit-rate of the compressed quantized output. For example, consider a streaming media system that delivers compressed audio or video information for unbroken playback. An entropy model of the quantized output does not guarantee that actual bit-rate of compressed output satisfies a target bit-rate. If the actual bit-rate of compressed output is much greater than the target bit-rate, playback is disrupted. On the other hand, if the actual bit-rate of compressed output is much lower than the target bit-rate, the quality of the quantized output is not as good as it could be.
The dependency between actual bit-rate of compressed output and quantization threshold is difficult to precisely express—it depends on complex, non-linear, and dynamic interaction between the entropy of the quantized output and the compression techniques used on the quantized output. The relation changes for different types of data and different compression techniques. Thus, to determine actual bit-rate of compressed, quantized output, the quantized output must be compressed with brute force, computationally expensive and time-consuming operations.
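The gap between an entropy estimate and the actual compressed size can be observed directly. The sketch below uses Python's `zlib` module as a stand-in for the entropy compression stage, which is an assumption; a real encoder would use its own coder, and the synthetic sample data is likewise hypothetical.

```python
import math
import zlib
from collections import Counter


def entropy_estimate_bits(symbols: bytes) -> float:
    """Zero-order entropy estimate of the total size, in bits."""
    counts = Counter(symbols)
    total = len(symbols)
    per_symbol = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return per_symbol * total


def actual_compressed_bits(symbols: bytes) -> int:
    """Size in bits after actually running the compressor (zlib here)."""
    return 8 * len(zlib.compress(symbols))


# Hypothetical block of 8-bit input samples.
samples = [(n * 37) % 256 for n in range(1024)]
for step in (1, 4, 16, 64):
    quantized = bytes(s // step for s in samples)
    estimate = round(entropy_estimate_bits(quantized))
    actual = actual_compressed_bits(quantized)
    print(f"step {step:2d}: entropy estimate {estimate} bits, actual {actual} bits")
```

The estimate ignores whatever structure the compressor exploits or fails to exploit, so the only way to learn the actual size for a given threshold is to quantize and compress.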
One adaptive quantization technique uses actual bit-rate of compressed output as feedback to find an optimal quantization threshold (highest fidelity to original signal) for a target bit-rate ETGT. For a fixed dynamic range, a binary search quantizer tests candidate quantization thresholds T for a block of input data according to a binary search approach. The process of testing candidate quantization thresholds to find an acceptable quantization threshold is a quantization loop.
The binary search quantizer sets a search range bounded by THIGH=TMAX and TLOW=TMIN. Splitting the search range, the binary search quantizer selects a candidate quantization threshold in the middle, TMID=0.5(THIGH+TLOW), and applies it to the data. The quantized output is compressed. If the resulting actual bit-rate EMID is acceptable, the process stops. Otherwise, the search range is halved and the process repeats. Because a larger quantization threshold quantizes more coarsely and lowers the bit-rate, the search range is halved by setting THIGH to TMID if the actual bit-rate EMID fell below the target bit-rate ETGT, or by setting TLOW to TMID if the actual bit-rate EMID exceeded the target bit-rate ETGT.
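A minimal sketch of such a quantization loop follows. The update direction matches the FIG. 1 walk-through, in which a bit-rate below the target makes the midpoint the new high bound. The function names, the tolerance that defines an "acceptable" bit-rate, the iteration cap, and the use of `zlib` as the compressor are all assumptions made for illustration.

```python
import zlib
from typing import Callable


def binary_search_threshold(
    measure_bits: Callable[[float], float],
    target_bits: float,
    t_min: float,
    t_max: float,
    tolerance: float,
    max_iterations: int = 32,
) -> float:
    """Return a candidate threshold whose measured bit count is near the target."""
    t_low, t_high = t_min, t_max
    t_mid = 0.5 * (t_high + t_low)
    for _ in range(max_iterations):  # assumed safety cap on loop iterations
        t_mid = 0.5 * (t_high + t_low)
        e_mid = measure_bits(t_mid)  # the expensive step: quantize and compress
        if abs(e_mid - target_bits) <= tolerance:
            break  # acceptable bit-rate
        if e_mid < target_bits:
            t_high = t_mid  # bit-rate too low: search smaller thresholds
        else:
            t_low = t_mid  # bit-rate too high: search larger thresholds
    return t_mid


# Hypothetical block of 8-bit input data and a zlib-based stand-in encoder.
block = [(n * 37) % 256 for n in range(1024)]


def measure_with_zlib(threshold: float) -> int:
    """Quantize the block with the threshold, compress it, and count bits."""
    quantized = bytes(int(s / threshold) for s in block)
    return 8 * len(zlib.compress(quantized))


chosen = binary_search_threshold(
    measure_with_zlib, target_bits=1000, t_min=1.0, t_max=64.0, tolerance=50
)
print(chosen, measure_with_zlib(chosen))
```

Passing the measurement as a callable keeps the search logic separate from the encoder, which makes the cost of each iteration explicit: one full quantize-and-compress pass per candidate threshold.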
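In practice, this process also stops if |ceil(logL(THIGH))−ceil(logL(TLOW))|<1, where logL is the logarithm to base L, L is an implementation-dependent constant, and ceil(x) is the smallest integer that is greater than or equal to x. Because both ceil terms are integers, the condition holds exactly when they are equal, that is, when THIGH and TLOW fall within the same logarithmic interval. This condition reflects a logarithmic dependency between absolute value of T and subjective perception. At higher values of T, humans are less sensitive to changes in T.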
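The stopping condition can be sketched as a small predicate. The function name and the value chosen for L are hypothetical, since the text leaves L implementation-dependent.

```python
import math


def search_range_collapsed(t_high: float, t_low: float, base: float) -> bool:
    """True when both bounds fall in the same logarithmic interval for base L."""
    bucket_high = math.ceil(math.log(t_high, base))
    bucket_low = math.ceil(math.log(t_low, base))
    return abs(bucket_high - bucket_low) < 1


L = 2.0  # hypothetical value; the text leaves L implementation-dependent
print(search_range_collapsed(9.0, 7.0, L))    # False: intervals 4 and 3
print(search_range_collapsed(30.0, 20.0, L))  # True: both in interval 5
```

With these assumed values, a search range of width 2 near T=8 is still considered open, while a range of width 10 near T=25 counts as collapsed, which reflects the lower sensitivity to changes at higher thresholds. A floating-point logarithm can be slightly off when a bound is an exact power of L, so a production implementation may need to guard that case.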
FIG. 1 is a graph showing the results of a quantization loop with a binary search approach (100). FIG. 1 shows a range of quantization thresholds T (110), a range of actual bit-rates EX (120) of compressed output, and a target bit-rate ETGT (130), which is set at 875 bits. The binary search quantizer starts with quantization thresholds 2 and 34, known to be too small and too large, respectively. The binary search quantizer selects the midpoint quantization threshold 18 and measures the actual bit-rate E1 of the compressed output. As E1 is far below the target bit-rate ETGT, the quantization threshold 18 becomes the new high bound. The binary search quantizer selects a new midpoint quantization threshold 10 and measures the actual bit-rate E2. Because E2 is also below the target bit-rate, the quantization threshold 10 becomes the new high bound. This process continues through the quantization thresholds 6 (resulting actual bit-rate E3, too high) and 8 (resulting actual bit-rate E4, too low) before stopping after quantization threshold 7 (resulting actual bit-rate E5, acceptable).
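The sequence of candidate thresholds in this walk-through can be reproduced with a short trace. The bit-rate model E(T) = 6125/T and the tolerance of 50 bits are invented so that the trace matches the verdicts described for FIG. 1; the actual bit-rates E1 through E5 in the figure are not given in the text.

```python
def trace_binary_search(measure_bits, target_bits, t_low, t_high, tolerance):
    """Yield (threshold, measured bits, verdict) for each loop iteration."""
    for _ in range(16):  # assumed safety cap on loop iterations
        t_mid = 0.5 * (t_high + t_low)
        e_mid = measure_bits(t_mid)
        if abs(e_mid - target_bits) <= tolerance:
            yield t_mid, e_mid, "acceptable"
            return
        if e_mid < target_bits:
            yield t_mid, e_mid, "too low"
            t_high = t_mid  # midpoint becomes the new high bound
        else:
            yield t_mid, e_mid, "too high"
            t_low = t_mid  # midpoint becomes the new low bound


def hypothetical_bits(threshold: float) -> float:
    """Invented model of compressed size; not data from FIG. 1."""
    return 6125 / threshold


steps = list(trace_binary_search(hypothetical_bits, 875, 2, 34, 50))
for threshold, bits, verdict in steps:
    print(f"T = {threshold:4.1f}  E = {bits:7.1f}  {verdict}")
```

Under these assumptions the trace visits thresholds 18, 10, 6, 8, and 7 in that order, taking five iterations to narrow the range from 2 through 34 down to an acceptable threshold.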
The binary search approach finds an acceptable quantization threshold within a bounded period of time, because the process stops when the search range becomes small enough. On the other hand, the binary search technique uses 5 to 8 loop iterations on average, depending on the choice of TMAX, TMIN, L, and other implementation details in different encoders. Each iteration involves an expensive computation of the actual bit-rate of compressed output quantized according to a candidate quantization threshold. In total, these quantization loop iterations take from 20% to 80% of encoding time, depending on the encoder used and the bit-rate/quality of the data.