There is considerable interest among those in the field of digital signal processing in reducing the amount of information required to transmit or store a digital signal intended for human perception without degrading its subjective quality. Informational requirements of digital signals can be reduced by quantizing the signal with fewer binary bits, but quantizing with fewer bits increases errors in the encoded representation. If the magnitude of these errors is large enough, the quantizing inaccuracy is perceptible.
Throughout this discussion, the terms "quantizing" and "quantization" refer to the process of representing signal information as a discrete value. In applications using binary representations, quantized values are expressed by binary bits. Although binary representations are assumed herein for ease of discussion, it should be realized that the present invention and the problems it solves are not limited to binary representations. The terms "dequantizing" and "dequantization" refer to the process of obtaining a representation of the signal information conveyed by quantized values. The terms "quantizer" and "dequantizer" refer to means for performing quantization and dequantization, respectively. The quantizing "step size" is the smallest interval between quantized values and it is inversely related to the number of bits used to represent quantized values. The "quantizing error" is the magnitude of the difference between a quantized value and the value of the signal information it represents and is directly related to the quantizing step size.
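As a rough illustration of how step size and quantizing error depend on the number of bits, the following Python sketch assumes a uniform quantizer whose 2^bits levels span a hypothetical full-scale range of 2.0 (for example, -1 to 1). The range and the function names are assumptions made for illustration and do not come from the discussion above.

```python
def step_size(bits, full_scale=2.0):
    """Step size of a uniform quantizer whose 2**bits levels span `full_scale`."""
    return full_scale / (1 << bits)


def max_rounding_error(bits, full_scale=2.0):
    """Largest error when rounding to the nearest level, assuming the input
    lies within half a step of some level (that is, no overload)."""
    return step_size(bits, full_scale) / 2


# Each additional bit halves both the step size and the worst-case error.
for bits in (3, 4, 8, 16):
    print(bits, step_size(bits), max_rounding_error(bits))
```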
In audio coding applications, quantizing errors may manifest themselves as noise. If the quantizing step size is too large, the quantizing noise will be audible and the subjective quality of the encoded signal will be degraded.
"Split-band" coding techniques such as subband coding and transform coding claim to reduce informational requirements of audio signals without any audible degradation by exploiting various psychoacoustic effects such as psychoacoustic masking. See generally, the Audio Engineering Handbook, K. Blair Benson ed., McGraw-Hill, San Francisco, 1988, pages 1.40-1.42 and 4.8-4.10. Such split-band techniques exploit a characteristic of human hearing; a stronger signal may mask or render inaudible a weaker signal if the two signals are sufficiently close in frequency. By splitting an audio signal into narrow frequency bands and quantizing the signal energy in each band, the aural effect of the quantizing noise will be confined to the same frequency band as the quantized spectral energy. By using a separate quantizing step size for each frequency band, the quantizing noise can be kept just small enough so that it is masked by the spectral energy. A common technique used in split-band coders for keeping quantizing noise small enough is to adjust the quantizing step size in each frequency band according to the amplitude of the signal energy in the respective frequency band.
Subband coders generate samples for each frequency subband of the input signal. A subband coder ideally quantizes subband samples using the fewest number of bits possible such that the quantizing noise in each subband is masked by the signal energy in that subband and in neighboring subbands.
Transform coders generate a block of short-time frequency-domain coefficients for each time interval of the input signal. A transform coder ideally quantizes each coefficient using the fewest number of bits possible such that quantizing noise for each coefficient is masked by the signal energy in that coefficient and in neighboring coefficients.
Many split-band encoders encode a 20 kHz bandwidth signal, sampled at a rate in excess of 44 kilosamples per second, into a digital signal of no more than 128 kilobits per second. This bit rate implies that an average of fewer than 3 bits is used to quantize each subband sample or transform coefficient for subband and transform coders, respectively.
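The arithmetic behind this figure can be checked with a short Python sketch. It assumes a sampling rate of exactly 44.1 kilosamples per second (the text says only "in excess of 44"), assumes the coder produces one subband sample or transform coefficient per input sample, and ignores any bits spent on anything other than the quantized values.

```python
# Assumed values: 44.1 kHz is a common rate, not one stated in the text.
sample_rate_hz = 44_100
bit_rate_bps = 128_000

# With one quantized value per input sample, the average bit budget per
# value is simply the bit rate divided by the sample rate.
bits_per_value = bit_rate_bps / sample_rate_hz
print(f"{bits_per_value:.2f} bits per quantized value")
```

Under these assumptions the budget comes to about 2.9 bits per value, which is consistent with the "fewer than 3 bits" figure.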
Signal information quantized with 3 bits may have any one of 2^3, or eight, discrete values. A 3-bit value expressed in a linear binary representation known as "two's complement" can have any one of eight quantizer output values -4, -3, -2, -1, 0, 1, 2 and 3. The output q(x) of a 3-bit two's complement quantizer changes by an amount equal to the quantizer step size as the input value x crosses quantizing "thresholds." As shown in FIG. 3, for example, the quantizing thresholds may be established at -3.5, -2.5, -1.5, -0.5, 0.5, 1.5 and 2.5. The quantizing function shown in FIG. 3 is asymmetric, that is, the output of this two's complement quantizer is not symmetric about zero because a 3-bit quantized value can have any of four negative values, three positive values, or zero.
The function q(x) shown in FIG. 3 is equal to the value of x rounded to the nearest integer. It is a rounding function and is only one example of a two's complement quantizing function. Many other quantizing functions are possible. For example, a truncating function can be obtained by simply shifting the stair-step function shown in FIG. 3 by 0.5 to the right along the x axis. Although the following discusses only a few quantizing functions, it should be appreciated that the principles and concepts discussed are applicable to a wide range of quantizing functions.
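The rounding and truncating functions described above can be sketched in Python as follows. The function names are hypothetical. Two details are assumptions not stated in the text: an input that falls exactly on a threshold is assigned to the higher output value, and inputs beyond the representable range are clamped to the nearest end value.

```python
import math


def quantize_round(x, bits=3):
    """Rounding two's complement quantizer (thresholds at ..., -0.5, 0.5, ...)."""
    lo = -(1 << (bits - 1))        # -4 for 3 bits
    hi = (1 << (bits - 1)) - 1     # +3 for 3 bits
    q = math.floor(x + 0.5)        # round to nearest; ties go upward (assumption)
    return max(lo, min(hi, q))     # clamp out-of-range inputs (assumption)


def quantize_truncate(x, bits=3):
    """Truncating quantizer: the rounding staircase shifted right by 0.5."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    q = math.floor(x)              # thresholds now fall on the integers
    return max(lo, min(hi, q))


for x in (-3.75, -0.25, 0.75, 2.5):
    # Shifting the rounding function right by 0.5 gives the truncating function.
    print(x, quantize_round(x), quantize_truncate(x), quantize_round(x - 0.5))
```

Note that the truncation here is toward negative infinity, which is what a pure shift of the staircase produces; truncation toward zero would be a different function.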
Asymmetric quantizers inherently introduce a bias into the quantized values. This effect may be more easily appreciated by considering the effect of quantizing an unbiased time-domain sinusoid signal x(t) with a 1-bit two's complement quantizer having a quantizing threshold at -0.5 as shown in FIGS. 5a-5c. Quantizer 504 generates a biased discrete representation q(t) along path 506 in response to unbiased signal x(t) received along path 502. It should be pointed out that the signals x(t) and q(t) shown in FIGS. 5a and 5c, and in various other figures referred to herein, are represented as continuous-time signals. These figures are intended to represent only the envelope of discrete-time signals.
The portion of signal x(t) below the quantizing threshold is quantized with a value of -1 and the remainder of signal x(t) is quantized with a value of zero. The quantized output q(t) is a square wave having a bias or average value less than zero. The value of the bias depends upon the amplitude of signal x(t) relative to the quantizing threshold. The biasing effect of asymmetric quantizers also applies to signals in other domains such as the frequency domain.
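The bias can be made concrete with a small numerical experiment. The Python sketch below quantizes one period of a zero-mean sinusoid with the 1-bit quantizer just described. The unit amplitude, the number of samples and the function name are illustrative assumptions.

```python
import math


def quantize_1bit_asymmetric(x):
    """1-bit two's complement quantizer: -1 below the threshold at -0.5, else 0."""
    return -1 if x < -0.5 else 0


N = 1200  # samples per period (illustrative)
# The half-sample phase offset keeps every sample off the quantizing threshold.
x = [math.sin(2 * math.pi * (k + 0.5) / N) for k in range(N)]
q = [quantize_1bit_asymmetric(v) for v in x]

mean_x = sum(x) / N   # essentially zero: the input is unbiased
mean_q = sum(q) / N   # negative: the quantized output is biased
print(f"input mean {mean_x:.6f}, quantized mean {mean_q:.6f}")
```

A unit-amplitude sinusoid lies below -0.5 for one third of each period, so the quantized mean comes out at about -1/3. Reducing the amplitude below 0.5 drives the bias to zero, since no part of the signal then crosses the threshold.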
The biasing effect of asymmetric quantizers is not desirable in split-band coders because it can significantly distort the spectral energy content of the encoded signal. Such distortion cannot be easily removed by a companion split-band decoder.
Symmetric quantizers do not introduce bias into quantized values. A two's complement 3-bit quantizer can be made symmetric by adjusting the quantizer output values and quantizing thresholds. For example, the asymmetric quantizing function shown in FIG. 3 can be made symmetric by adjusting the quantizing output values to be -3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5 and 3.5, and by adjusting the quantizer thresholds to be -3, -2, -1, 0, 1, 2 and 3. The result of these adjustments is shown in FIG. 4. By using such a quantizing function in a 1-bit symmetric quantizer with a quantizer threshold at 0 as shown in FIGS. 6a-6c, an unbiased sinusoid signal is quantized into an unbiased square wave which alternates between the values -0.5 and 0.5.
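A symmetric quantizing function of this kind can be sketched in Python as shown below. The function name is hypothetical, and the clamping of out-of-range inputs and the assignment of threshold inputs to the higher output value are assumptions.

```python
import math


def quantize_symmetric(x, bits=3):
    """Symmetric quantizer: thresholds at integers, outputs at odd multiples of 0.5."""
    top = (1 << (bits - 1)) - 0.5       # 3.5 for 3 bits, 0.5 for 1 bit
    q = math.floor(x) + 0.5             # midpoint of the interval containing x
    return max(-top, min(top, q))       # clamp out-of-range inputs (assumption)


# 3-bit case: outputs mirror each other about zero.
print([quantize_symmetric(v) for v in (-3.2, -0.1, 0.1, 2.7)])

# 1-bit case: an unbiased sinusoid becomes an unbiased square wave.
N = 1000
x = [math.sin(2 * math.pi * (k + 0.5) / N) for k in range(N)]
q = [quantize_symmetric(v, bits=1) for v in x]
mean_q = sum(q) / N
print(sorted(set(q)), mean_q)
```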
Unfortunately, such a symmetric quantizer generates substantial quantizing noise for very low level signals. This effect in the time domain is illustrated in FIGS. 6a-6c. Quantizer 604 generates a relatively large amplitude representation q(t) along path 606 in response to the relatively small amplitude signal x(t) received along path 602. The quantized output q(t) for such a small amplitude signal x(t) is the same as that for a signal with a much larger amplitude, for example, one whose amplitude ranges from -0.99 to 0.99. This effect is applicable to signals in other domains such as the frequency domain. For many applications, a quantizer should ideally produce an output equal to or approximately equal to zero for very small amplitude signals.
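The loss of amplitude information can be seen directly by quantizing two sinusoids of very different amplitude. In the Python sketch below, the amplitudes 0.01 and 0.99, the sample count and the function name are illustrative assumptions.

```python
import math


def quantize_1bit_symmetric(x):
    """1-bit symmetric quantizer with a threshold at 0 and outputs -0.5 and 0.5."""
    return 0.5 if x >= 0 else -0.5


N = 1000
phase = [2 * math.pi * (k + 0.5) / N for k in range(N)]
small = [0.01 * math.sin(p) for p in phase]   # very low level signal
large = [0.99 * math.sin(p) for p in phase]   # nearly full-scale signal

q_small = [quantize_1bit_symmetric(v) for v in small]
q_large = [quantize_1bit_symmetric(v) for v in large]

# Both inputs yield the same square wave of amplitude 0.5.
print(q_small == q_large, max(q_small), max(small))
```

The quantized version of the small signal has a peak of 0.5, roughly fifty times the peak of the signal it represents.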
Dithering is one technique which is sometimes used to compensate for adverse effects of symmetric quantizers by randomizing the quantizing error. Dithering can also improve the resolution of small values. Such dithering adds a random-valued component to the signal value prior to quantization; the amplitude of the random-valued component is generally on the order of the quantizing step size. But dithering is not desirable in situations where there is little or no signal energy because a random noise-like signal is created where essentially no signal existed before.
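Both the benefit and the drawback of dithering before quantization can be illustrated with a short simulation. The Python sketch below applies dither to a 1-bit symmetric quantizer with outputs -0.5 and 0.5. The choice of a uniform dither spanning exactly one step, the input value 0.3, the fixed seed and the function names are all assumptions made for illustration.

```python
import random


def quantize_1bit_symmetric(x):
    """1-bit symmetric quantizer with a threshold at 0 and outputs -0.5 and 0.5."""
    return 0.5 if x >= 0 else -0.5


def quantize_with_dither(x, rng):
    """Add uniform dither spanning one step (assumed) before quantizing."""
    return quantize_1bit_symmetric(x + rng.uniform(-0.5, 0.5))


rng = random.Random(0)   # fixed seed so the run is repeatable
N = 20_000

# Benefit: the average of many dithered outputs tracks a small input value.
x = 0.3
plain_mean = sum(quantize_1bit_symmetric(x) for _ in range(N)) / N
dither_mean = sum(quantize_with_dither(x, rng) for _ in range(N)) / N
print(plain_mean, dither_mean)

# Drawback: with no input signal at all, the output is a random sequence
# of -0.5 and 0.5, that is, noise where no signal existed.
silence = [quantize_with_dither(0.0, rng) for _ in range(N)]
silence_power = sum(v * v for v in silence) / N
print(sorted(set(silence)), silence_power)
```

Without dither the input 0.3 is always reported as 0.5; with dither the long-run average approaches 0.3. For a silent input, however, the dithered output still carries a mean-square level of 0.25.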
Dithering may also be used to advantageous effect in a split-band decoder after dequantization because it also tends to randomize quantizing errors, producing a noise-like effect which is often less obtrusive than that caused by the quantizing errors alone. In transform decoders, for example, quantizing errors tend to produce tone-like components because the quantizing errors are in the frequency domain. Dithering in a split-band decoder is not always desirable because, unless corrective measures are taken, it will significantly increase the apparent level of quantizing noise for values which are much smaller than the quantizing step size. Dithering should be used in a decoder only in situations where the signal magnitude is not substantially less than the quantizing step size, but without "side information," this signal information is available only in the encoder. Generally, the use of such "side information" is not desirable because the bits required to carry the side information increase the informational requirements of the coded signal.