Quantization is the process of approximating a continuous or quasi-continuous (digital but relatively high-resolving) range of values by a discrete set of values. Simple examples of quantization include rounding of a real number to an integer, bit-depth transition and analogue-to-digital conversion. In the latter case, an analogue signal is expressed in terms of digital reference levels. Integer quantization indices may be used for labelling the reference levels. As used herein, quantization does not necessarily include changing the time resolution of the signal, such as by sampling or downsampling it with respect to time.
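By way of illustration, the following minimal sketch shows a uniform scalar quantizer in Python. The step size, the sample values and the function names `quantize` and `reconstruct` are assumptions chosen for this example only. Each value is mapped to the integer index of its nearest reference level, and the index is later decoded back to that level.

```python
import math

def quantize(x: float, step: float) -> int:
    """Return the integer index of the reference level nearest to x."""
    # floor(x/step + 0.5) rounds to the nearest level, with ties going upwards.
    return math.floor(x / step + 0.5)

def reconstruct(index: int, step: float) -> float:
    """Decode a quantization index back to its reference level."""
    return index * step

step = 0.25                          # assumed spacing of the reference levels
samples = [0.12, -0.49, 0.77, 1.30]  # assumed quasi-continuous input values

indices = [quantize(x, step) for x in samples]
decoded = [reconstruct(i, step) for i in indices]

print(indices)  # [0, -2, 3, 5]
print(decoded)  # [0.0, -0.5, 0.75, 1.25]
```

In this sketch the time resolution is unchanged: one index is produced per input value, and the reconstruction error of each value is at most half a step.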
Quasi-continuous numbers, such as those formed at the output of an analogue-to-digital converter, are commonly quantized to enable transmission over a communication network at a relatively low rate. The reconstruction step at the receiving end consists of decoding the quantization index to a quasi-continuous representation. This decoded representation may form the input to a digital-to-analogue converter. However, at least if only a moderate number of reference levels are applied, perceptible quantization noise and artefacts may occur in the reconstructed signal. In transform-based quantization of audio signals, where the source signal is decomposed into frequency components, the reconstructed signal may exhibit ‘birdies’, an unpleasant artefact which is perceived somewhat like the sound of running water. In a spectrogram, that is, a time-frequency plot of the signal power, ‘birdies’ may have the appearance of islands: weak frequency components surrounded by other components which due to quantization are encoded with zero power intermittently, so that the non-zero episodes occupy isolated areas.
The above problem, and possibly other drawbacks associated with quantization, may be mitigated by increasing the bit rate. However, considering that expected savings in bandwidth and storage are among the main motivations for quantization, this circumvents rather than solves the problem.
One approach to making quantizers efficient is to optimize the quantizer resolution so as to minimize the average distortion for a fixed rate or for a given average rate. For fixed-rate coders this leads to a variable quantization resolution, whereas for variable-rate coders it leads to an asymptotically uniform resolution.
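For the fixed-rate case, the optimization can be sketched with Lloyd's algorithm, which is one well-known design method and is not necessarily the one contemplated above. The Gaussian test data, the four starting levels (a 2-bit rate) and the function names `lloyd` and `distortion` are assumptions made for this example.

```python
import random

def lloyd(samples, levels, iterations=50):
    """Refine reference levels by alternating nearest-level assignment and cell averaging."""
    levels = sorted(levels)
    for _ in range(iterations):
        cells = [[] for _ in levels]
        for x in samples:
            nearest = min(range(len(levels)), key=lambda i: (x - levels[i]) ** 2)
            cells[nearest].append(x)
        # Move each level to the mean of its cell; keep it unchanged if the cell is empty.
        levels = [sum(c) / len(c) if c else lv for c, lv in zip(cells, levels)]
    return levels

def distortion(samples, levels):
    """Mean squared error when each sample is replaced by its nearest level."""
    return sum(min((x - lv) ** 2 for lv in levels) for x in samples) / len(samples)

random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(5000)]  # assumed source statistics
initial = [-1.5, -0.5, 0.5, 1.5]                         # uniform starting levels

trained = lloyd(samples, initial)
d_initial = distortion(samples, initial)
d_trained = distortion(samples, trained)
print(trained, d_initial, d_trained)
```

Neither step of an iteration can increase the mean squared error on the training samples, so the trained levels are never worse than the starting levels. For a peaked source such as this one, the resulting levels are in general no longer equally spaced, which corresponds to the variable resolution mentioned above.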
Dithering, that is, adding stochastic noise in connection with the reconstruction of the signal, may improve the audible impression, even though it increases the mean squared error. Indeed, it has been established that some artefacts are associated with an unintended statistical correlation between the quantization error and the source signal value, which is all the more perceptible the more the error repeats. The dithering noise, however, alienates the reconstructed signal from the source signal in terms of probability densities, and there is no theoretical upper bound on the difference.
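The trade-off described above can be sketched as follows. The periodic test tone, the step size and the uniformly distributed dither are assumptions made for this example and are not prescribed by the text. Without dither the quantization error is a deterministic function of the source value and therefore repeats with the signal; with dither added at reconstruction the repetition is broken, at the cost of a larger mean squared error.

```python
import math
import random

STEP = 0.25   # assumed quantizer step size
PERIOD = 50   # assumed period of the test tone, in samples

def quantize(x, step):
    """Return the integer index of the reference level nearest to x."""
    return math.floor(x / step + 0.5)

def mse(errors):
    """Mean squared value of a list of errors."""
    return sum(e * e for e in errors) / len(errors)

random.seed(1)
one_period = [0.9 * math.sin(2 * math.pi * k / PERIOD) for k in range(PERIOD)]
signal = one_period * 200   # exactly periodic source signal

# Plain reconstruction, and reconstruction with uniform dither added.
plain = [quantize(x, STEP) * STEP for x in signal]
dithered = [y + random.uniform(-STEP / 2, STEP / 2) for y in plain]

plain_err = [y - x for x, y in zip(signal, plain)]
dithered_err = [y - x for x, y in zip(signal, dithered)]

mse_plain = mse(plain_err)
mse_dithered = mse(dithered_err)

# Compare the error over the first two periods of the signal.
error_repeats = plain_err[:PERIOD] == plain_err[PERIOD:2 * PERIOD]
dither_repeats = dithered_err[:PERIOD] == dithered_err[PERIOD:2 * PERIOD]
print(mse_plain, mse_dithered, error_repeats, dither_repeats)
```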
In addition to these attempts to improve the quantization itself, the field of audio technology offers several techniques for removing the ‘birdies’ artefact a posteriori: band limitation (see M. Erne, “Perceptual audio coders ‘what to listen for’”, 111th Convention of the Audio Engineering Society, September 2001), a regularization method for tonal-like signals (see L. Daudet and M. Sandler, “MDCT analysis of sinusoids: exact results and applications to coding artifacts reduction”, IEEE Transactions on Speech and Audio Processing, vol. 12, no. 3, May 2004) and noise fill (see S. A. Ramprashad, “High quality embedded wideband speech coding using an inherently layered coding paradigm,” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing ICASSP '00, vol. 2, June 2000).
On the one hand, it is well known that low-rate video coding results in artefacts such as blurriness, ringing, and blocking. On the other hand, textures of video objects with a high perceived quality can be created by means of statistical parametric models (see, e.g., J. Portilla and E. P. Simoncelli, “A parametric texture model based on joint statistics of complex wavelet coefficients”, International Journal of Computer Vision, vol. 40, no. 1, pp. 49-71, 2000). However, high-quality parametric models do not provide an exact description of the original image, and no certainty exists about their perceived accuracy.