General-purpose perceptual audio coders achieve relatively high coding gains by using transforms such as the Modified Discrete Cosine Transform (MDCT) with block sizes that cover several tens of milliseconds (e.g. 20 ms). Examples of such transform-based audio codec systems are Advanced Audio Coding (AAC) and High Efficiency (HE)-AAC. However, when such transform-based audio codec systems are used for voice signals, the quality of voice signals degrades faster than that of musical signals as the bitrate decreases, especially in the case of dry (non-reverberant) speech signals.
The present document describes a transform-based audio codec system which is particularly well suited for the coding of speech signals. Furthermore, the present document describes a quantization scheme which may be used in such a transform-based audio codec system. Various different quantization schemes may be used in conjunction with transform-based audio codec systems. Examples are vector quantization (e.g., Twin vector quantization), distribution preserving quantization, dithered quantization, scalar quantization with a random offset, and scalar quantization combined with a noise-fill (e.g., the quantizer described in U.S. Pat. No. 7,447,631). These different quantization schemes have various advantages and disadvantages with regard to one or more of the following attributes:
    operational (encoder) complexity, which typically includes the computational complexity of quantization and of generation of the bitstream (e.g., variable length coding);
    perceptual performance, which may be estimated based on theoretical considerations (rate-distortion performance) and based on features of the associated noise-filling behavior (e.g., at bit-rates that are practically relevant to low-rate transform coding of speech);
    complexity of the bit-rate allocation process in the presence of an overall bit-rate constraint (e.g., a maximum number of bits); and/or
    flexibility with regard to enabling different data-rates and different distortion levels.
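By way of illustration only, one of the schemes listed above, scalar quantization with a random offset, may be sketched in Python as follows. The sketch is not the quantizer of the present document. The function names, the step size, the Gaussian test signal, and the assumption that the decoder can regenerate the same offset (e.g., from a shared pseudo-random seed) are all hypothetical and are chosen only to contrast the noise-filling behavior of an offset quantizer with that of a plain uniform quantizer at low rates.

```python
import numpy as np


def quantize_plain(x, step):
    """Plain uniform (mid-tread) scalar quantizer: round to the nearest multiple of step."""
    return step * np.round(x / step)


def quantize_with_offset(x, step, offset):
    """Scalar quantization with a random offset (subtractive dither).

    The offset is added before rounding and subtracted again after
    reconstruction. The decoder is ASSUMED to know the same offset values.
    """
    return step * np.round((x + offset) / step) - offset


# Hypothetical test data: coefficients that are small relative to the step size,
# which mimics a low-rate operating point.
rng = np.random.default_rng(0)
step = 1.0
x = 0.2 * rng.standard_normal(10000)
offset = rng.uniform(-step / 2, step / 2, size=x.shape)

plain = quantize_plain(x, step)
dithered = quantize_with_offset(x, step, offset)

# Fraction of coefficients reconstructed as exactly zero by each quantizer.
zero_fraction_plain = np.mean(plain == 0)
zero_fraction_dithered = np.mean(dithered == 0)

# Mean squared reconstruction error of each quantizer.
mse_plain = np.mean((plain - x) ** 2)
mse_dithered = np.mean((dithered - x) ** 2)
```

Under these assumptions, the plain quantizer maps most of the small coefficients to zero, whereas the offset quantizer yields a reconstruction error that is bounded by half a step and approximately uniformly distributed regardless of the input. This avoids spectral holes at the price of a higher mean squared error for such low-variance inputs, which is an instance of the trade-off between rate-distortion performance and noise-filling behavior noted in the list of attributes.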
In the present document, a quantization scheme is described which addresses at least some of the above-mentioned attributes. In particular, a quantization scheme is described which provides improved performance with regard to some or all of the above-mentioned attributes.