Audio encoders and decoders (codecs) are used for a wide variety of applications in communication, multimedia and storage systems. An audio encoder is used for encoding audio signals, like speech, in particular for enabling an efficient transmission or storage of the audio signal, while an audio decoder constructs a synthesized signal based on a received encoded signal.
When implementing codecs, it is thus an aim to save transmission and storage capacity while maintaining a high quality of the synthesized signal. Robustness against transmission errors is also important, especially in mobile and voice over internet protocol (VoIP) applications. On the other hand, the complexity of the codec is limited by the processing power of the application platform.
In a typical speech encoder, the input speech signal is processed in segments, which are called frames. Usually the frame length is 10-30 ms. A lookahead segment of 5-15 ms of the subsequent frame may be available in addition. The frame may further be divided into a number of subframes. For every frame, the encoder determines a parametric representation of the input signal. The parameters are quantized and transmitted through a communication channel or stored in a storage medium in a digital form. At the receiving end, the decoder constructs a synthesized signal based on the received parameters.
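The framing described above can be illustrated with a minimal sketch. The function name split_into_frames, the 8 kHz sampling rate, the 20 ms frame length and the 5 ms lookahead are assumptions chosen for illustration only; they lie within the ranges mentioned above but are not taken from any particular codec.

```python
def split_into_frames(samples, sample_rate_hz=8000, frame_ms=20, lookahead_ms=5):
    """Split a sample sequence into (frame, lookahead) pairs.

    All default values are illustrative assumptions, not codec constants.
    """
    frame_len = sample_rate_hz * frame_ms // 1000     # samples per frame
    look_len = sample_rate_hz * lookahead_ms // 1000  # samples of lookahead
    frames = []
    # Only complete frames are emitted; a trailing partial frame is dropped.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        end = start + frame_len
        frame = samples[start:end]
        # The lookahead is taken from the start of the subsequent frame and
        # may be shorter, or empty, at the end of the signal.
        lookahead = samples[end:end + look_len]
        frames.append((frame, lookahead))
    return frames
```

With these assumed values, each frame holds 160 samples and each lookahead segment up to 40 samples. The encoder would then derive one set of parameters per returned pair.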
The construction of the parameters and the quantization are usually based on codebooks, which contain codevectors optimized for the quantization task. In many cases, higher compression ratios require highly optimized codebooks. Often the performance of a quantizer can be improved for a given compression ratio by using prediction from the previous frame. Such a quantization will be referred to in the following as predictive quantization, in contrast to a non-predictive quantization, which does not rely on any information from preceding frames. A predictive quantization exploits a correlation between a current audio frame and at least one neighboring audio frame to obtain a prediction for the current frame, so that, for instance, only deviations from this prediction have to be encoded. This in turn requires dedicated codebooks.
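The difference between the two quantization types can be sketched as follows. This is a simplified illustration and not a definitive implementation: the function names, the first-order predictor with a fixed coefficient rho, and the use of a squared-error nearest-neighbor search are all assumptions made for the example.

```python
def nearest_index(codebook, target):
    """Index of the codevector with the smallest squared error to target."""
    def sq_err(codevector):
        return sum((c - t) ** 2 for c, t in zip(codevector, target))
    return min(range(len(codebook)), key=lambda i: sq_err(codebook[i]))


def quantize_non_predictive(params, codebook):
    """Quantize the parameter vector directly; no earlier frame is needed."""
    idx = nearest_index(codebook, params)
    return idx, list(codebook[idx])


def quantize_predictive(params, prev_decoded, residual_codebook, rho=0.5):
    """Quantize only the deviation from a prediction.

    The prediction is rho times the previously decoded vector (an assumed,
    deliberately simple predictor). The residual codebook is a dedicated
    codebook for the prediction error.
    """
    prediction = [rho * p for p in prev_decoded]
    residual = [x - p for x, p in zip(params, prediction)]
    idx = nearest_index(residual_codebook, residual)
    decoded = [p + r for p, r in zip(prediction, residual_codebook[idx])]
    return idx, decoded
```

In both cases only the index is transmitted. In the predictive case the decoder can rebuild the vector only if it holds the same previously decoded vector as the encoder.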
Predictive quantization might result in problems, however, in case of errors in transmission or storage. With predictive quantization, a new frame cannot be decoded perfectly, even when received correctly, if at least one preceding frame on which the prediction is based is erroneous. It is therefore possible to use a non-predictive quantization once in a while, in order to prevent long runs of error propagation. For such an occasional non-predictive quantization, which is also referred to as “safety-net” quantization, a codebook selector can be employed for selecting between predictive and non-predictive codebooks.
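The error propagation and its interruption by a safety-net frame can be illustrated with a toy example. The sketch below makes several simplifying assumptions that are not part of the description above: parameters are scalars, residuals are sent without quantization, the selector simply chooses the non-predictive mode at a fixed interval, and a lost frame is concealed by repeating the previous output. The names encode_stream and decode_stream are hypothetical.

```python
def encode_stream(values, safety_net_every, rho=0.5):
    """Encode each value as ('S', value) or ('P', residual).

    'S' marks a non-predictive (safety-net) frame, 'P' a predictive frame.
    A safety_net_every of 0 disables safety-net frames entirely.
    """
    prev = 0.0
    frames = []
    for n, x in enumerate(values):
        if safety_net_every and n % safety_net_every == 0:
            frames.append(('S', x))
        else:
            frames.append(('P', x - rho * prev))
        prev = x  # residuals are lossless here, so the encoder state is x
    return frames


def decode_stream(frames, rho=0.5):
    """Decode frames; None marks a frame lost in transmission or storage."""
    prev = 0.0
    out = []
    for frame in frames:
        if frame is None:
            value = prev  # assumed concealment: repeat the previous output
        else:
            mode, payload = frame
            value = payload if mode == 'S' else rho * prev + payload
        out.append(value)
        prev = value
    return out
```

For the input 4, 2, 6, 8, 4, 2 with a safety-net frame every fourth frame, losing the second frame yields the output 4, 4, 7, 8.5, 4, 2: the two correctly received predictive frames after the loss are still decoded incorrectly, and the output is correct again from the safety-net frame onward. Without any safety-net frames the deviation persists through the rest of the sequence, although it decays in this example because rho is smaller than one.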