1. Field of the Invention
This invention relates generally to digital communications, and more particularly, to digital coding (or compression) of speech and/or audio signals.
2. Related Art
In speech or audio coding, the coder encodes the input speech or audio signal into a digital bit stream for transmission or storage, and the decoder decodes the bit stream into an output speech or audio signal. The combination of the coder and the decoder is called a codec.
In the field of speech coding, predictive coding is a very popular technique. Prediction of the input waveform is used to remove redundancy from the waveform, and instead of quantizing an input speech waveform directly, a residual signal waveform is quantized. The predictor(s) used in predictive coding can be either backward adaptive or forward adaptive predictors. Backward adaptive predictors do not require any side information as they are derived from a previously quantized waveform, and therefore can be derived at a decoder. On the other hand, forward adaptive predictor(s) require side information to be transmitted to the decoder as they are derived from the input waveform, which is not available at the decoder.
In the field of speech coding, two types of predictors are commonly used. A first type of predictor is called a short-term predictor. It is aimed at removing redundancy between nearby samples in the input waveform. This is equivalent to removing a spectral envelope of the input waveform. A second type of predictor is often referred to as a long-term predictor. It removes redundancy between samples further apart, typically spaced by a time difference that is constant for a suitable duration. For speech, this time difference is typically equivalent to a local pitch period of the speech signal, and consequently the long-term predictor is often referred to as a pitch predictor. The long-term predictor removes a harmonic structure of the input waveform. A residual signal remaining after the removal of redundancy by the predictor(s) is quantized along with any information needed to reconstruct the predictor(s) at the decoder.
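By way of illustration only, the two prediction stages described above can be sketched as follows. The function names, the single-tap form of the long-term predictor, the zero initial conditions, and the toy input are assumptions made for this sketch and are not taken from any particular codec.

```python
def short_term_residual(signal, coeffs):
    """d[n] = s[n] - sum_i a_i * s[n - i]; samples before n = 0 are taken as zero."""
    residual = []
    for n, sample in enumerate(signal):
        prediction = sum(
            a * signal[n - i]
            for i, a in enumerate(coeffs, start=1)
            if n - i >= 0
        )
        residual.append(sample - prediction)
    return residual


def long_term_residual(signal, pitch_period, gain):
    """e[n] = d[n] - gain * d[n - pitch_period] (hypothetical single-tap pitch predictor)."""
    return [
        sample - (gain * signal[n - pitch_period] if n >= pitch_period else 0.0)
        for n, sample in enumerate(signal)
    ]


# Toy input that repeats every 4 samples, standing in for a pitch period of 4.
s = [1.0, 0.5, -1.0, -0.5] * 4
d = short_term_residual(s, coeffs=[0.5])            # removes nearby-sample redundancy
e = long_term_residual(d, pitch_period=4, gain=1.0)  # removes the periodic structure
```

Because the toy input is exactly periodic, the long-term residual e vanishes once both predictors have a full history available. Real speech is only approximately periodic, so a nonzero residual remains to be quantized.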
This quantization of the residual signal provides a series of bits representing a compressed version of the residual signal. This compressed version of the residual signal is often denoted the excitation signal and is used to reconstruct an approximation of the input waveform at the decoder in combination with the predictor(s). Generating the series of bits representing the excitation signal is commonly denoted excitation quantization and generally requires a search for, and selection of, a best or preferred candidate excitation among a set of candidate excitations with respect to some cost function. The search and selection require a number of mathematical operations to be performed, which translates into a certain computational complexity when the operations are implemented on a signal processing device. It is advantageous to minimize the number of mathematical operations in order to minimize the power consumption, and maximize the available processing bandwidth, of the signal processing device.
Excitation quantization in predictive coding can be based on a sample-by-sample quantization of the excitation. This is referred to as Scalar Quantization (SQ). Techniques for performing Scalar Quantization of the excitation are relatively simple, and thus, the computational complexity associated with SQ is relatively manageable.
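By way of illustration only, sample-by-sample quantization can be sketched with a uniform quantizer. The mid-rise structure, the step size, the number of levels, and the function name are assumptions chosen for this sketch.

```python
import math


def scalar_quantize(sample, step, num_levels):
    """Return (index, reconstruction) for a uniform mid-rise scalar quantizer."""
    half = num_levels // 2
    index = math.floor(sample / step)          # cell that contains the sample
    index = max(-half, min(half - 1, index))   # clamp to the available levels
    return index, (index + 0.5) * step         # reconstruct at the cell midpoint


# Each excitation sample is quantized independently of the others.
excitation = [0.3, -0.3, 5.0, -0.9]
results = [scalar_quantize(x, step=0.5, num_levels=4) for x in excitation]
indices = [idx for idx, _ in results]
quantized = [value for _, value in results]
```

With four levels, each sample costs two bits, and the work per sample is a division, a floor, and a clamp, which is why the complexity of SQ stays low.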
Alternatively, the excitation can be quantized based on groups of samples. Quantizing groups of samples is often referred to as Vector Quantization (VQ), and when applied to the excitation, simply as excitation VQ. The use of VQ can provide performance superior to that of SQ, and may be necessary when the number of coding bits per residual signal sample becomes small (typically fewer than two bits per sample). Also, VQ can provide a greater flexibility in bit-allocation as compared to SQ, since a fractional number of bits per sample can be used. However, excitation VQ can be relatively complex when compared to excitation SQ. Therefore, there is a need to reduce the complexity of excitation VQ as used in a predictive coding environment.
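By way of illustration only, a brute-force VQ codebook search can be sketched as follows. The codebook contents, the vector dimension, the squared-error cost, and the function name are assumptions made for this sketch. An actual excitation VQ in a predictive coder would typically evaluate the cost through the predictor filters, which this sketch omits.

```python
import math


def vq_search(target, codebook):
    """Exhaustive search: index of the codevector with the smallest squared error."""
    def squared_error(codevector):
        return sum((t - c) ** 2 for t, c in zip(target, codevector))

    return min(range(len(codebook)), key=lambda i: squared_error(codebook[i]))


# Hypothetical codebook of 8 codevectors, each covering 4 samples.
codebook = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, -1.0, 0.0],
    [0.0, 1.0, 0.0, -1.0],
    [1.0, 1.0, 1.0, 1.0],
    [-1.0, -1.0, -1.0, -1.0],
    [1.0, -1.0, 1.0, -1.0],
    [-1.0, 1.0, -1.0, 1.0],
    [0.5, 0.5, -0.5, -0.5],
]

best = vq_search([0.9, 0.1, -0.8, 0.2], codebook)
bits_per_sample = math.log2(len(codebook)) / len(codebook[0])
```

Here a 3-bit index covers 4 samples, giving 0.75 bits per sample, a rate that a sample-by-sample quantizer cannot reach. The cost is that every candidate codevector is evaluated for every input vector, which is the source of the added complexity.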
One type of predictive coding is Noise Feedback Coding (NFC), wherein noise feedback filtering is used to shape coding noise, in order to improve a perceptual quality of quantized speech. Therefore, it would be advantageous to use excitation VQ with noise feedback coding, and further, to do so in a computationally efficient manner.
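By way of illustration only, the noise-shaping effect of noise feedback can be sketched in a deliberately reduced form. This sketch assumes a first-order noise feedback filter and a uniform scalar quantizer, and it omits the short-term and long-term predictors entirely, so it is not a complete NFC codec. All names and parameter values are hypothetical.

```python
def noise_feedback_quantize(signal, step, feedback_coeff):
    """Quantize with first-order noise feedback.

    Returns (outputs, errors), where errors[n] is the quantizer error
    q[n] = y[n] - v[n] and v[n] is the quantizer input.
    """
    prev_error = 0.0
    outputs, errors = [], []
    for x in signal:
        v = x - feedback_coeff * prev_error  # feed back the filtered past error
        y = step * round(v / step)           # uniform scalar quantizer
        q = y - v                            # quantizer error for this sample
        outputs.append(y)
        errors.append(q)
        prev_error = q
    return outputs, errors


x = [0.13, 0.42, -0.27, 0.81, -0.66, 0.05, 0.38, -0.91]
c = 0.8
y, q = noise_feedback_quantize(x, step=0.25, feedback_coeff=c)

# The overall coding noise is the quantizer error passed through 1 - c*z^-1.
coding_noise = [yn - xn for yn, xn in zip(y, x)]
shaped_error = [q[n] - c * (q[n - 1] if n > 0 else 0.0) for n in range(len(q))]
```

In this reduced form the coding noise y[n] - x[n] equals q[n] - c*q[n-1], so the choice of the feedback coefficient determines the spectral shape of the noise rather than its presence.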