1. Field of the Invention
The present invention generally relates to systems that encode audio signals, such as speech signals, for transmission or storage and/or that decode encoded audio signals for playback.
2. Background
Speech coding refers to the application of data compression to audio signals that contain speech, which are referred to herein as “speech signals.” In speech coding, a “coder” encodes an input speech signal into a digital bit stream for transmission or storage, and a “decoder” decodes the bit stream into an output speech signal. The combination of the coder and the decoder is called a “codec.” The goal of speech coding is usually to reduce the encoding bit rate while maintaining a certain degree of speech quality. For this reason, speech coding is sometimes referred to as “speech compression” or “voice compression.”
The encoding of a speech signal typically involves applying signal processing techniques to estimate parameters that model the speech signal. In many coders, the speech signal is processed as a series of time-domain segments, often referred to as “frames” or “sub-frames,” and a new set of parameters is calculated for each segment. Data compression algorithms are then utilized to represent the parameters associated with each segment in a compact bit stream. Different codecs may utilize different parameters to model the speech signal. By way of example, the BROADVOICE16™ (“BV16”) codec, which is described by J.-H. Chen and J. Thyssen in “The BroadVoice Speech Coding Algorithm,” Proceedings of 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. IV-537-IV-540, April 2007, is a two-stage noise feedback codec that encodes Line-Spectrum Pair (LSP) parameters, a pitch period, three pitch taps, excitation gain and excitation vectors associated with each 5 ms frame of an audio signal. Other codecs may encode different parameters.
As noted above, the goal of speech coding is usually to reduce the encoding bit rate while maintaining a certain degree of speech quality. There are many practical reasons for seeking to reduce the encoding bit rate. Motivating factors may include, for example, the conservation of bandwidth in a two-way speech communication scenario or the reduction of memory requirements in an application that stores encoded speech for subsequent playback. To this end, codec designers are often tasked with reducing the number of bits required to generate an encoded representation of each speech signal segment. This may involve modifying the way a codec models speech and/or reducing the number of bits used to represent one or more parameters associated with a particular speech model. However, to provide a decoded speech signal of good or even reasonable quality, there are limits to the amount of compression that can be applied by any coder. What is needed, then, are techniques for reducing the encoding bit rate of a codec in a manner that will result in relatively little degradation of a decoded speech signal generated by the codec.