1. Technical Field
The present invention relates generally to speech encoding and decoding in voice communication systems; and, more particularly, it relates to various techniques used with code-excited linear prediction coding to obtain high quality speech reproduction through a limited bit rate communication channel.
2. Related Art
Signal modeling and parameter estimation play significant roles in communicating voice information over bandwidth-limited channels. To model basic speech sounds, speech signals are sampled as a discrete waveform to be digitally processed. In one type of signal coding technique called LPC (linear predictive coding), the signal value at any particular time index is modeled as a linear function of previous values. A subsequent sample is thus linearly predictable from earlier samples. As a result, efficient signal representations can be determined by estimating and applying certain prediction parameters to represent the signal.
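By way of illustration only, the linear prediction described above may be sketched as follows. The second-order model, the coefficient values, and the function name `lpc_residual` are assumptions chosen for the example and are not taken from the present application.

```python
def lpc_residual(signal, coeffs):
    """Return the prediction error e[n] = s[n] - sum_k a_k * s[n-k] for n >= order."""
    order = len(coeffs)
    residual = []
    for n in range(order, len(signal)):
        # Predict the current sample as a linear function of the previous `order` samples.
        predicted = sum(a * signal[n - k] for k, a in enumerate(coeffs, start=1))
        residual.append(signal[n] - predicted)
    return residual


# Hypothetical second-order predictor (illustrative values only).
coeffs = [1.5, -0.7]

# Build a signal that obeys the model exactly, so the predictor should leave no error.
signal = [1.0, 0.5]
for n in range(2, 20):
    signal.append(coeffs[0] * signal[n - 1] + coeffs[1] * signal[n - 2])

residual = lpc_residual(signal, coeffs)
```

When the signal follows the model, the residual is essentially zero, so only the prediction parameters need to be conveyed. Real speech leaves a nonzero residual that the coder must also represent.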
Applying LPC techniques, a conventional source encoder operates on speech signals to extract modeling and parameter information for communication to a conventional source decoder via a communication channel. Once received, the decoder attempts to reconstruct a counterpart signal for playback that sounds to a human ear like the original speech.
A certain amount of communication channel bandwidth is required to communicate the modeling and parameter information to the decoder. In some embodiments, for example those in which the channel bandwidth is shared and real-time reconstruction is necessary, a reduction in the required bandwidth proves beneficial. However, using conventional modeling techniques, the quality requirements in the reproduced speech limit the reduction of such bandwidth below certain levels.
With CELP (code-excited linear prediction) type speech coders, mistakes in estimating pitch lag cause degradation in the resulting speech quality. In conventional speech coders, such mistakes often occur, for example, when the coder identifies a pitch lag value that is double or triple the actual pitch lag sought. Similarly, incorrect identification sometimes yields a pitch lag value that is less than, and even half of, the actual pitch lag sought.
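The following minimal sketch suggests why doubled or tripled pitch lag values arise so readily. It assumes an idealized pulse train in place of speech and a normalized correlation measure. The helper name `normalized_correlation`, the 40-sample period, and the signal length are all hypothetical choices for the example.

```python
import math


def normalized_correlation(signal, lag):
    """Normalized correlation between the signal and a copy delayed by `lag` samples."""
    current = signal[lag:]
    delayed = signal[:len(signal) - lag]
    cross = sum(c * d for c, d in zip(current, delayed))
    energy = math.sqrt(sum(c * c for c in current) * sum(d * d for d in delayed))
    return cross / energy if energy > 0 else 0.0


TRUE_LAG = 40  # assumed pitch period in samples

# Idealized periodic excitation: one unit pulse every TRUE_LAG samples.
signal = [1.0 if n % TRUE_LAG == 0 else 0.0 for n in range(320)]

# Evaluate the true lag, a non-multiple, and two integer multiples of the true lag.
scores = {lag: normalized_correlation(signal, lag) for lag in (40, 60, 80, 120)}
```

In this idealized case, lags of 40, 80, and 120 samples all score equally, so a search based on correlation alone has no basis for preferring the true lag over its multiples.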
Further limitations and disadvantages of conventional systems will become apparent to one of skill in the art after reviewing the remainder of the present application with reference to the drawings.
Various aspects of the present invention can be found in a speech encoding system using an analysis by synthesis approach on a speech signal that has a previous pitch lag and a current pitch lag. The speech encoding system comprises an adaptive codebook and an encoder processing circuit. The encoder processing circuit identifies a plurality of pitch lag candidates. From these candidates, the encoder processing circuit attempts to identify the current pitch lag by selecting one of the plurality of pitch lag candidates after considering timing relationships between the previous pitch lag and at least one of the plurality of pitch lag candidates.
The encoder processing circuit may also identify integer multiple timing relationships between at least two of the plurality of pitch lag candidates. Such a timing relationship may also be used in the selection of the one of the plurality of pitch lag candidates.
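One possible way to detect such integer multiple timing relationships is sketched below. The tolerance of two samples, the multiples tested (2, 3, and 4), and the function names are assumptions made for illustration; the present application does not specify them here.

```python
def is_integer_multiple(longer_lag, shorter_lag, tolerance=2):
    """True if longer_lag lies within `tolerance` samples of k * shorter_lag for k in 2..4."""
    return any(abs(longer_lag - k * shorter_lag) <= tolerance for k in (2, 3, 4))


def multiple_pairs(candidates, tolerance=2):
    """Return (shorter, longer) candidate pairs that appear to be integer multiples."""
    ordered = sorted(candidates)
    return [
        (shorter, longer)
        for i, shorter in enumerate(ordered)
        for longer in ordered[i + 1:]
        if is_integer_multiple(longer, shorter, tolerance)
    ]


# Hypothetical candidate lags, in samples.
pairs = multiple_pairs([41, 57, 80, 122])
# pairs == [(41, 80), (41, 122)]
```

A small tolerance is used because measured lags are rarely exact multiples of one another.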
The consideration of the timing relationships between the previous pitch lag and one of the pitch lag candidates may involve favoring that candidate because the favored candidate and the previous pitch lag have the same, or nearly the same, value.
In some embodiments, the aforementioned “favoring” involves application of a weighting factor to at least one of the plurality of pitch lag candidates. The pitch lag candidates may be found by applying correlation techniques, in which case the weighting factor is applied to such correlation.
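A minimal sketch of such favoring follows, assuming that each candidate already has a normalized correlation value and that a fixed multiplicative weight is applied to candidates near the previous pitch lag. The weight of 1.1, the closeness threshold of three samples, and the name `select_lag` are hypothetical.

```python
def select_lag(correlations, previous_lag, closeness=3, weight=1.1):
    """Select the lag with the highest correlation after weighting.

    correlations: dict mapping candidate lag (samples) to normalized correlation.
    Candidates within `closeness` samples of previous_lag have their
    correlation multiplied by `weight`; all others are left unchanged.
    """
    def score(lag):
        boost = weight if abs(lag - previous_lag) <= closeness else 1.0
        return correlations[lag] * boost

    return max(correlations, key=score)


# Hypothetical correlation values in which the doubled lag scores slightly highest.
candidates = {40: 0.82, 80: 0.85, 120: 0.80}

unweighted = max(candidates, key=candidates.get)   # 80, the doubled lag
chosen = select_lag(candidates, previous_lag=41)   # 40, since 0.82 * 1.1 exceeds 0.85
```

The weight only tips close contests. If no candidate lies near the previous pitch lag, the selection reverts to the unweighted maximum.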
Further aspects of the present invention can be found in a method used by a speech encoding system that applies an analysis by synthesis coding approach to a speech signal. The method employed may comprise the identification of a plurality of pitch lag candidates. The encoding system also uses an adaptive weighting factor to favor at least one of the pitch lag candidates over at least one other of the pitch lag candidates. One of the plurality of pitch lag candidates is selected as a current pitch lag estimate.
The method may further involve adjustments of the adaptive weighting factor. For example, the encoder system may adjust the adaptive weighting factor if an integer multiple timing relationship is detected between at least two of the plurality of pitch lag candidates. Similarly, adjustments may be made if a timing relationship is detected between the previous pitch lag and any one of the plurality of pitch lag candidates. Moreover, the variations and aspects of the speech encoder system described above may also apply to this method.
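The method above may be sketched as follows, under stated assumptions. Each candidate starts from a base weighting factor, which is increased when a longer candidate appears to be an integer multiple of it and increased again when it lies near the previous pitch lag. Favoring the shorter lag of a multiple pair, the bonus values, and the name `estimate_pitch_lag` are illustrative choices, not details taken from the present application.

```python
def estimate_pitch_lag(correlations, previous_lag=None, base_weight=1.0,
                       multiple_bonus=0.08, continuity_bonus=0.08, tolerance=2):
    """Select a current pitch lag estimate using per-candidate adaptive weights.

    correlations: dict mapping candidate lag (samples) to normalized correlation.
    """
    lags = sorted(correlations)
    weights = {lag: base_weight for lag in lags}

    for i, shorter in enumerate(lags):
        # Adjustment 1 (assumed policy): a longer candidate near 2x or 3x this lag
        # suggests the longer one is a multiple, so favor the shorter lag.
        if any(abs(longer - k * shorter) <= tolerance
               for longer in lags[i + 1:] for k in (2, 3)):
            weights[shorter] += multiple_bonus

        # Adjustment 2: favor a candidate that is close to the previous pitch lag.
        if previous_lag is not None and abs(shorter - previous_lag) <= tolerance:
            weights[shorter] += continuity_bonus

    return max(lags, key=lambda lag: correlations[lag] * weights[lag])


# Both adjustments apply to lag 40, which overtakes the stronger raw peak at lag 80.
first = estimate_pitch_lag({40: 0.80, 57: 0.60, 80: 0.85}, previous_lag=39)

# With no previous lag and a weak peak at 40, the bonus does not override lag 80.
second = estimate_pitch_lag({40: 0.70, 80: 0.85})
```

Keeping the bonuses small means the weighting biases the selection without overriding a clearly stronger correlation peak.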