Efficient communication of speech information often involves the coding of speech signals for transmission over a channel or network ("channel"). Speech coding can provide data compression useful for communication over a channel of limited bandwidth. Speech coding systems include a coding process which converts speech signals into code words for transmission over the channel, and a decoding process which reconstructs speech from received code words.
A goal of most speech coding techniques is to provide faithful reproduction of original speech sounds such as voiced speech, which is produced when the vocal cords are tensed and vibrating quasi-periodically. In the time domain, a voiced speech signal appears as a succession of similar but slowly evolving waveforms referred to as pitch-cycles. A single one of these pitch-cycles has a duration referred to as the pitch-period.
In analysis-by-synthesis speech coding systems employing long-term predictors (LTPs), such as most code-excited linear predictive (CELP) speech coding systems known in the art, a frame (or subframe) of coded pitch-cycles is reconstructed by a decoder in part through the use of past pitch-cycle data by the decoder's LTP. A typical LTP may be interpreted as an all-pole filter providing delayed feedback of past pitch-cycle data, or as an adaptive codebook of overlapping vectors of past pitch-cycle data. Past pitch-cycle data serves as an approximation of the present pitch-cycles to be decoded. A fixed codebook (e.g., a stochastic codebook) may be used to refine past pitch-cycle data to reflect details of the present pitch-cycles.
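By way of illustration only, the delayed-feedback interpretation of an LTP may be sketched as follows. The sketch assumes a single-tap predictor with an integer pitch lag, so that each reconstructed excitation sample is a scaled copy of the sample one pitch lag earlier plus a fixed-codebook contribution. The function name `ltp_synthesize` and its parameters are hypothetical and are not taken from any particular coder.

```python
def ltp_synthesize(past_excitation, fixed_contribution, pitch_lag, ltp_gain):
    """Reconstruct one subframe of excitation with a single-tap LTP.

    Implements e[n] = ltp_gain * e[n - pitch_lag] + c[n], i.e. the all-pole
    filter 1 / (1 - ltp_gain * z^-pitch_lag) driven by the fixed-codebook
    contribution c[n]. All names here are illustrative assumptions.
    """
    if pitch_lag < 1 or pitch_lag > len(past_excitation):
        raise ValueError("pitch_lag must lie within the stored past excitation")

    # Working buffer: past pitch-cycle data followed by newly decoded samples.
    history = list(past_excitation)
    start = len(history)

    for c in fixed_contribution:
        # Delayed feedback: look back exactly one pitch lag. When the lag is
        # shorter than the subframe, this reads samples generated just above.
        delayed = history[len(history) - pitch_lag]
        history.append(ltp_gain * delayed + c)

    return history[start:]


# Usage: a lag of 4 samples and a gain of 0.5, with no fixed-codebook refinement.
print(ltp_synthesize([1.0, 2.0, 3.0, 4.0], [0.0] * 4, pitch_lag=4, ltp_gain=0.5))
# prints [0.5, 1.0, 1.5, 2.0]
```

In this sketch the adaptive-codebook view corresponds to selecting the vector of past samples that begins one pitch lag back, while the nonzero entries of `fixed_contribution` play the role of the fixed-codebook refinement.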
Analysis-by-synthesis coding systems like CELP, while providing low bit-rate coding, may not communicate enough information to completely describe the evolution of the pitch-cycle waveform shapes in original speech. If the evolution (or dynamics) of a succession of pitch-cycle waveforms in original speech is not preserved in reconstructed speech, audible distortion may result.