The present invention relates generally to speech coding techniques, and more specifically to a speech conversion system using a low-rate linear prediction speech coding/decoding technique.
As described in a paper by M. Schroeder and B. Atal, "Code-excited linear prediction: High quality speech at very low bit rates" (ICASSP Vol. 3, pages 937-940, March 1985), speech samples digitized at an 8-kHz sampling rate are encoded at bit rates of 4.8 to 8 kbps by extracting spectral parameters representing the spectral envelope of the speech samples from frames at 20-ms intervals and deriving pitch parameters representing the long-term correlations of pitch intervals from subframes at 5-ms intervals. Fricative components of speech are stored in a codebook. Using the pitch parameter, a search is made through the codebook for an optimum value that minimizes the difference between the input speech samples and speech samples which are synthesized from a sum of the optimum codebook values and the pitch parameters. Signals indicating the spectral parameter, pitch parameter, and codebook value are transmitted or stored as index signals at bit rates in the range between 4.8 and 8 kbps.
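The codebook search described above can be illustrated with a minimal sketch for a single subframe. It is an illustration under stated assumptions, not the procedure of the cited paper: the function names `synthesize` and `search_codebook`, the zero-state all-pole synthesis filter, the per-codevector least-squares gain, and the absence of perceptual weighting are all assumptions introduced here.

```python
def synthesize(excitation, lpc):
    """Zero-state all-pole synthesis: s[n] = e[n] + sum_k lpc[k] * s[n-1-k].

    Assumption: filter memory from earlier subframes is ignored.
    """
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                acc += a * out[n - 1 - k]
        out.append(acc)
    return out


def search_codebook(target, pitch_excitation, codebook, lpc):
    """Return (index, gain, error) of the codevector that best matches target.

    Assumption: plain squared error is minimized, with no perceptual weighting.
    """
    # Synthesize the pitch (long-term) contribution once and remove it,
    # so only the part the codebook must explain remains.
    pitch_part = synthesize(pitch_excitation, lpc)
    residual = [t - p for t, p in zip(target, pitch_part)]

    best = (None, 0.0, float("inf"))
    for index, codevector in enumerate(codebook):
        y = synthesize(codevector, lpc)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # an all-zero codevector cannot reduce the error
        # Least-squares gain for this codevector.
        gain = sum(r * v for r, v in zip(residual, y)) / energy
        error = sum((r - gain * v) ** 2 for r, v in zip(residual, y))
        if error < best[2]:
            best = (index, gain, error)
    return best
```

In this sketch the returned index and gain, together with the spectral and pitch parameters, correspond to the quantities that would be transmitted or stored. Every codevector is passed through the synthesis filter during the search, so the cost grows with the codebook size.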
However, one disadvantage of linear prediction coding is that it requires a large amount of computation for analyzing voiced sounds, an amount that exceeds the capability of state-of-the-art hardware implementations such as 16-bit fixed-point DSP (digital signal processing) LSI packages. With the current technology, LPC analysis is not satisfactory for high-pitched voiced sounds.