1. Field of the Invention
The present invention relates to a speech encoding method and speech decoding method which are used to compression-encode and decode speech signals, audio signals, and the like.
2. Description of the Background Art
As a method of compression-encoding speech signals, a CELP (Code-Excited Linear Prediction) scheme is known (“Code-Excited Linear Prediction (CELP): High-quality Speech at Very Low Rates,” Proc. ICASSP '85, 25.1.1, pp. 937-940, 1985).
The CELP scheme has two characteristic features. First, a speech signal is modeled separately as a synthesis filter and an excitation signal for driving the synthesis filter. Second, when the excitation signal is encoded, distortion is evaluated on a perceptually weighted speech signal, which makes encoding distortion difficult to perceive. A synthesized speech signal after encoding is generated by passing the excitation signal through the synthesis filter. The excitation signal is generated by combining two code vectors, i.e., an adaptive code vector generated from an adaptive codebook storing past excitation signals and a stochastic code vector generated from a stochastic codebook.
An adaptive code vector mainly represents the repetition of a waveform at the pitch period, which is a feature of an excitation signal in a voiced speech interval. A stochastic code vector supplies the component of the excitation signal that cannot be expressed by the adaptive code vector, and is used to make the synthesized speech signal more natural.
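The excitation model described above can be illustrated with a minimal sketch. It assumes that the two code vectors are combined as a gain-weighted sum and that the synthesis filter is an all-pole filter of the form 1/A(z) with A(z) = 1 + a_1 z^-1 + ... + a_p z^-p; the text states neither of these explicitly. The function names, the gain parameters, and the coefficient sign convention are hypothetical and chosen only for illustration.

```python
def combine_excitation(adaptive_vec, stochastic_vec, gain_adaptive, gain_stochastic):
    """Form the excitation as a gain-weighted sum of the two code vectors.

    Assumption: the "combining" in the text is a weighted sum; the gains
    are hypothetical parameters not named in the text.
    """
    return [gain_adaptive * a + gain_stochastic * s
            for a, s in zip(adaptive_vec, stochastic_vec)]


def synthesis_filter(excitation, lpc_coeffs, state=None):
    """Pass the excitation through an all-pole synthesis filter.

    Assumed difference equation: y[n] = x[n] - sum_k a_k * y[n - k].
    `state` holds the previous outputs, most recent first.
    """
    order = len(lpc_coeffs)
    history = list(state) if state is not None else [0.0] * order
    output = []
    for x in excitation:
        y = x - sum(a * h for a, h in zip(lpc_coeffs, history))
        output.append(y)
        # Keep only the `order` most recent output samples.
        history = ([y] + history)[:order]
    return output, history


if __name__ == "__main__":
    excitation = combine_excitation([1.0, 0.0, 0.0], [0.0, 0.0, 0.0], 1.0, 0.0)
    speech, _ = synthesis_filter(excitation, [-0.5])
    print(speech)
```

Returning the filter state lets a caller carry the filter memory from one frame to the next, which a frame-by-frame coder would need.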
An adaptive codebook is a codebook that exploits the fact that a waveform repeating at the pitch period of an excitation signal is similar to the repeating waveform of the immediately preceding excitation signal. More specifically, past excitation signals are stored in the adaptive codebook without any changes, and a segment of the past excitation signal whose length corresponds to one pitch period is extracted from the adaptive codebook. The vector obtained by repeating the extracted segment at the pitch period until it fills the signal interval to be encoded is used as the adaptive code vector.
As described above, according to the conventional adaptive codebook, the current adaptive code vector is obtained by directly repeating an excitation signal used in the past. In this conventional method, if the encoding bit rate is decreased to about 4 kbits/s, too few bits are assigned to express the excitation signal, and distortion due to encoding is clearly perceived. As a consequence, the speech becomes unclear or noisy; that is, the sound quality deteriorates considerably. Demands have therefore arisen for a high-efficiency encoding scheme that can generate synthesized speech with high quality even if the bit rate is decreased.
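The conventional adaptive code vector construction described above can be sketched as follows. This is an illustrative sketch only: it assumes an integer pitch period expressed in samples, takes the most recent pitch period of the stored excitation as the extracted segment, and uses the hypothetical names `adaptive_code_vector`, `past_excitation`, `pitch_period`, and `frame_length`, none of which appear in the text.

```python
def adaptive_code_vector(past_excitation, pitch_period, frame_length):
    """Build an adaptive code vector by repeating the last pitch period.

    past_excitation: stored past excitation samples, oldest first.
    pitch_period:    integer lag in samples (assumption: no fractional lags).
    frame_length:    number of samples in the interval to be encoded.
    """
    if not 0 < pitch_period <= len(past_excitation):
        raise ValueError("pitch_period must lie within the stored excitation")
    # Extract one pitch period from the end of the past excitation.
    segment = past_excitation[-pitch_period:]
    # Repeat the segment at the pitch period until the frame is filled.
    return [segment[n % pitch_period] for n in range(frame_length)]


if __name__ == "__main__":
    past = [1, 2, 3, 4, 5]
    print(adaptive_code_vector(past, pitch_period=3, frame_length=7))
```

When the pitch period is at least as long as the frame, no repetition occurs and the vector is simply the first `frame_length` samples of the extracted segment.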
As described above, in the conventional speech encoding method, it is difficult to obtain synthesized speech with high quality at a low bit rate.
It is an object of the present invention to provide a speech encoding method/speech decoding method which can generate synthesized speech with high quality even at a low bit rate.