1. Field of the Invention
This invention relates to a speech encoding method and apparatus for encoding speech signals by digital signal processing with high efficiency.
2. Description of the Related Art
Recently, speech encoding methods with a low bit rate on the order of 4.8 to 9.6 kbps, applicable for example to a car telephone, a portable telephone or a television telephone, have been developed. A code excited linear prediction (CELP) encoding method, such as a vector sum excited linear prediction (VSELP) encoding method, has been proposed as such a speech encoding method. A so-called half-rate speech encoding method, having a halved bit rate such as a bit rate on the order of 3.45 kbps, has also been proposed, namely CELP encoding with pitch synchronization processing, known as pitch synchronous innovation CELP (PSI-CELP).
This PSI-CELP encoding method is a CELP type encoding system and includes a codebook for excitation code vectors as an excitation source, an adaptive codebook for long-term prediction, a fixed codebook and a noise codebook. The PSI-CELP encoding method is characterized in that the noise codebook is rendered periodic in association with the pitch period lag of the adaptive code vector. The pitch synchronization of the noise codebook is realized by taking out the portion corresponding to one pitch period, as the basic period, from the leading end of the noise codebook, and by repeating the portion thus taken out, thereby improving the quality of the voiced portion. PSI-CELP also aims to improve the representation of non-periodic speech by switching between the adaptive codebook and the fixed codebook.
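The pitch synchronization of the noise codebook described above can be illustrated by a minimal sketch. The function name `pitch_synchronize`, its parameters and the use of plain Python lists are assumptions made for illustration only and are not taken from any PSI-CELP specification. The sketch further assumes that the noise code vector holds at least `frame_len` samples and that a pitch lag equal to or longer than the sub-frame leaves the vector unchanged.

```python
def pitch_synchronize(noise_vector, pitch_lag, frame_len):
    """Render a noise code vector periodic at the given pitch lag.

    Hypothetical helper: takes the first `pitch_lag` samples from the
    leading end of `noise_vector` and repeats them until `frame_len`
    samples have been produced.
    """
    if pitch_lag <= 0:
        raise ValueError("pitch_lag must be positive")
    if pitch_lag >= frame_len:
        # Assumption: nothing to repeat within one sub-frame,
        # so the leading samples are used as they are.
        return list(noise_vector[:frame_len])
    segment = list(noise_vector[:pitch_lag])  # one pitch period
    return [segment[i % pitch_lag] for i in range(frame_len)]


periodic = pitch_synchronize([1, 2, 3, 4, 5, 6, 7, 8], pitch_lag=3, frame_len=8)
print(periodic)  # [1, 2, 3, 1, 2, 3, 1, 2]
```

In this sketch only the first three samples of the noise code vector survive, and the remaining samples of the sub-frame are filled by repetition at the pitch lag.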
With the above-described PSI-CELP, the voiced speech and the unvoiced speech are effectively processed for speech synthesis by selectively switching, responsive to input signals, between the fixed codebook and the adaptive codebook serving as a long-term predictive filter. However, if frequency components of the voiced speech change significantly between a preceding sub-frame and a succeeding sub-frame, the fixed codebook is predominantly selected, thus impairing continuity of the decoded speech and possibly producing waveform distortion.
In selecting the code vector of the adaptive codebook and the fixed codebook, candidates exhibiting the strongest correlation with the input signals are selected. For example, if the input speech changes from speech containing many high-frequency components to speech in which a specific low frequency range is predominant, the state of the adaptive codebook of the long-term prediction filter cannot follow such changes, as a result of which the fixed codebook exhibiting strong correlation is predominantly selected. On decoding, however, speech continuity is impaired significantly, such that waveform distortion is produced in the worst case.
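The selection behavior described above can be sketched as follows. The measure used here, the squared cross-correlation divided by the candidate energy, is an assumption chosen for illustration; the function names and the toy vectors are likewise hypothetical, and a practical encoder would typically compare perceptually weighted, synthesis-filtered candidates rather than raw code vectors.

```python
def normalized_correlation(target, candidate):
    """Squared cross-correlation normalized by candidate energy (assumed measure)."""
    cross = sum(t * c for t, c in zip(target, candidate))
    energy = sum(c * c for c in candidate)
    return 0.0 if energy == 0 else (cross * cross) / energy


def select_code_vector(target, adaptive_candidates, fixed_candidates):
    """Return (codebook name, index, score) of the best-correlated candidate."""
    best = None
    codebooks = (("adaptive", adaptive_candidates), ("fixed", fixed_candidates))
    for name, candidates in codebooks:
        for index, candidate in enumerate(candidates):
            score = normalized_correlation(target, candidate)
            if best is None or score > best[2]:
                best = (name, index, score)
    return best


# Toy case: the adaptive codebook still holds a high-frequency pattern from
# the previous sub-frame, while the new input is dominated by low frequencies.
target = [1, 1, 1, 1]
adaptive = [[1, -1, 1, -1]]
fixed = [[1, 1, 1, 0], [0, 1, 0, 0]]
print(select_code_vector(target, adaptive, fixed))  # ('fixed', 0, 3.0)
```

In this toy case the stale adaptive code vector is uncorrelated with the new input, so a fixed code vector is selected, which corresponds to the situation in which continuity with the preceding sub-frame is lost.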