1. Field of the Invention
The present invention relates to a method of and an apparatus for coding a speech signal with high quality at a low bit rate.
2. Description of the Related Art
Various processes have been proposed for coding speech signals highly efficiently. For example, one such process is disclosed in M. Schroeder and B. Atal, "Code-excited linear prediction: High quality speech at very low bit rates" (Proc. ICASSP, pp. 937-940, 1985, hereinafter referred to as "document 1"). Another process is CELP (Code Excited Linear Predictive Coding), described in Kleijn et al., "Improved speech quality and efficient vector quantization in CELP" (Proc. ICASSP, pp. 155-158, 1988, hereinafter referred to as "document 2").
According to the above conventional proposals, a transmitter extracts spectral parameters representing the spectral characteristics of a speech signal from the speech signal in each frame of 20 ms, for example, using linear predictive coding (LPC). Each frame is divided into subframes of 5 ms each, for example. In each subframe, the parameters of an adaptive code book, i.e., a delay parameter corresponding to a pitch period and a gain parameter, are extracted based on a past excitation signal, and the adaptive code book is used for pitch prediction of the speech signal in the subframe. For the excitation signal determined by pitch prediction, an optimum excitation code vector is selected from an excitation code book (vector quantization code book) composed of noise signals of predetermined types, and an optimum gain is calculated, thereby quantizing the excitation signal.
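The frame-by-frame spectral analysis described above can be sketched as follows. This is only an illustrative sketch, not the method of documents 1 and 2: it assumes an 8 kHz sampling rate (not stated in the text), uses the autocorrelation method with the Levinson-Durbin recursion as one common way of computing LPC coefficients, and all function and constant names are hypothetical.

```python
# Illustrative sketch only. The 8 kHz sampling rate is an assumption;
# the text specifies only the 20 ms frame and 5 ms subframe lengths.
SAMPLE_RATE_HZ = 8000
FRAME_MS = 20
SUBFRAME_MS = 5
FRAME_LEN = SAMPLE_RATE_HZ * FRAME_MS // 1000        # 160 samples
SUBFRAME_LEN = SAMPLE_RATE_HZ * SUBFRAME_MS // 1000  # 40 samples


def split_subframes(frame, subframe_len):
    """Divide one frame into consecutive subframes of equal length."""
    return [frame[i:i + subframe_len]
            for i in range(0, len(frame), subframe_len)]


def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of a single frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns (a, err), where a[k-1] is the coefficient a_k of the predictor
    s[n] ~ sum_k a_k * s[n-k], and err is the residual prediction energy.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate (e.g. all-zero) frame
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err
```

For example, a decaying frame `[0.9 ** n for n in range(FRAME_LEN)]` analyzed with `order=1` yields a single coefficient close to 0.9, and `split_subframes(frame, SUBFRAME_LEN)` returns four subframes of 40 samples each.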
The excitation code vector is selected so as to minimize the error power between a signal synthesized from the selected noise signal and a residual signal. An index indicating the type of the selected code vector and the gain are combined with the spectral parameters and the parameters in the adaptive code book by a multiplexer and transmitted. Details of a receiver will not be described below.
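The selection of a code vector by minimum error power can be sketched as an exhaustive search. The sketch below is an assumption-laden illustration rather than the procedure of documents 1 and 2: it omits perceptual weighting, uses a zero-state all-pole synthesis filter, and the names `synthesize`, `search_codebook`, `target`, and `lpc` are hypothetical. For each candidate, the gain that minimizes the squared error has the closed form g = <x, y> / <y, y>, where x is the target and y is the synthesized code vector.

```python
def synthesize(code_vector, lpc):
    """Pass a code vector through the all-pole filter 1/A(z), zero initial state.

    lpc[k-1] is the predictor coefficient a_k in s[n] = e[n] + sum_k a_k * s[n-k].
    """
    out = []
    for n, e in enumerate(code_vector):
        acc = e
        for k, a_k in enumerate(lpc, start=1):
            if n - k >= 0:
                acc += a_k * out[n - k]
        out.append(acc)
    return out


def search_codebook(target, codebook, lpc):
    """Return (index, gain, error_power) of the best code vector.

    For each candidate, the optimum gain is computed in closed form and the
    candidate with the smallest remaining error power is kept.
    """
    target_energy = sum(t * t for t in target)
    best_index, best_gain, best_error = -1, 0.0, float("inf")
    for index, code_vector in enumerate(codebook):
        y = synthesize(code_vector, lpc)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # an all-zero vector cannot match any target
        corr = sum(t * v for t, v in zip(target, y))
        gain = corr / energy
        error = target_energy - corr * corr / energy
        if error < best_error:
            best_index, best_gain, best_error = index, gain, error
    return best_index, best_gain, best_error
```

Only the returned index and gain would need to be passed to the multiplexer; the error power is returned here purely for inspection.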
The above conventional speech signal coding process employs linear predictive coding (LPC) for the calculation of spectral parameters. Female speakers with high-pitched voices utter phonemes whose formants and pitch frequencies are close to each other. Since such phonemes are strongly affected by the pitch, a large error arises in the extraction of spectral parameters from them. If a pitch is extracted using such erroneous spectral parameters, then a wrong pitch period results. When a speech signal is coded using those spectral parameters and that pitch, the sound quality of the coded speech is poor for female speakers with high pitch frequencies, especially if the bit rate is low.
One proposed solution has been to determine spectral parameters with a multipulse signal, rather than a white noise signal, assumed as an excitation signal. For example, reference should be made to Singhal and Atal, "Optimizing LPC filter parameters for multi-pulse excitation" (Proc. ICASSP, pp. 781-784, 1983, hereinafter referred to as "document 3").
For speech signal coding, it is necessary to quantize spectral parameters and excitation signals in order to transmit them. To lower the bit rate, the spectral parameters have to be quantized coarsely, and they cannot escape the effects of that quantization. The process disclosed in document 3 does not take into account the effects which quantization has on spectral parameters and excitation signals. The performance of speech signal coding is therefore lowered by coarse quantization, resulting in a reduction in the quality of sounds uttered by female speakers.