This invention relates to an encoding method and an encoding system for use in effectively encoding a speech signal at a low bit rate of eight kilobits per second or less.
A conventional encoding method of the type described is known as a Code Excited LPC Coding (CELP) method, which is disclosed, for example, in a paper by M. Schroeder and B. Atal in Proceedings of ICASSP (1985) (pages 937-940) entitled "Code-Excited Linear Prediction" (hereinafter called Reference 1). Such a method is also disclosed in a paper by Kleijn et al. in Proceedings of ICASSP (1988) (pages 155-158) entitled "Improved Speech Quality and Efficient Vector Quantization in SELP" (hereinafter called Reference 2).
In each of the conventional methods, the transmitting end first extracts spectrum parameters from a speech signal which is divided into a plurality of frames, each having a frame period of, for example, 20 milliseconds. Each of the spectrum parameters specifies a spectrum characteristic of the speech signal. Thereafter, each of the frames is subdivided into a plurality of subframes, each of which is shorter than the frame and lasts, for example, 5 milliseconds. In every subframe, pitch parameters are extracted to represent a long-term correlation (pitch correlation) on the basis of an excitation signal calculated in the past. A long-term prediction of the speech signal in each subframe is carried out by the use of the pitch parameters, and a residue signal is calculated from the long-term prediction. Synthesized signals are produced by the use of random signals, namely, waveform patterns selected from predetermined species of random signals stored in a code book. Subsequently, a single species of the random signals is selected such that the error power between the speech signal and the synthesized signals becomes minimum. In addition, an optimum gain of the single species of the random signals is calculated in relation to the residue signal.
Thereafter, indices which are indicative of both the single species of the random signals and the gain are transmitted together with the spectrum parameters and the pitch parameters.
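The search described above can be illustrated by the following minimal sketch. It is not the implementation of Reference 1 or Reference 2. The function names (`synthesize`, `best_scaled_match`, `search_pitch`, `search_codebook`), the zero-state synthesis filter, the omission of perceptual weighting, and the restriction of pitch lags to at least one subframe are all assumptions made for brevity.

```python
import numpy as np

def synthesize(excitation, lpc):
    """Zero-state all-pole filter 1/A(z), A(z) = 1 + sum_k lpc[k] * z^-(k+1)."""
    out = np.zeros(len(excitation))
    for n in range(len(excitation)):
        acc = excitation[n]
        for k, a in enumerate(lpc):
            if n - k - 1 >= 0:
                acc -= a * out[n - k - 1]
        out[n] = acc
    return out

def best_scaled_match(target, candidates):
    """Index, optimum gain and error power of the candidate c that
    minimizes ||target - gain * c||^2 over all candidates."""
    best_index, best_gain, best_err = -1, 0.0, np.inf
    for i, c in enumerate(candidates):
        energy = float(c @ c)
        if energy == 0.0:
            continue  # a silent candidate cannot be scaled to fit
        gain = float(target @ c) / energy  # least-squares optimum gain
        err = float(np.sum((target - gain * c) ** 2))
        if err < best_err:
            best_index, best_gain, best_err = i, gain, err
    return best_index, best_gain, best_err

def search_pitch(target, past_excitation, lpc, lags):
    """Long-term prediction: pick the lag and gain from the past excitation
    and return the residue left after the prediction."""
    n, end = len(target), len(past_excitation)
    # Assumption: every lag is >= n, so each delayed segment lies in the past.
    cands = [synthesize(past_excitation[end - lag:end - lag + n], lpc)
             for lag in lags]
    i, gain, _ = best_scaled_match(target, cands)
    if i < 0:  # past excitation is silent: no long-term contribution
        return None, 0.0, np.array(target, dtype=float)
    return lags[i], gain, target - gain * cands[i]

def search_codebook(residue, codebook, lpc):
    """Select the single random signal and its gain that minimize error power."""
    cands = [synthesize(code, lpc) for code in codebook]
    return best_scaled_match(residue, cands)
```

In use, `search_pitch` would be called once per subframe, and the residue it returns would be passed to `search_codebook`. The lag, the two gains, and the codebook index are the quantities whose indices are transmitted. At an assumed sampling rate of 8 kHz, which the text does not state, a 5-millisecond subframe corresponds to 40 samples.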
To reduce the bit rate in the CELP method, effective quantization of the spectrum parameters should be considered in addition to quantization of the excitation signal.
In the above-mentioned CELP method, LPC parameters which are calculated by an LPC analysis are used as the spectrum parameters. They are specified by LPC coefficients of a given order and are quantized by scalar quantization. Such scalar quantization requires thirty-four bits per frame, namely, 1.7 kb/s, to quantize tenth-order LPC coefficients. Any further reduction of this bit number results in a deterioration of the speech quality.
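The figure of 1.7 kb/s follows from the 20-millisecond frame period given earlier as an example. The short check below makes the arithmetic explicit; `bit_rate_kbps` is a hypothetical helper name introduced only for this illustration.

```python
def bit_rate_kbps(bits_per_frame, frame_ms):
    """Bits per millisecond are numerically equal to kilobits per second."""
    return bits_per_frame / frame_ms

# 34 bits per 20 ms frame for the spectrum parameters.
spectrum_rate = bit_rate_kbps(34, 20)

# Fraction of an 8 kb/s total budget consumed by the spectrum parameters.
share_of_budget = spectrum_rate / 8.0

print(f"spectrum parameters: {spectrum_rate} kb/s")
```

Under these figures the spectrum parameters alone take up roughly one fifth of an 8 kb/s budget, which is why their quantization matters at low bit rates.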
In order to quantize the LPC parameters more effectively, a vector-scalar quantization method is proposed by Moriya et al. in a paper entitled "Transform Coding of Speech Using a Weighted Vector Quantizer" (IEEE J. Sel. Areas Commun., pages 425-431, 1988), which will be referred to as Reference 3. However, from twenty-seven to thirty bits are necessary even in the proposed method.
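The general idea of combining vector and scalar quantization can be sketched as a two-stage quantizer: a vector quantizer selects the nearest code vector, and the remaining error is scalar-quantized component by component. The sketch below makes no attempt to reproduce the method of Reference 3. The function names, the unweighted squared-error distance, the uniform scalar quantizer, and the bit allocation (a 128-entry code book plus 2 bits for each of 10 components, which happens to total 27 bits) are all assumptions chosen for illustration.

```python
import numpy as np

def vq_encode(x, codebook):
    """First stage: index of the code vector nearest to x (squared error)."""
    distances = np.sum((codebook - x) ** 2, axis=1)
    return int(np.argmin(distances))

def scalar_encode(residual, step, bits):
    """Second stage: uniform scalar quantization of each residual component."""
    levels = 2 ** bits
    q = np.round(residual / step).astype(int) + levels // 2
    return np.clip(q, 0, levels - 1)

def scalar_decode(q, step, bits):
    """Inverse of scalar_encode for indices that were not clipped."""
    levels = 2 ** bits
    return (q - levels // 2) * step

def encode(x, codebook, step, bits):
    index = vq_encode(x, codebook)
    q = scalar_encode(x - codebook[index], step, bits)
    return index, q

def decode(index, q, codebook, step, bits):
    return codebook[index] + scalar_decode(q, step, bits)

def total_bits(codebook, bits):
    """Bits for the vector index plus bits for every scalar component."""
    index_bits = (len(codebook) - 1).bit_length()
    return index_bits + codebook.shape[1] * bits
```

Because the code vector captures the coarse shape of the parameter set, the scalar stage only has to cover a small residual range, which is how such a scheme can spend fewer bits than scalar quantization alone.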
It is possible to decrease the bit number per unit time by lengthening each frame. However, as the frame becomes long, it becomes difficult to represent a temporal variation of the spectrum precisely, which results in a deterioration of the speech quality.