This invention relates to a speech encoding method and a speech encoding system used to encode a speech signal with high quality at a low bit rate.
A known method of encoding a speech signal with high efficiency is CELP (code-excited linear predictive coding), described in, for example, M. Schroeder and B. Atal, "Code-Excited Linear Prediction: High Quality Speech at Very Low Bit Rates", Proc. ICASSP, pp.937-940, 1985 (prior art 1), and Kleijn et al., "Improved Speech Quality and Efficient Vector Quantization in SELP", Proc. ICASSP, pp.155-158, 1988 (prior art 2).
In CELP, on the transmission side, a spectral parameter representing the spectral characteristic of the speech signal is extracted for each frame, e.g. 20 ms, by using LPC (linear predictive coding) analysis. Each frame is further divided into subframes, e.g. 5 ms, and for each subframe, the adaptive codebook parameters (a delay parameter and a gain parameter corresponding to the pitch cycle) are extracted based on the past excitation signal, and the speech signal of the subframe is pitch-predicted by the adaptive codebook. For the excitation signal obtained by the pitch prediction, an optimum sound-source code vector is selected from a sound-source codebook (vector quantization codebook) composed of predetermined kinds of noise signals, and the excitation signal is quantized by calculating the optimum gain. The sound-source code vector is selected so as to minimize the error power between the signal synthesized from the selected noise signal and the residual signal. Then, the index indicating the kind of code vector selected, the gain, the spectral parameter, and the adaptive codebook parameters are combined by a multiplexer and transmitted.
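The code vector selection described above can be illustrated with a minimal sketch. It assumes each candidate has already been passed through the synthesis filter, and that the gain is chosen optimally per candidate, so that minimizing the error power is equivalent to maximizing the squared correlation divided by the candidate's energy. The function names `dot` and `search_codebook` are hypothetical and do not come from the prior art cited here.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def search_codebook(target, synthesized_codebook):
    """Return (index, gain, error_power) of the best code vector.

    Assumption: `synthesized_codebook` holds the candidates after
    synthesis filtering. For a candidate s with optimal gain
    g = <t, s> / <s, s>, the error power is
    <t, t> - <t, s>^2 / <s, s>, so the best candidate maximizes
    <t, s>^2 / <s, s>.
    """
    best_index, best_gain, best_score = -1, 0.0, -1.0
    for index, candidate in enumerate(synthesized_codebook):
        energy = dot(candidate, candidate)
        if energy == 0.0:
            continue  # an all-zero candidate cannot reduce the error
        corr = dot(target, candidate)
        score = corr * corr / energy
        if score > best_score:
            best_index, best_gain, best_score = index, corr / energy, score
    if best_index < 0:
        raise ValueError("codebook contains no usable code vector")
    error_power = dot(target, target) - best_score
    return best_index, best_gain, error_power
```

In this sketch only the index and the gain would need to be transmitted, since the decoder is assumed to hold the same codebook.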
However, the CELP described above has the following problem. When the adaptive codebook delay extracted for the current subframe is at least an integer multiple of, or at most an integer fraction of, the adaptive codebook delay calculated for the previous subframe, where the integer is two or more, the adaptive codebook delay becomes discontinuous between the previous subframe and the current subframe, and the tone quality therefore deteriorates. The reason is as follows. The adaptive codebook delay for the current subframe is searched near a pitch cycle that a pitch calculator calculates from the speech signal. When that pitch cycle is at least an integer multiple of, or at most an integer fraction of, the adaptive codebook delay calculated for the previous subframe, the adaptive codebook search range for the current subframe does not include the neighborhood of the previous subframe's delay. As a result, the adaptive codebook delay becomes discontinuous over time between the previous subframe and the current subframe.
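As an illustration of the discontinuity condition described above, the following sketch flags subframes whose delay is at least twice, or at most half, the delay of the preceding subframe. Checking against the factor two is enough to cover every integer of two or more. The names `is_discontinuous` and `find_jumps`, and the treatment of the exact multiple as a jump, are assumptions made for this example.

```python
def is_discontinuous(prev_delay, cur_delay, factor=2):
    """True if cur_delay is >= factor times prev_delay or <= prev_delay / factor.

    Assumption: the boundary case (exactly a multiple or a fraction)
    counts as a jump, since pitch doubling and halving land exactly there.
    """
    return cur_delay >= factor * prev_delay or cur_delay * factor <= prev_delay


def find_jumps(delays, factor=2):
    """Return the subframe indices at which the delay jumps."""
    return [
        i
        for i in range(1, len(delays))
        if is_discontinuous(delays[i - 1], delays[i], factor)
    ]
```

For a delay track such as `[40, 41, 82, 40]` (values in samples, chosen only for illustration), the third and fourth subframes are flagged: the delay first doubles and then halves.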
Accordingly, it is an object of the invention to provide a speech encoding method and a speech encoding system in which the adaptive codebook delay calculated for each subframe can be prevented from becoming discontinuous over time.
According to the invention, a speech encoding method comprises the steps of:
calculating a spectral parameter from an input speech signal and quantizing the spectral parameter;
calculating, based on a pitch cycle, a delay and a gain from a previously quantized excitation signal by using an adaptive codebook, and calculating a residual by predicting the speech signal;
quantizing the excitation signal of the speech signal by using the spectral parameter;
quantizing the gain of the excitation signal; and
limiting the search range for the pitch cycle based on the previously calculated adaptive codebook delay, and searching for the pitch cycle in the speech signal.
According to another aspect of the invention, a speech encoding method comprises the steps of:
calculating a spectral parameter from an input speech signal and quantizing the spectral parameter;
calculating, based on a pitch cycle, a delay and a gain from a previously quantized excitation signal by using an adaptive codebook, and calculating a residual by predicting the speech signal;
quantizing the excitation signal of the speech signal by using the spectral parameter;
quantizing the gain of the excitation signal;
determining a mode by extracting a characteristic quantity from the speech signal; and
when the determined mode corresponds to a predetermined mode, limiting the search range for the pitch cycle based on the previously calculated adaptive codebook delay, and searching for the pitch cycle in the speech signal.
According to another aspect of the invention, a speech encoding system comprises:
a spectral parameter calculation unit that calculates a spectral parameter from an input speech signal and quantizes the spectral parameter;
a pitch calculation unit that calculates and outputs a pitch cycle from the speech signal;
an adaptive codebook unit that, based on the output of the pitch calculation unit, calculates a delay and a gain from a previously quantized excitation signal by using an adaptive codebook and calculates a residual by predicting the speech signal, and that outputs the calculated delay and gain;
an excitation quantization unit that quantizes the excitation signal of the speech signal by using the spectral parameter and outputs the result;
a gain quantization unit that quantizes the gain of the excitation signal and outputs the result; and
a limiter unit that limits the search range for the pitch cycle based on the previously calculated adaptive codebook delay;
wherein the pitch calculation unit searches for and outputs the pitch cycle based on the output of the limiter unit.
According to another aspect of the invention, a speech encoding system comprises:
a spectral parameter calculation unit that calculates a spectral parameter from an input speech signal and quantizes the spectral parameter;
a pitch calculation unit that calculates and outputs a pitch cycle from the speech signal;
an adaptive codebook unit that, based on the output of the pitch calculation unit, calculates multiple delays and a gain from a previously quantized excitation signal by using an adaptive codebook and calculates a residual by predicting the speech signal, and that outputs the calculated delays and gain;
an excitation quantization unit that quantizes the excitation signal of the speech signal for each of the multiple delays by using the spectral parameter, and then selects and outputs the one with the smaller signal distortion;
a gain quantization unit that quantizes the gain of the excitation signal and outputs the result; and
a limiter unit that limits the search range for the pitch cycle based on the previously calculated adaptive codebook delay;
wherein the pitch calculation unit searches for and outputs the pitch cycle based on the output of the limiter unit.
According to another aspect of the invention, a speech encoding system comprises:
a spectral parameter calculation unit that calculates a spectral parameter from an input speech signal and quantizes the spectral parameter;
a pitch calculation unit that calculates and outputs a pitch cycle from the speech signal;
an adaptive codebook unit that, based on the output of the pitch calculation unit, calculates a delay and a gain from a previously quantized excitation signal by using an adaptive codebook and calculates a residual by predicting the speech signal, and that outputs the calculated delay and gain;
an excitation quantization unit that quantizes the excitation signal of the speech signal by using the spectral parameter and outputs the result;
a mode determination unit that determines a mode by extracting a characteristic quantity from the speech signal;
a gain quantization unit that quantizes the gain of the excitation signal and outputs the result; and
a limiter unit that, when the output of the mode determination unit corresponds to a predetermined mode, limits the search range for the pitch cycle based on the previously calculated adaptive codebook delay;
wherein, when the output of the mode determination unit corresponds to the predetermined mode, the pitch calculation unit searches for and outputs the pitch cycle based on the output of the limiter unit.
According to another aspect of the invention, a speech encoding system comprises:
a spectral parameter calculation unit that calculates a spectral parameter from an input speech signal and quantizes the spectral parameter;
a pitch calculation unit that calculates and outputs a pitch cycle from the speech signal;
an adaptive codebook unit that, based on the output of the pitch calculation unit, calculates multiple delays and a gain from a previously quantized excitation signal by using an adaptive codebook and calculates a residual by predicting the speech signal, and that outputs the calculated delays and gain;
an excitation quantization unit that quantizes the excitation signal of the speech signal by using the spectral parameter, and then selects and outputs the one with the smaller signal distortion;
a mode determination unit that determines a mode by extracting a characteristic quantity from the speech signal;
a gain quantization unit that quantizes the gain of the excitation signal and outputs the result; and
a limiter unit that, when the output of the mode determination unit corresponds to a predetermined mode, limits the search range for the pitch cycle based on the previously calculated adaptive codebook delay;
wherein, when the output of the mode determination unit corresponds to the predetermined mode, the pitch calculation unit searches for and outputs the pitch cycle based on the output of the limiter unit.
In this invention, the limiter unit receives the adaptive codebook delay obtained for the previous subframe, limits the pitch cycle search range so that the adaptive codebook delay to be obtained for the current subframe is not discontinuous with the delay obtained for the previous subframe, and outputs the limited pitch cycle search range to the pitch calculation unit.
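One way to picture the limiter unit is as a function that maps the previous subframe's delay to a lower and an upper bound for the pitch search. The sketch below is only an assumption about how such a limit could be chosen: it keeps the range strictly between half and twice the previous delay, clipped to an overall delay range. The bounds 20 and 147 samples and the name `limit_search_range` are hypothetical and are not taken from this description.

```python
def limit_search_range(prev_delay, min_delay=20, max_delay=147, factor=2):
    """Return (low, high), an inclusive pitch search range in samples.

    Assumptions for illustration: delays are integers, and the range is
    kept strictly inside (prev_delay / factor, prev_delay * factor) so
    that neither a multiple nor a fraction of the previous delay can be
    selected. With no previous delay, the full range is returned.
    """
    if prev_delay is None:
        return min_delay, max_delay
    low = max(min_delay, prev_delay // factor + 1)
    high = min(max_delay, prev_delay * factor - 1)
    return low, high
```

With these assumed defaults, a previous delay of 40 samples yields the range 21 to 79, which excludes both 20 (half) and 80 (double).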
The pitch calculation unit receives the perceptually weighted signal and the pitch cycle search range output from the limiter unit, calculates the pitch cycle, and outputs at least one pitch cycle to the adaptive codebook unit. The adaptive codebook unit receives the perceptually weighted signal, the past excitation signal output from the gain quantization unit, the perceptually weighted impulse response output from the impulse response calculation circuit, and the pitch cycle from the pitch calculation unit, and calculates the adaptive codebook delay by searching near the pitch cycle. With this configuration, the adaptive codebook delay obtained for each subframe can be prevented from becoming discontinuous over time.
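The effect of the limited search range on the pitch calculation can be sketched as follows. The example assumes an open-loop search that maximizes the normalized correlation of the perceptually weighted signal over the allowed lags. This is a common choice in CELP coders, not necessarily the criterion used here. The function name `open_loop_pitch` and its parameters are hypothetical, and the subsequent adaptive codebook search near the returned pitch cycle is not shown.

```python
def open_loop_pitch(weighted, start, length, lag_min, lag_max):
    """Return the lag in [lag_min, lag_max] with the best normalized correlation.

    `weighted` is the perceptually weighted signal, and the subframe
    covers samples start .. start + length - 1. Assumption: start >= lag_max,
    so every lagged sample exists.
    """
    best_lag, best_score = lag_min, 0.0
    for lag in range(lag_min, lag_max + 1):
        corr = 0.0
        energy = 0.0
        for n in range(start, start + length):
            past = weighted[n - lag]
            corr += weighted[n] * past
            energy += past * past
        # Only positively correlated lags are pitch candidates.
        if energy > 0.0 and corr > 0.0:
            score = corr * corr / energy
            if score > best_score:
                best_lag, best_score = lag, score
    return best_lag


# Illustrative signal: pulses every 40 samples with alternating amplitude,
# which makes lag 80 score slightly higher than the true period of 40.
signal = [0.0] * 240
for k in range(6):
    signal[40 * k] = 1.0 if k % 2 == 0 else 0.8

full_range_pitch = open_loop_pitch(signal, 160, 80, 20, 147)
limited_pitch = open_loop_pitch(signal, 160, 80, 21, 79)
```

On this constructed signal the unrestricted search returns 80, a pitch-doubling error, while the search limited to 21 through 79 (the range assumed for a previous delay of 40) returns 40, so the delay stays continuous with the previous subframe.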