1. Field of the Invention
The present invention relates to a speech coder for the high quality coding of an input speech signal at low bit rates.
2. Related Art
Well-known systems for high-quality coding of an input speech signal, namely CELP (Code-Excited Linear Predictive Coding) systems, are disclosed in (i) M. Schroeder and B. Atal, "Code-Excited Linear Prediction: High Quality Speech At Very Low Bit Rates", Proc. ICASSP, pp. 937-940, 1985 (hereinafter referred to as "Literature 1"), and (ii) Kleijn et al., "Improved speech quality and efficient vector quantization in SELP", Proc. ICASSP, pp. 155-158, 1988 (hereinafter referred to as "Literature 2").
On the transmitting side of such a coding system, spectral parameters representing the spectral characteristics of a speech signal are extracted from the signal for each frame using linear predictive coding (LPC) analysis of a predetermined degree (for instance, the 10th degree), and are quantized to provide quantized parameters. Each frame of the speech signal is divided into a plurality of sub-frames (for instance, of 5 ms each), and adaptive codebook parameters (a delay parameter and a gain parameter corresponding to the pitch cycle) are extracted for each sub-frame on the basis of a past excitation signal in accordance with the spectral parameters. In addition, the sub-frame speech signal is predicted by pitch prediction with reference to an adaptive codebook.
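The LPC analysis step described above can be sketched as follows. This is a minimal illustration using the autocorrelation method with the Levinson-Durbin recursion, which is one common way to perform such analysis; the source does not state which method the cited systems use. The function names and the sign convention of the predictor are assumptions made for this sketch.

```python
def autocorrelation(frame, max_lag):
    """Return the autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations for a predictor of the given degree.

    Assumed convention: sample s[n] is predicted as
    sum(a[k] * s[n - k] for k in 1..order).
    Returns (a, err); a[0] is unused and err is the residual energy.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate input; stop rather than divide by zero
        acc = r[i] - sum(a[k] * r[i - k] for k in range(1, i))
        k_i = acc / err  # reflection coefficient of stage i
        new_a = a[:]
        new_a[i] = k_i
        for k in range(1, i):
            new_a[k] = a[k] - k_i * a[i - k]
        a = new_a
        err *= (1.0 - k_i * k_i)
    return a, err


# A 10th-degree analysis of one frame would be, for a list `frame`:
#   a, err = levinson_durbin(autocorrelation(frame, 10), 10)
```

The quantization of the resulting coefficients and the division of the frame into sub-frames are not shown.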
The excitation signal thus obtained through the pitch prediction is then quantized by selecting an optimum excitation codevector from an excitation codebook (or vector quantization codebook), which is constituted by predetermined kinds of noise signals, and by calculating an optimum gain. The excitation codevector is selected so as to minimize the error power between a signal synthesized from the selected noise signal and the residual signal. An index indicating the kind of codevector selected, the gain, the quantized spectral parameters and the extracted adaptive codebook parameters are multiplexed in a multiplexer, and the resultant multiplexed data is transmitted. A description of the receiving side is omitted.
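The codevector selection described above can be sketched as an exhaustive search. This is a simplified illustration, not the procedure of Literatures 1 and 2: it assumes a zero-state all-pole synthesis filter, omits perceptual weighting, and leaves the gain unquantized. The names `synthesize`, `search_codebook` and `target` are hypothetical.

```python
def synthesize(excitation, a):
    """Zero-state all-pole synthesis: y[n] = x[n] + sum(a[k] * y[n - k]).

    a[0] is unused; a[1..order] are the predictor coefficients.
    """
    order = len(a) - 1
    y = []
    for n, x in enumerate(excitation):
        acc = x
        for k in range(1, order + 1):
            if n - k >= 0:
                acc += a[k] * y[n - k]
        y.append(acc)
    return y


def search_codebook(target, codebook, a):
    """Return (index, gain, error_power) of the best codevector.

    For each codevector the optimum gain is corr / energy, and the
    resulting error power is |target|^2 - corr^2 / energy.
    """
    target_energy = sum(t * t for t in target)
    best = (-1, 0.0, float("inf"))
    for index, codevector in enumerate(codebook):
        y = synthesize(codevector, a)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # an all-zero synthesized signal cannot match anything
        corr = sum(t * v for t, v in zip(target, y))
        gain = corr / energy
        error_power = target_energy - corr * corr / energy
        if error_power < best[2]:
            best = (index, gain, error_power)
    return best
```

Only the returned index and a quantized version of the gain would need to be transmitted, since the receiving side is assumed to hold the same codebook.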
A method of improving the analysis accuracy of the spectral parameters of the speech signal on the basis of CELP has already been proposed. In this method, on the transmitting side, the spectral parameters are developed by analyzing past reproduced speech signals to a higher degree than is conventional, and are used to quantize the speech. This method is known as LD-CELP (Low-Delay CELP) and is described in, for instance, J-H Chen et al., "A Low-Delay CELP Coder For The CCITT 16 kb/s Speech Coding Standard", IEEE Journal on Selected Areas in Communications, vol. 10, pp. 830-849, June 1992 (hereinafter referred to as "Literature 3"). In an LD-CELP system, on the receiving side as well as on the transmitting side, the spectral parameters are developed from an analysis of the past reproduced speech signal. This provides the advantage that no spectral parameters need to be transmitted even when the degree of analysis is greatly increased.
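The reason no spectral parameters need to be transmitted can be illustrated with a deliberately simplified sketch. It assumes a first-order predictor in place of the high-degree analysis of Literature 3, and the sample values are invented for illustration. Both sides run the same analysis on the same past reproduced signal and therefore obtain the same coefficient.

```python
def backward_lpc1(history):
    """First-order predictor coefficient from past reproduced samples.

    a1 = r[1] / r[0] is the degree-1 solution of the LPC normal equations.
    """
    r0 = sum(x * x for x in history)
    r1 = sum(history[i] * history[i - 1] for i in range(1, len(history)))
    return r1 / r0 if r0 > 0.0 else 0.0


# Hypothetical past reproduced signal, available on both sides
# because each side reconstructs it from the transmitted indices.
past_reproduced = [0.0, 0.8, 1.2, 0.9, 0.1, -0.7, -1.1, -0.8]

encoder_a1 = backward_lpc1(past_reproduced)        # transmitting side
decoder_a1 = backward_lpc1(list(past_reproduced))  # receiving side, own copy

# Identical histories give identical coefficients, so nothing is sent.
coefficients_match = encoder_a1 == decoder_a1
```

The same reasoning holds for any analysis degree, provided the two histories are identical.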
Such a well-known speech coding/decoding method is disclosed in, for example, Laid-Open Patent 4-344699.
In the speech coding methods disclosed in Literatures 1 and 2, the spectral parameters are analyzed with a constant degree (for example, the 10th degree) for each frame. Consequently, if the analysis degree is doubled (for example, to the 20th degree) in order to improve the accuracy of the spectral analysis, twice as many transmission bits are required for the spectral parameters, thereby increasing the bit rate.
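The doubling can be written out as a short worked relation. The symbols are assumptions introduced here: P is the analysis degree, b is the average number of bits spent per spectral coefficient (taken as independent of P for simplicity), and T is the frame length in seconds.

```latex
R_{\mathrm{spec}}(P) = \frac{P\,b}{T}
\qquad\Longrightarrow\qquad
R_{\mathrm{spec}}(2P) = \frac{2P\,b}{T} = 2\,R_{\mathrm{spec}}(P)
```

Under this assumption, going from the 10th to the 20th degree doubles the part of the bit rate spent on the spectral parameters, while the bits spent on the excitation are unchanged.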
The speech coding method disclosed in Literature 3 allows the analysis degree to be increased without transmitting the spectral parameters. However, because the spectral parameters are analyzed from the past reproduced signal, the spectral parameter matching is degraded at portions where the signal characteristic changes with time, thereby degrading the speech quality. In particular, an increase in the analysis degree degrades the matching between the reproduced signal developed on the transmitting side and the reproduced signal on the receiving side. Therefore, when an error occurs in transmission, the speech quality on the receiving side is remarkably degraded because of the mismatch between the reproduced signals on the transmitting side and the receiving side.