This invention relates to a speech parameter encoding device for encoding spectrum parameters of an input speech or voice signal at a low bit rate, such as below 4.8 kb/s.
For use in encoding an input speech signal at a low bit rate of less than 8 kb/s, a code excited LPC coding (CELP) is already known. Examples are disclosed in a paper contributed by M. R. Schroeder and B. S. Atal to the Proceedings of ICASSP, 1985, pages 937 to 940, under the title of "Code-excited Linear Prediction (CELP): High-Quality Speech at Very Low Bit Rates" and in another paper contributed by W. B. Kleijn and two others to the Proceedings of ICASSP, 1988, pages 155 to 158, under the title of "Improved Speech Quality and Efficient Vector Quantization in SELP".
According to the code excited LPC coding, spectrum parameters are extracted from each frame signal of an input speech signal. The frame signal has a frame length which may be 20 milliseconds long. The spectrum parameters represent spectrum characteristics of the input speech signal. The frame signal is divided into subframe signals, each having a subframe length of, for example, 5 milliseconds. Based on the subframe signal of a previous subframe, pitch parameters are extracted to represent a long-term or pitch correlation. Residue signals are calculated by applying long-term prediction to the subframe signals with use of the pitch parameters. Code books are used to define noise signals of predetermined kinds. For the residue signals, noise signals are selected from the code books. One of the predetermined kinds is selected to minimize an error power between the input speech signal and a signal synthesized from the selected noise signal, and an optimum gain is calculated for it. The spectrum parameters and the pitch parameters are transmitted together with the optimum gain and an index indicative of the above-mentioned one of the predetermined kinds.
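The selection of a noise signal and its optimum gain described above can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the procedure of the cited papers: the synthesis filter and any perceptual weighting are omitted, so each code book entry is compared directly with the target, and the names `search_codebook`, `codebook` and `target` are hypothetical. For each entry c and target t, the gain minimizing the error power is g = <t,c>/<c,c>, and the remaining error power is <t,t> - <t,c>^2/<c,c>.

```python
def search_codebook(target, codebook):
    """Return (index, gain, error) for the entry that minimizes the error power.

    Simplifying assumption: entries are compared with the target directly,
    without passing through a synthesis or weighting filter.
    """
    best = None
    target_energy = sum(x * x for x in target)
    for index, code in enumerate(codebook):
        energy = sum(c * c for c in code)
        if energy == 0.0:
            continue  # an all-zero entry cannot reduce the error
        corr = sum(x * c for x, c in zip(target, code))
        gain = corr / energy                        # optimum gain for this entry
        error = target_energy - corr * corr / energy  # error power at that gain
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best


# Hypothetical toy code book of three noise signals, four samples each.
codebook = [
    [1.0, 0.0, -1.0, 0.0],
    [0.5, 0.5, 0.5, 0.5],
    [1.0, -1.0, 1.0, -1.0],
]
target = [2.0, -2.0, 2.0, -2.0]
index, gain, error = search_codebook(target, codebook)
```

In this toy case the third entry matches the target shape exactly, so it is selected with a gain of 2 and zero error power. Only the index and the gain would need to be transmitted for the noise signal.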
In the code excited LPC coding, LPC analysis is used in calculating LPC parameters as the spectrum parameters. The LPC parameters are usually quantized in accordance with scalar quantization. When LPC coefficients are used up to a tenth degree for quantization, it is necessary to use a bit number of 34 bits per frame. With a frame length of 20 milliseconds, this bit number results in a bit rate of 1.7 kb/s in encoding only the LPC coefficients. A reduction in the bit number gives rise to a deteriorated speech quality.
In order to quantize the LPC parameters more effectively, vector-scalar quantization is proposed. An example is revealed in a paper contributed by Takehiro Moriya and another to the IEEE Journal on Selected Areas in Communications, 1988, pages 425 to 431, under the title of "Transform Coding of Speech Using a Weighted Vector Quantizer". Even with this quantization, the bit number must be from 27 to 30 bits per frame.
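The relation between the bit number per frame and the resulting bit rate can be checked with a short calculation. The sketch below assumes a frame length of 20 milliseconds throughout, which is the example value given earlier; applying it to the figures of 27 to 30 bits is an assumption made here for illustration only. The function name `bit_rate_kbps` is hypothetical.

```python
def bit_rate_kbps(bits_per_frame, frame_ms):
    """Bit rate in kb/s; bits per millisecond equals kilobits per second."""
    return bits_per_frame / frame_ms


FRAME_MS = 20  # assumed frame length in milliseconds

scalar_rate = bit_rate_kbps(34, FRAME_MS)      # scalar quantization, 34 bits
vector_rate_low = bit_rate_kbps(27, FRAME_MS)  # vector-scalar, 27 bits
vector_rate_high = bit_rate_kbps(30, FRAME_MS)  # vector-scalar, 30 bits

print(f"34 bits/frame: {scalar_rate:.2f} kb/s")
print(f"27 to 30 bits/frame: {vector_rate_low:.2f} to {vector_rate_high:.2f} kb/s")
```

Under this assumption the spectrum parameters alone take 1.7 kb/s with 34 bits and between 1.35 and 1.5 kb/s with 27 to 30 bits, which is a substantial share of a total rate below 4.8 kb/s.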
When a longer frame length is used, a smaller bit number per unit time suffices for quantizing the spectrum parameters. A longer frame, however, makes it difficult to represent well the time variation of the spectrum characteristics within the frame, and results in an increased distortion and a deteriorated speech quality.
In the following, five other papers will be referred to. One is contributed by Noboru Sugamura and another to the IEEE Journal on Selected Areas in Communications, 1988, pages 432 to 440, under the title of "Quantizer Design in LSP Speech Analysis-Synthesis". Another is contributed by Yoseph Linde and two others to the IEEE Transactions on Communications, 1980, pages 84 to 95, under the title of "An Algorithm for Vector Quantizer Design". A like paper is contributed by K. K. Paliwal and another to the IEEE Transactions on Speech and Audio Processing, 1993, pages 3 to 14, under the title of "Efficient Vector Quantization of LPC Parameters at 24 Bits/Frame". Still another is contributed by Chieh Tsao and two others to the IEEE Transactions on ASSP, 1985, pages 537 to 545, under the title of "Matrix Quantizer Design for LPC Speech Using the Generalized Lloyd Algorithm". Yet another is contributed by Laroia and two others to the Proceedings of ICASSP, 1991, pages 641 to 644, under the title of "Robust and Efficient Quantization of Speech LSP Parameters Using Structured Vector Quantizers".