The present invention relates to speech parameter encoders for encoding the spectrum parameters of a speech signal with high quality at low bit rates.
As a method of speech parameter encoding, i.e., of encoding speech signal spectrum parameters at a bit rate as low as 2 kb/s, there is known the vector-scalar quantization (VQ-SQ) method, which uses LSP (Line Spectrum Pair) coefficients as the spectrum parameters. For a specific method, reference may be made to, for instance, T. Moriya et al., "Transform Coding of Speech using a Weighted Vector Quantizer", IEEE J. Sel. Areas Commun., pp. 425-431, 1988 (Literature 1). In this method, the LSP coefficients obtained as the spectrum parameter for each frame are first quantized and decoded with a previously formed vector quantization codebook, and the error signal between the original LSP and the quantized, decoded LSP is then scalar-quantized. The vector quantization codebook is formed in advance by training on a large spectrum parameter database such that it comprises 2^B different codevectors (B being the number of bits for spectrum parameter quantization). For the codebook training method, reference may be made to, for instance, Linde et al., "An Algorithm for Vector Quantizer Design", IEEE Trans. Commun., COM-28, pp. 84-95, 1980 (Literature 2).
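The two-stage VQ-SQ procedure described above can be sketched as follows. This is only a minimal illustration, not the method of Literature 1: the function names, the plain squared-error distortion, the uniform scalar quantizer, and the `step` and `levels` parameters are all assumptions introduced for the example.

```python
def squared_error(x, c):
    """Plain squared error between two vectors (an assumed distortion measure)."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def nearest_codevector(x, codebook):
    """Index of the codevector closest to x under squared error."""
    return min(range(len(codebook)), key=lambda j: squared_error(x, codebook[j]))


def scalar_quantize(e, step, levels):
    """Uniform scalar quantizer: integer level clipped to [-levels, levels]."""
    q = int(round(e / step))
    return max(-levels, min(levels, q))


def vq_sq_encode(lsp, codebook, step=0.01, levels=3):
    """Stage 1: vector-quantize the LSP vector. Stage 2: scalar-quantize the error."""
    idx = nearest_codevector(lsp, codebook)
    residual = [x - c for x, c in zip(lsp, codebook[idx])]
    levels_out = [scalar_quantize(e, step, levels) for e in residual]
    return idx, levels_out


def vq_sq_decode(idx, levels_in, codebook, step=0.01):
    """Rebuild the LSP vector from the codevector plus the dequantized error."""
    return [c + q * step for c, q in zip(codebook[idx], levels_in)]


# Toy codebook with 2^B = 4 codevectors (B = 2) of 4 dimensions; values are made up.
TOY_CODEBOOK = [
    [0.10, 0.20, 0.40, 0.50],
    [0.15, 0.30, 0.45, 0.60],
    [0.05, 0.15, 0.30, 0.45],
    [0.20, 0.35, 0.50, 0.65],
]
```

Calling `vq_sq_encode(lsp, TOY_CODEBOOK)` returns the codevector index together with one scalar level per dimension, and `vq_sq_decode` reverses both stages. A real codebook would be trained beforehand, for instance by the method of Literature 2.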
Further, as a more efficient well-known encoding method, there is the split vector quantization method, in which the LSP parameter vector (of 10 dimensions, for instance) is divided into a plurality of divisions (each of 5 dimensions, for instance), and a separate vector quantization codebook is searched to quantize each division. For the details of this method, reference may be made to, for instance, K. K. Paliwal et al., "Efficient Vector Quantization of LPC Parameters at 24 Bits/Frame", IEEE Trans. Speech and Audio Processing, pp. 3-14, 1993 (Literature 3).
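The split search described above can be sketched as follows. The function names, the squared-error distortion, and the toy codebooks are assumptions made for illustration; they are not taken from Literature 3.

```python
def squared_error(x, c):
    """Plain squared error between two vectors (an assumed distortion measure)."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def split_vq_encode(lsp, codebooks, split_sizes):
    """Quantize each division of the LSP vector with its own codebook.

    split_sizes lists the dimension of each division, e.g. [5, 5] for a
    10-dimensional vector; codebooks holds one codebook per division.
    """
    if sum(split_sizes) != len(lsp):
        raise ValueError("split sizes must add up to the vector dimension")
    indices = []
    start = 0
    for size, codebook in zip(split_sizes, codebooks):
        sub = lsp[start:start + size]
        best = min(range(len(codebook)),
                   key=lambda j: squared_error(sub, codebook[j]))
        indices.append(best)
        start += size
    return indices


def split_vq_decode(indices, codebooks):
    """Concatenate the selected codevectors to rebuild the full vector."""
    decoded = []
    for idx, codebook in zip(indices, codebooks):
        decoded.extend(codebook[idx])
    return decoded


def codevectors_needed(bits_per_division):
    """Total number of stored codevectors when each division has its own codebook."""
    return sum(2 ** b for b in bits_per_division)
```

The efficiency gain comes from the codebook sizes: 20 bits spent on a single codebook would need 2^20 codevectors, whereas the same 20 bits split as 10 + 10 over two divisions needs only 2 x 1024, which is what `codevectors_needed([10, 10])` computes.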
In order to reduce the bit rate of the spectrum parameter encoding to 1 kb/s or less, the number of spectrum parameter quantization bits must be reduced to 20 bits per frame or less (with a frame length of 20 ms), while the distortion due to the spectrum parameter quantization is held within the perceptual limit of the auditory sense. With the prior art methods this has been difficult, because the distortion measure does not reflect the characteristics of the auditory sense, so that speech quality deteriorates greatly when the number of quantization bits is reduced to 20 or less.
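The relation between the bit budget per frame and the bit rate stated above is simple arithmetic, and can be checked with the short sketch below. The function names are hypothetical and are introduced only for this example.

```python
def bit_rate_bps(bits_per_frame, frame_ms):
    """Bit rate in bits per second for a given bit budget and frame length."""
    return bits_per_frame * 1000.0 / frame_ms


def max_bits_per_frame(rate_bps, frame_ms):
    """Largest whole number of bits per frame that stays within rate_bps."""
    return (rate_bps * frame_ms) // 1000
```

With a frame length of 20 ms, `bit_rate_bps(20, 20)` gives 1000 b/s, i.e. 1 kb/s, and `max_bits_per_frame(1000, 20)` gives the 20-bit budget; the 2 kb/s figure of the VQ-SQ method corresponds to 40 bits per frame at the same frame length.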