1. Field of the Invention
The present invention relates to an improved technique for quantizing the spectral parameters used in a number of speech and/or audio coding techniques.
2. Brief Description of the Prior Art
The majority of efficient digital speech encoding techniques with good subjective quality/bit rate tradeoffs use a linear prediction model to transmit the time-varying spectral information.
One such technique, found in several international standards including ITU-T G.729, is the ACELP (Algebraic Code Excited Linear Prediction) technique [1].
In ACELP-like techniques, the sampled speech signal is processed in blocks of L samples called frames. For example, 20 ms is a popular frame duration in many speech encoding systems. This duration translates into L=160 samples for telephone speech (8000 samples/sec), or into L=320 samples when 7-kHz wideband speech (16000 samples/sec) is concerned.
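The frame-length arithmetic above can be checked with a minimal Python sketch. The function name `samples_per_frame` is hypothetical and introduced here only for illustration; it is not part of any codec standard.

```python
def samples_per_frame(frame_ms: float, sample_rate_hz: int) -> int:
    """Return L, the number of samples in one frame of the given duration."""
    # duration in seconds times samples per second
    return round(frame_ms * sample_rate_hz / 1000)


# 20 ms frames at the two sampling rates mentioned in the text
print(samples_per_frame(20, 8000))   # telephone speech: 160
print(samples_per_frame(20, 16000))  # 7-kHz wideband speech: 320
```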
Spectral information is transmitted for each frame in the form of quantized spectral parameters derived from the well-known linear prediction model of speech [2, 3], often called the LPC information.
In prior art related to frames between 10 and 30 ms, the LPC information transmitted per frame relates to a single spectral model.
The accuracy in transmitting the time-varying spectrum with a 10 ms refresh rate is of course better than with a 30 ms refresh rate; however, the difference is not worth tripling the coding rate.
The present invention circumvents the spectral-accuracy/coding-rate dilemma by combining two techniques, namely Matrix Quantization, used in very-low-bitrate applications where LPC models from several frames are quantized simultaneously [4], and an extension to matrices of inter-frame prediction [5].
References
[1] U.S. Pat. No. 5,444,816, issued Aug. 22, 1995, for an invention entitled "Dynamic Codebook for efficient speech coding based on algebraic code", J-P. Adoul & C. Laflamme, inventors.
[2] J. D. Markel & A. H. Gray, Jr., "Linear Prediction of Speech", Springer Verlag, 1976.
[3] S. Saito & K. Nakata, "Fundamentals of Speech Signal Processing", Academic Press, 1985.
[4] C. Tsao and R. Gray, "Matrix Quantizer Design for LPC Speech Using the Generalized Lloyd Algorithm", IEEE Trans. ASSP, Vol. 33, No. 3, pp. 537-545, June 1985.
[5] R. Salami, C. Laflamme, J-P. Adoul and D. Massaloux, "A toll quality 8 Kb/s Speech Codec for the Personal Communications System (PCS)", IEEE Transactions on Vehicular Technology, Vol. 43, No. 3, pp. 808-816, August 1994.
(a) forming a matrix, F, whose rows are the N LPC-spectral-model vectors;
(b) removing from F (possibly a constant-matrix term and) a time-varying prediction matrix, P, based on one or more previous frames, to obtain a residual matrix R; and
(c) Vector Quantizing said matrix R.
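Steps (a) to (c) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the text: the function names, the first-order predictor (a scalar `gain` applied to the previous frame's quantized residual), the toy codebook, and the squared-error distance measure.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def sq_dist(a: Matrix, b: Matrix) -> float:
    """Squared Euclidean distance between two equally sized matrices."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def predict(prev_residual: Matrix, gain: float) -> Matrix:
    """Assumed predictor: scale the previous frame's quantized residual."""
    return [[gain * x for x in row] for row in prev_residual]


def encode_frame(lpc_vectors: Sequence[Sequence[float]], mean: Matrix,
                 prev_residual: Matrix, codebook: List[Matrix],
                 gain: float = 0.5) -> int:
    # (a) form matrix F whose rows are the N LPC-spectral-model vectors
    F = [list(v) for v in lpc_vectors]
    # (b) remove the constant term and the prediction P to get residual R
    P = predict(prev_residual, gain)
    R = [[f - m - p for f, m, p in zip(fr, mr, pr)]
         for fr, mr, pr in zip(F, mean, P)]
    # (c) vector quantize R: index of the nearest codebook matrix
    return min(range(len(codebook)), key=lambda i: sq_dist(R, codebook[i]))


def decode_frame(index: int, mean: Matrix, prev_residual: Matrix,
                 codebook: List[Matrix], gain: float = 0.5) -> Matrix:
    """Rebuild the quantized F by adding back the mean and the prediction."""
    P = predict(prev_residual, gain)
    return [[r + m + p for r, m, p in zip(rr, mr, pr)]
            for rr, mr, pr in zip(codebook[index], mean, P)]


# Toy data: N = 2 spectral vectors of dimension 2 per frame (illustrative)
mean = [[0.3, 0.6], [0.3, 0.6]]
prev_residual = [[0.0, 0.0], [0.0, 0.0]]
codebook = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[0.1, 0.1], [0.1, 0.1]],
    [[-0.1, -0.1], [-0.1, -0.1]],
]
frame = [[0.41, 0.69], [0.39, 0.72]]
index = encode_frame(frame, mean, prev_residual, codebook)
quantized = decode_frame(index, mean, prev_residual, codebook)
```

In this sketch a single codebook index describes all N spectral models of the frame, and the decoder repeats the same prediction from its own copy of the previous quantized residual, so encoder and decoder stay in step.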