In fields such as digital communication, packet communication typified by Internet communication, and speech storage, speech signal encoders are used to compress speech information at high efficiency so as to make efficient use of radio transmission path capacity and storage media.
Among these, methods based on the CELP (Code Excited Linear Prediction) method are widely used in practice at intermediate and low bit rates. A CELP technique that uses pulse excitation as a drive excitation signal is described in “Code-Excited Linear Prediction (CELP): High-quality Speech at Very Low Bit Rates” by M. R. Schroeder and B. S. Atal, Proc. ICASSP-85, 25.1.1., pp.937-940, 1985.
In a CELP type speech encoding method, a digitized speech signal is divided into frames of a fixed frame length (approximately 5 ms to 50 ms), linear prediction of speech is performed on a per frame basis, and the resulting linear prediction residual (excitation signal) of each frame is encoded using an adaptive codebook and a fixed codebook (also referred to as a stochastic codebook, random codebook, noise codebook and so on) composed of known waveforms.
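The framing and linear prediction steps described above can be sketched as follows. This is a minimal illustration only, not the method of any particular codec: the function names, the prediction order, and the frame length of 160 samples (20 ms at an assumed 8 kHz sampling rate) are hypothetical, and the residual is computed within a single frame without the filter memory, windowing, or quantization that a practical encoder would use.

```python
def split_frames(signal, frame_len):
    """Split a signal into consecutive fixed-length frames (trailing partial frame dropped)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, frame_len)]

def autocorr(frame, order):
    """Autocorrelation values r[0..order] of one frame."""
    return [sum(frame[n] * frame[n - k] for n in range(k, len(frame)))
            for k in range(order + 1)]

def levinson_durbin(r, order):
    """Predictor coefficients a[1..order] with x[n] ~ sum_k a[k] * x[n-k]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or degenerate frame: keep remaining coefficients at zero
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:]

def residual(frame, coeffs):
    """Linear prediction residual of one frame (samples before the frame taken as zero)."""
    out = []
    for n in range(len(frame)):
        pred = sum(c * frame[n - 1 - k]
                   for k, c in enumerate(coeffs) if n - 1 - k >= 0)
        out.append(frame[n] - pred)
    return out

# Hypothetical example: one 160-sample frame of a decaying exponential.
frame = [0.9 ** n for n in range(160)]
coeffs = levinson_durbin(autocorr(frame, 1), 1)
res = residual(frame, coeffs)
print(coeffs[0], sum(e * e for e in res) / sum(x * x for x in frame))
```

In this sketch the residual carries far less energy than the frame itself, which is why it is the residual, rather than the speech waveform, that is encoded with the codebooks.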
The adaptive codebook holds drive excitation signals generated in the past and is used to represent a cyclic component of a speech signal. The fixed codebook holds a predetermined number of vectors, provided in advance and having predetermined shapes, and is chiefly used to represent a non-cyclic component that cannot be represented with the adaptive codebook.
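As a rough illustration of how past excitation can represent a cyclic component, the following sketch builds an adaptive codebook vector by reading the most recent `lag` samples of a past-excitation buffer and repeating them to fill a subframe. The function name, the buffer contents, and the integer-lag simplification are assumptions made for illustration; practical encoders typically also support fractional lags and apply a gain.

```python
def adaptive_codevector(past_excitation, lag, subframe_len):
    """Build an adaptive-codebook vector by repeating the last `lag` past samples."""
    if not 1 <= lag <= len(past_excitation):
        raise ValueError("lag must be between 1 and the buffer length")
    segment = past_excitation[-lag:]
    # When lag < subframe_len the segment repeats, giving a periodic vector.
    return [segment[n % lag] for n in range(subframe_len)]

# Hypothetical past excitation and a pitch lag of 3 samples.
past = [0.0, 1.0, 0.5, -0.5, -1.0, 0.2]
print(adaptive_codevector(past, lag=3, subframe_len=8))
# -> [-0.5, -1.0, 0.2, -0.5, -1.0, 0.2, -0.5, -1.0]
```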
As for the vectors stored in the fixed codebook, vectors composed of random noise sequences and/or vectors represented by combining a number of pulses are used.
A typical example of a fixed codebook that represents a vector by combining a number of pulses is the algebraic fixed codebook. The algebraic fixed codebook is described in detail, for example, in ITU-T Recommendation G.729 Annex-D. The algebraic fixed codebook has the advantage that the fixed excitation codebook can be searched with a small amount of computation and that the ROM capacity needed to hold excitation vectors is reduced. However, it remains difficult to represent a noise component accurately with this codebook.
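The idea of representing a vector by a small number of pulses can be sketched as follows. This is an assumed, simplified illustration rather than the codebook structure of any recommendation: the function name, the subframe length of 40 samples, and the pulse positions and signs are hypothetical. It shows why little ROM is needed, since only positions and signs describe the vector and no waveform table is stored.

```python
def pulse_vector(positions, signs, subframe_len):
    """Build a sparse excitation vector from pulse positions and signs (+1 or -1)."""
    vec = [0] * subframe_len
    for pos, sign in zip(positions, signs):
        if not 0 <= pos < subframe_len:
            raise ValueError("pulse position outside the subframe")
        if sign not in (1, -1):
            raise ValueError("pulse sign must be +1 or -1")
        vec[pos] += sign
    return vec

# Hypothetical example: two unit pulses in a 40-sample subframe.
vec = pulse_vector(positions=[3, 17], signs=[1, -1], subframe_len=40)
print([(i, v) for i, v in enumerate(vec) if v != 0])
# -> [(3, 1), (17, -1)]
```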
One method for solving this problem with the algebraic fixed codebook is the pulse dispersion technique. Pulse dispersion is disclosed in ITU-T Recommendation G.729 Annex-D. Pulse dispersion is a method for generating a fixed excitation vector by convolving a dispersion pattern (fixed waveform) with an excitation vector.
FIG. 1 is a block diagram showing an example of the configuration of a fixed excitation codebook having a conventional pulse dispersion structure. Dispersed pulse codebook 10 comprises pulse excitation codebook 11, dispersion vector convolution processor 12, and dispersion vector storage 13.
A pulse excitation vector is output from pulse excitation codebook 11, and a dispersion vector, taken from dispersion vector storage 13, is convolved with this pulse excitation vector in dispersion vector convolution processor 12, thereby generating a fixed excitation vector (noise excitation vector).
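The convolution step can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the example dispersion vector are hypothetical, and the result is truncated at the subframe boundary, which is only one possible treatment of the tail.

```python
def disperse(pulse_vec, dispersion_vec):
    """Convolve a pulse excitation vector with a dispersion vector, truncated to the subframe."""
    n = len(pulse_vec)
    out = [0.0] * n
    for pos, amp in enumerate(pulse_vec):
        if amp == 0:
            continue  # only the pulse positions contribute
        for k, d in enumerate(dispersion_vec):
            if pos + k < n:
                out[pos + k] += amp * d
    return out

# Hypothetical example: two pulses spread by a three-tap dispersion pattern.
pulses = [0, 1, 0, 0, -1, 0, 0, 0]
pattern = [1.0, 0.5, 0.25]
print(disperse(pulses, pattern))
# -> [0.0, 1.0, 0.5, 0.25, -1.0, -0.5, -0.25, 0.0]
```

Each pulse is replaced by a scaled copy of the dispersion pattern, so a sparse pulse vector becomes a denser waveform. Because the loop skips zero samples, the cost grows with the number of pulses rather than with the subframe length.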
Conventional pulse dispersion makes it possible to improve the performance of the pulse excitation codebook at low bit rates, for example below 4 kbit/s.
Still, greater quality improvement (that is, further improving the quality of decoded speech) will be required in next-generation mobile telephone systems, and it is difficult to meet such demand with existing technologies.
For instance, simply increasing the number of dispersion vector patterns does not by itself improve the quality of decoded speech, and doing so risks increasing the required memory capacity and making signal processing more complex.