1. Field of the Invention
This invention relates to a vector quantization method in which an input vector is compared to code vectors stored in a codebook and an index of the optimum code vector is output. It also relates to a speech encoding method and apparatus in which an input speech signal is split into pre-set encoding units, such as blocks or frames, and encoding inclusive of vector quantization is performed on each encoding unit.
2. Description of the Related Art
Vector quantization is a known data compression technique used when digitizing and encoding audio or video signals. In it, plural input data are grouped into a vector, and the vector is represented by a single code or index.
In this vector quantization, representative patterns of a variety of input vectors are determined in advance by learning, and codes (indices) are assigned to the patterns, which are stored in a codebook. An input vector is compared to the patterns of the codebook (code vectors) by pattern matching, and the code of the pattern exhibiting the highest similarity or correlation is output. This similarity or correlation is found by calculating a distortion measure or error energy between the input vector and each code vector. The smaller the distortion or error, the higher the similarity or correlation.
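The codebook search described above can be illustrated with a minimal sketch. It assumes squared error as the distortion measure and an exhaustive search over a small hand-made codebook. The names `squared_error` and `quantize` and the example values are hypothetical and are not taken from the specification.

```python
from typing import List, Sequence


def squared_error(x: Sequence[float], c: Sequence[float]) -> float:
    """Error energy between an input vector and one code vector (assumed measure)."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def quantize(x: Sequence[float], codebook: List[Sequence[float]]) -> int:
    """Return the index of the code vector with the smallest distortion."""
    return min(range(len(codebook)), key=lambda i: squared_error(x, codebook[i]))


# Hypothetical three-entry codebook of two-dimensional code vectors.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
index = quantize([0.9, 1.2], codebook)  # the second entry (index 1) is closest
```

Only the index is transmitted. The decoder, holding the same codebook, looks up the corresponding code vector.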
A variety of encoding methods are known for compressing an audio signal (inclusive of speech and acoustic signals) by exploiting statistical properties of the signal in the time domain and in the frequency domain and psychoacoustic characteristics of the human ear. These encoding methods may be roughly classified into time-domain encoding, frequency-domain encoding and analysis/synthesis encoding.
Examples of high-efficiency encoding of speech signals include sinusoidal analytic encoding, such as harmonic encoding or multi-band excitation (MBE) encoding, sub-band coding (SBC), linear predictive coding (LPC), discrete cosine transform (DCT), modified DCT (MDCT) and fast Fourier transform (FFT).
In such high-efficiency encoding of speech signals, the above-mentioned vector quantization is applied to parameters such as the resulting spectral components of the harmonics.
Meanwhile, in harmonic encoding of speech signals, the number of spectral components of the harmonics in a pre-set frequency range varies with the pitch. For an effective frequency range of up to 3400 Hz, the number of spectral components of the harmonics varies from 8 to 63 depending on pitch changes of female and male speech. Therefore, if the amplitudes of these spectral components are grouped into a vector, a variable-dimension vector is produced, which is difficult to vector quantize directly. Thus, the present Assignee has proposed in Japanese Laid-Open Patent 6-51800 to convert the variable-dimension vector into a pre-set fixed-dimensional vector prior to vector quantization.
In this proposal, the number of amplitude data of the spectral components of the harmonics is converted to a pre-set number, such as 44, by data number conversion, and the resulting fixed-dimensional vector is then vector quantized.
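The variable/fixed dimensional conversion can be sketched as follows. The text above does not state how the data number conversion is carried out, so this sketch substitutes plain linear interpolation as an assumption. It should not be read as the conversion used in the cited publication. The function name `convert_dimension` and the sample amplitudes are hypothetical.

```python
from typing import List, Sequence


def convert_dimension(amplitudes: Sequence[float], target_dim: int = 44) -> List[float]:
    """Resample a variable-length amplitude vector to target_dim points.

    Assumption: linear interpolation over evenly spaced positions.
    """
    n = len(amplitudes)
    if n == 0:
        raise ValueError("at least one amplitude is required")
    if n == 1 or target_dim == 1:
        return [float(amplitudes[0])] * target_dim
    out = []
    for k in range(target_dim):
        pos = k * (n - 1) / (target_dim - 1)  # fractional position in the input
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append((1.0 - frac) * amplitudes[lo] + frac * amplitudes[hi])
    return out


# Vectors with 8 and 63 harmonic amplitudes both become 44-dimensional.
short_vec = convert_dimension([float(i) for i in range(8)])
long_vec = convert_dimension([float(i) for i in range(63)])
```

Because every input is mapped to the same dimension, a single codebook of 44-dimensional code vectors can serve all pitch values.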
When the fixed-dimensional vector obtained by data number conversion (variable/fixed dimensional conversion) is vector quantized, the code vector selected by codebook retrieval (codebook search) does not necessarily minimize the distortion or error with respect to the original variable-dimension vector (spectral components of the harmonics).
On the other hand, if the number of patterns stored in the codebook, that is, the number of code vectors, is large, or if a multi-stage vector quantizer made up of a combination of plural codebooks is used, the number of retrieving operations (searching operations) for code vectors increases, and the processing volume increases with it. In particular, if plural codebooks are used in combination with each other, similarity must be assessed a number of times equal to the product of the numbers of code vectors in the respective codebooks, which significantly increases the processing volume for codebook search.
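The growth in processing volume can be illustrated with a small sketch of an exhaustive joint search over two codebooks. It assumes that the reconstruction is the sum of one code vector from each codebook and that squared error is the distortion measure. The function name `joint_search` and the codebook contents are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple


def joint_search(
    x: Sequence[float],
    codebook_a: List[Sequence[float]],
    codebook_b: List[Sequence[float]],
) -> Tuple[int, int, int]:
    """Try every pair of code vectors; return (index_a, index_b, evaluations)."""
    best_err = float("inf")
    best_pair = (0, 0)
    evaluations = 0
    for (i, a), (j, b) in product(enumerate(codebook_a), enumerate(codebook_b)):
        # Assumed combination rule: the two code vectors are added.
        err = sum((xv - (av + bv)) ** 2 for xv, av, bv in zip(x, a, b))
        evaluations += 1
        if err < best_err:
            best_err = err
            best_pair = (i, j)
    return best_pair[0], best_pair[1], evaluations


codebook_a = [[0.0, 0.0], [2.0, 2.0]]
codebook_b = [[0.0, 0.0], [0.5, -0.5], [-0.5, 0.5]]
i, j, count = joint_search([2.4, 1.6], codebook_a, codebook_b)
# count equals len(codebook_a) * len(codebook_b), here 2 * 3 = 6
```

With two codebooks of, say, 32 code vectors each, the same loop would perform 32 × 32 = 1024 distortion evaluations per input vector.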