1. Field of the Invention
This invention relates to a vector quantization method in which an input vector is compared with code vectors stored in a codebook and the index of the optimum code vector is output. The present invention also relates to a speech encoding method and apparatus in which an input speech signal is divided into pre-set encoding units, such as blocks or frames, and encoding processing including vector quantization is carried out on an encoding-unit basis.
2. Description of the Related Art
Vector quantization has hitherto been known in which, for digitizing and compression-encoding audio or video signals, a plurality of input data are grouped together into a vector and represented by a single code (index).
In such vector quantization, representative patterns of a variety of input vectors are determined in advance by, for example, learning, and are assigned codes or indices, which are then stored in a codebook. The input vector is then compared with the respective patterns (code vectors) by pattern matching, and the code of the pattern bearing the strongest similarity or correlation is output. This similarity or correlation is found by calculating a distortion measure or an error energy between the input vector and each code vector, and becomes higher as the distortion or error becomes smaller.
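The pattern matching described above can be sketched as an exhaustive search. This is a minimal illustration, assuming squared error as the error energy; the names `squared_error` and `full_search` and the toy codebook are hypothetical and do not come from the specification.

```python
def squared_error(x, c):
    """Error energy between input vector x and code vector c (assumed measure)."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def full_search(x, codebook):
    """Return the index of the code vector with the smallest error energy."""
    return min(range(len(codebook)), key=lambda i: squared_error(x, codebook[i]))


# Toy codebook of four 2-dimensional code vectors (illustrative values only).
codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0], [0.0, 2.0]]
index = full_search([0.9, 1.2], codebook)
```

Every code vector is visited once, so the cost of such a search grows linearly with the codebook size.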
A variety of encoding methods have hitherto been known which exploit statistical properties in the time domain or frequency domain and psychoacoustic properties of the human being for signal compression. These encoding methods are roughly classified into encoding in the time domain, encoding in the frequency domain and analysis-by-synthesis encoding.
Examples of high-efficiency encoding of a speech signal include sinusoidal analytic encoding, such as harmonic encoding, as well as sub-band coding (SBC), linear predictive coding (LPC), discrete cosine transform (DCT), modified DCT (MDCT) and fast Fourier transform (FFT).
In such high-efficiency encoding of speech signals, the above-mentioned vector quantization is used for parameters such as spectral components of the harmonics.
Meanwhile, if the number of patterns stored in the codebook, that is, the number of code vectors, is large, or if the vector quantizer is of a multi-stage configuration made up of plural codebooks combined together, the number of code vector search operations for pattern matching increases, and with it the processing volume. In particular, if plural codebooks are combined together, the similarity must be found as many times as the product of the numbers of code vectors in the respective codebooks, which increases the codebook search processing volume significantly.
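The growth in search cost for combined codebooks can be illustrated with a small sketch. It assumes that one code vector is taken from each of two codebooks and that the pair is summed to form the candidate; the specification does not state the combination rule at this point, so that rule, the function name `combined_full_search` and the toy data are assumptions.

```python
from itertools import product


def combined_full_search(x, codebook_a, codebook_b):
    """Exhaustively search all pairs (i, j); return the best pair and the count."""
    best_pair, best_err, evaluations = None, float("inf"), 0
    for (i, a), (j, b) in product(enumerate(codebook_a), enumerate(codebook_b)):
        evaluations += 1
        # Assumed combination rule: the candidate is the sum of the two code vectors.
        err = sum((xi - (ai + bi)) ** 2 for xi, ai, bi in zip(x, a, b))
        if err < best_err:
            best_pair, best_err = (i, j), err
    return best_pair, evaluations


codebook_a = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]   # 3 code vectors
codebook_b = [[0.0, 0.0], [0.5, 0.5]]               # 2 code vectors
best, evaluations = combined_full_search([1.4, 0.6], codebook_a, codebook_b)
```

With 3 and 2 code vectors the loop runs 3 x 2 = 6 times; with two codebooks of, say, 32 entries each it would run 1024 times per input vector.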
It is therefore an object of the present invention to provide a vector quantization method, a speech encoding method and a speech encoding apparatus capable of suppressing the codebook search processing volume.
To accomplish the above object, the present invention provides a vector quantization method including a step of finding, by approximation, the degree of similarity between an input vector to be vector quantized and all code vectors stored in a codebook, for pre-selecting plural code vectors bearing a high degree of similarity, and a step of ultimately selecting the one of the plural pre-selected code vectors that minimizes an error with respect to the input vector.
Because the ultimate selection is executed after the pre-selection, the pre-selection, which involves simplified processing, narrows the code vectors down to a smaller number of candidates, and only these candidates are subjected to the high-precision ultimate selection. This reduces the processing volume for codebook searching.
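The two-step search can be sketched as follows. This is only an illustration under stated assumptions: the pre-selection uses the inner product divided by the code vector norm as the approximate similarity, the ultimate selection uses squared error, and the names `similarity`, `squared_error`, `two_stage_search` and `num_candidates` are hypothetical.

```python
import math


def similarity(x, c):
    """Approximate similarity: inner product divided by the code vector norm."""
    norm = math.sqrt(sum(ci * ci for ci in c))
    if norm == 0.0:
        return 0.0
    return sum(xi * ci for xi, ci in zip(x, c)) / norm


def squared_error(x, c):
    """Error used for the ultimate selection (assumed measure)."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def two_stage_search(x, codebook, num_candidates):
    """Pre-select the most similar code vectors, then pick the minimum-error one."""
    ranked = sorted(range(len(codebook)),
                    key=lambda i: similarity(x, codebook[i]), reverse=True)
    candidates = ranked[:num_candidates]
    best = min(candidates, key=lambda i: squared_error(x, codebook[i]))
    return best, candidates


codebook = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
best, candidates = two_stage_search([0.9, 1.2], codebook, num_candidates=2)
```

Only `num_candidates` code vectors reach the error computation. Since the pre-selection is approximate, a small candidate count may in principle exclude the true minimum-error code vector, so the count trades accuracy against processing volume.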
The codebook may be constituted by plural codebooks, from each of which code vectors can be selected so as to represent an optimum combination. The degree of similarity may be an inner product of the input vector and the code vector, optionally divided by a norm or a weighted norm of each code vector.
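As a worked form of this measure, write $\mathbf{x}$ for the input vector and $\mathbf{c}_i$ for the $i$-th code vector (these symbols are introduced here for illustration and are not the specification's notation). The two unweighted variants of the degree of similarity are then:

```latex
\[
s_i = \mathbf{x}^{\mathsf{T}} \mathbf{c}_i
\qquad \text{or} \qquad
s_i = \frac{\mathbf{x}^{\mathsf{T}} \mathbf{c}_i}{\lVert \mathbf{c}_i \rVert}
\]
```

One way to see why the normalized form tracks the error, under the added assumption that the code vector is scaled by a freely chosen gain $g$ (an assumption not stated in this section), is the standard least-squares identity:

```latex
\[
\min_{g} \, \lVert \mathbf{x} - g\,\mathbf{c}_i \rVert^{2}
= \lVert \mathbf{x} \rVert^{2}
- \frac{\left( \mathbf{x}^{\mathsf{T}} \mathbf{c}_i \right)^{2}}{\lVert \mathbf{c}_i \rVert^{2}}
\]
```

Under that assumption, a larger value of $(\mathbf{x}^{\mathsf{T}} \mathbf{c}_i)^2 / \lVert \mathbf{c}_i \rVert^2$ corresponds to a smaller residual error.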
The present invention also provides a speech encoding method in which an input speech signal or short-term prediction residuals thereof are analyzed by sinusoidal analysis to find spectral components of the harmonics and in which parameters derived from the encoding-unit-based spectral components of the harmonics, as the input vector, are vector quantized for encoding. In the vector quantization, the degree of similarity between the input vector and all code vectors stored in a codebook is found by approximation for pre-selecting a smaller plural number of the code vectors having a high degree of similarity, and one of these pre-selected code vectors which minimizes an error with respect to the input vector is selected ultimately.
The degree of similarity may be an optionally weighted inner product between the input vector and the code vector, optionally divided by a norm or a weighted norm of each code vector. For weighting the norm, a weight whose energy is concentrated in the low frequency range and decreases towards the high frequency range may be used. Thus, the degree of similarity can be found by dividing the weighted inner product of the input vector and the code vector by the weighted norm of the code vector.
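A weighted version of the pre-selection measure might look like the sketch below. The weight shape `1 / (1 + k)`, the way the weight enters the inner product and the norm, and the names `lowpass_weights`, `weighted_similarity` and `preselect` are all assumptions made for illustration; the section only states that the weight is concentrated in the low frequency range and decreases towards the high range.

```python
import math


def lowpass_weights(dim):
    """Hypothetical weight: largest at index 0 (low frequency), decreasing after."""
    return [1.0 / (1 + k) for k in range(dim)]


def weighted_similarity(x, c, w):
    """Weighted inner product of x and c divided by the weighted norm of c."""
    inner = sum(wk * xk * ck for wk, xk, ck in zip(w, x, c))
    norm = math.sqrt(sum(wk * ck * ck for wk, ck in zip(w, c)))
    return inner / norm if norm > 0.0 else 0.0


def preselect(x, codebook, w, num_candidates):
    """Return the indices of the code vectors with the highest weighted similarity."""
    ranked = sorted(range(len(codebook)),
                    key=lambda i: weighted_similarity(x, codebook[i], w),
                    reverse=True)
    return ranked[:num_candidates]


w = lowpass_weights(3)
codebook = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
candidates = preselect([1.0, 0.2, 0.1], codebook, w, num_candidates=2)
```

The weighted norm of each code vector does not depend on the input vector, so when the weight is fixed it could be computed once and stored alongside the codebook rather than recomputed per search.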
The present invention is also directed to a speech encoding device for carrying out the speech encoding method.