In recent years, a speech encoder based on A-b-S vector quantization, such as a code-excited linear prediction (CELP) encoder, has been drawing attention in the fields of LAN systems, digital mobile radio systems, etc., as a promising speech encoder capable of compressing speech signal information without degrading its quality. In such a vector quantization speech encoder (hereinafter simply called the encoder), predictive weighting is applied to each code vector in a code book to reproduce a signal, and an error power between the reproduced signal and the input speech signal is evaluated to determine a number (index) for a code vector with the smallest error prior to transmission to the receiving end.
The encoder based on such an A-b-S vector quantization system performs linear predictive filtering on each of the speech source signal vectors according to about 1,000 patterns stored in the code book, and searches the about 1,000 patterns for the one pattern that minimizes the error between a reproduced signal and the input speech signal to be encoded.
Since the encoder is required to ensure the instantaneousness of voice communication, the above search process must be performed in real time. This means that the search process must be performed repeatedly at very short time intervals, for example, at 5 ms intervals, for the duration of voice communication.
However, as will be described in detail, the search process involves complex mathematical operations, such as filtering and correlation calculations, and the amount of calculation required for these mathematical operations will be enormous, for example, in the order of hundreds of megaoperations per second (Mops). To handle such operations, a number of chips will be required even if the fastest digital signal processors (DSPs) currently available are used. In portable telephone applications, for example, this will present a problem as it will make it difficult to reduce the equipment size and power consumption.
To overcome the above problem, the present applicant proposed, in Japanese Patent Application No. 3-127669 (Japanese Patent Unexamined Publication No. 4-352200), a speech encoding system using a tree-structure code book wherein instead of storing code vectors themselves as in previous systems, a code book, in which delta vectors representing differences between signal vectors are stored, is used, and these delta vectors are sequentially added and subtracted to generate code vectors according to a tree structure.
According to this system, the memory capacity required to store the code book can be reduced drastically; furthermore, since the filtering and correlation calculations, which were previously performed on each code vector, are performed on the delta vectors and the results are sequentially added and subtracted, a drastic reduction in the amount of calculation can be achieved.
In this system, however, the code vectors are generated as a linear combination of a small number of delta vectors that serve as fundamental vectors; therefore, the generated code vectors do not have components other than the delta vector components. More specifically, in a space where the vectors to be encoded are distributed (usually, 40- to 64-dimensional space), the code vectors can only be mapped in a subspace having a dimension corresponding at most to the number of delta vectors (usually, 8 to 10).
Accordingly, the tree-structure delta code book has had the problem that the quantization characteristic degrades as compared with the conventional code book free from structural constraints even if the fundamental vectors (delta vectors) are well designed on the basis of the statistic distribution of the speech signal to be encoded.
Noting that when the linear predictive filtering operation is performed on each code vector to evaluate the distance, amplification is not achieved uniformly for all vector components but is achieved with a certain bias, and that the contribution each delta vector makes to code vectors in the tree-structure delta code book can be changed by changing the order of the delta vectors, the present applicant proposed, in Japanese Patent Application No. 3-515016, a method of improving the characteristic by using a tree-structure code book wherein each time the coefficient of the linear predictive filter is determined, a filtering operation is performed on each delta vector and the resulting power (the length of the vector) is compared, as a result of which the delta vectors are reordered in order of decreasing power.
However, with this method also, code vectors are generated from a limited number of delta vectors, as with the previous method, so that there is a limit to improving the characteristic. A further improvement in the characteristic is therefore demanded.
Another challenge for the speech encoder based on A-b-S vector quantization is to realize variable bit rate encoding. Variable bit rate encoding is an encoding scheme capable of varying the bit rate such that the encoding bit rate is adaptively varied according to situations such as the remaining capacity of the transmission path, significance of the speech source, etc., to achieve a greater encoding efficiency as a whole.
If the vector quantization system is to be applied to variable bit rate voice encoding, it is necessary to prepare code books each containing patterns corresponding to each transmission rate, and perform encoding by switching the code book according to the desired transmission rate.
In the case of conventional code books each constructed from a simple arrangement of code vectors, N.times.M words of memory corresponding to the product of the vector dimension (N) and the number of patterns (M) would be necessary to store each code book. Since the number of patterns M is proportional to the n-th power of 2 where n is the bit length of an index of the code vector, the problem is that an enormous amount of memory will be required in order to increase the variable range of the transmission rate or to control the transmission rate in smaller increments.
Also, in variable bit rate transmission, there are cases in which the rate of the transmission signals has to be reduced according to a request from the transmission network side even after encoding. In such cases, the decoder has to reproduce the speech signal from bit-dropped information, i.e. information with some bits dropped from the encoded information generated by the encoder.
For scalar quantization, which is inferior in efficiency to vector quantization, various techniques have so far been devised to cope with bit drop situations, for example, by performing control so that bits are dropped from the LSB side in increasing order of significance, or by constructing a high bit rate quantizer in such a manner as to contain the quantization levels of a low bit rate quantizer (embedded encoding).
However, in the case of the vector quantization system that uses conventional code books constructed from a simple arrangement of code vectors, since no structuring schemes are employed in the construction of the code books, there are no differences in significance among index bits for a code vector (whether the dropped bit is the LSB or MSB, the result will be the same in that an entirely different vector is called), and the same techniques as employed for scalar quantization cannot be used. The resulting problem is that a bit drop situation will cause a significant degradation in sound quality.