1. Field of the Invention
The present invention relates to a vector search method for obtaining an optimal sound source vector in the vector quantization used for compression coding of an audio signal and an acoustic signal.
2. Description of the Prior Art
Various coding methods are known for compressing an audio signal and an acoustic signal by utilizing their statistical properties in the time domain and the frequency domain as well as the characteristics of human hearing. These coding methods can be divided into time-domain coding, frequency-domain coding, analysis-synthesis coding, and the like.
Known effective methods for compression coding of an audio signal and the like include sine wave analysis coding such as harmonic coding and multiband excitation (MBE) coding, as well as sub-band coding (SBC), linear predictive coding (LPC), discrete cosine transform (DCT), modified DCT (MDCT), fast Fourier transform (FFT), and the like.
When coding an audio signal, it is possible to predict a present sample value from past sample values, utilizing the fact that there is a correlation between adjacent sample values. Adaptive predictive coding (APC) utilizes this characteristic and codes the difference between a predicted value and the input signal, i.e., a prediction residual.
In this adaptive predictive coding, an input signal is fetched in a coding unit in which an audio signal can be regarded as almost stationary, for example, in a frame unit of 20 ms, and a linear prediction is carried out according to a prediction coefficient obtained by linear predictive coding (LPC) analysis, so as to obtain a difference between the predicted value and the input signal. This difference is quantized and multiplexed with the prediction coefficient and the quantization step width as auxiliary information, so as to be transmitted in a frame unit.
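The prediction residual described above can be illustrated by the following minimal sketch. The function name `prediction_residual`, the coefficient ordering, and the assumption that samples preceding the frame are zero are illustrative choices only and are not taken from the present description; quantization and multiplexing are omitted.

```python
def prediction_residual(frame, coeffs):
    """Return e[n] = x[n] - sum_k coeffs[k] * x[n-1-k].

    Assumption: samples before the start of the frame are treated as zero.
    """
    residual = []
    for n, x in enumerate(frame):
        predicted = sum(
            a * frame[n - 1 - k]
            for k, a in enumerate(coeffs)
            if n - 1 - k >= 0
        )
        residual.append(x - predicted)
    return residual


# A frame that doubles at each sample is predicted exactly by a
# first-order predictor with coefficient 2, except for the first sample.
frame = [1.0, 2.0, 4.0, 8.0]
residual = prediction_residual(frame, [2.0])
```

When the predictor matches the signal well, the residual has far less energy than the input, which is why it is the residual that is quantized.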
Next, code excited linear prediction (CELP) coding will be explained as a representative predictive coding method.
The CELP coding uses a noise dictionary called a codebook, from which the optimal noise vector for expressing an input audio signal is selected, and the number (index) of that vector is transmitted. In the CELP coding, a closed loop using analysis by synthesis (AbS) is employed for vector quantization of a time axis waveform, thus coding a sound source parameter.
FIG. 1 is a block diagram showing a configuration of an essential portion of a coding apparatus for coding an audio signal by using the CELP. Hereinafter, the CELP coding will be explained with reference to the configuration of this coding apparatus.
An audio signal supplied from an input terminal 10 is firstly subjected to the LPC (linear predictive coding) analysis in an LPC analyzer 20, and a prediction coefficient obtained is transmitted to a synthesis filter 30. Moreover, the prediction coefficient is also transmitted to a multiplexer 130.
The synthesis filter 30, configured with the prediction coefficient from the LPC analyzer 20, synthesizes a signal from signed vectors supplied from an adaptive codebook 40 and a noise codebook 60, which will be detailed later, through amplifiers 50 and 70 and an adder 80.
An adder 90 determines a difference between the audio signal supplied from the input terminal 10 and a prediction value from the synthesis filter 30, and this difference is transmitted to a hearing sense weighting block 100.
In the hearing sense weighting block 100, the difference obtained in the adder 90 is weighted, considering the characteristics of the hearing sense of a human. An error calculator 110 searches a signed vector to minimize a distortion of the difference weighted by the hearing sense, i.e., a difference between the prediction value from the synthesis filter 30 and the input audio signal, and gains of the amplifiers 50 and 70. The result of this search is transmitted as an index to the adaptive codebook 40, the noise codebook 60, and a gain codebook 120 as well as to the multiplexer 130 so as to be transmitted as a transmission path sign from an output terminal 140.
Thus, an optimal signed vector to express the input audio signal is selected from the adaptive codebook 40 and the noise codebook 60, and the optimal gain is determined for synthesizing them. It should be noted that the aforementioned synthesizing can be carried out after the hearing-sense weighting of the audio signal supplied from the input terminal 10, and signed vectors stored in the codebooks may be hearing-sense weighted.
Next, the aforementioned adaptive codebook 40, noise codebook 60, and gain codebook 120 will be explained.
In the CELP coding, a sound source vector for expressing an input audio signal is formed as a linear sum of a signed vector stored in the adaptive codebook 40 and a signed vector stored in the noise codebook 60. Here, the indexes of the respective codebooks used to express the sound source vector minimizing the hearing-sense weighted difference from the input signal vector are determined by calculating the output vector of the synthesis filter 30 for all the signed vectors stored and calculating errors in the error calculator 110.
Moreover, the gain of the adaptive codebook in the amplifier 50 and the gain of the noise codebook in the amplifier 70 are also coded by way of a similar search.
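The exhaustive search described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the apparatus of FIG. 1: a single codebook is searched, the hearing-sense weighting is omitted, the filter state is assumed to be zero at the start of each vector, and the names `synthesize` and `search_codebook` are hypothetical.

```python
def synthesize(excitation, coeffs):
    """All-pole synthesis: y[n] = x[n] + sum_k coeffs[k] * y[n-1-k] (zero initial state)."""
    out = []
    for n, x in enumerate(excitation):
        y = x + sum(
            a * out[n - 1 - k]
            for k, a in enumerate(coeffs)
            if n - 1 - k >= 0
        )
        out.append(y)
    return out


def search_codebook(target, codebook, coeffs):
    """Return (index, gain, error) minimizing ||target - gain * synthesized vector||^2."""
    target_energy = sum(t * t for t in target)
    best = None
    for index, vector in enumerate(codebook):
        y = synthesize(vector, coeffs)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # an all-zero output cannot match any target
        corr = sum(t * v for t, v in zip(target, y))
        gain = corr / energy                      # optimal gain for this entry
        error = target_energy - corr * corr / energy
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best


coeffs = [0.5]
codebook = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
# Target built from entry 1 with gain 2, so the search should recover both.
target = [2.0 * v for v in synthesize(codebook[1], coeffs)]
index, gain, error = search_codebook(target, codebook, coeffs)
```

Every entry passes through the synthesis filter once per search, which is the source of the large calculation amount when the codebook has many entries.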
The noise codebook 60 normally contains, as codebook vectors, a series of Gaussian noise vectors with variance 1, the number of which is 2 to the power of the number of bits. Normally, a combination of the codebook vectors is selected so as to minimize the distortion of the sound source vector obtained by applying an appropriate gain to these codebook vectors.
The quantization distortion when quantizing the selected codebook vectors can be reduced by increasing the number of dimensions of the codebook. For example, a codebook with 40 dimensions and 2 to the power of 9 (9 being the number of bits), i.e., 512 entries, is used.
By using this CELP coding, it is possible to obtain a comparatively high compression ratio and a preferable sound quality. However, the use of a codebook with a large number of dimensions requires a large calculation amount in the synthesis filter and a large memory amount for the codebook, which makes real-time processing difficult. If a high sound quality is to be assured, a great delay is caused. Moreover, there is another problem that an error of only one bit in the code causes a completely different vector to be reproduced. That is, such a coding is vulnerable to code errors.
In order to solve the aforementioned problems of the CELP coding, vector sum excited linear prediction (VSELP) coding is employed. Hereinafter, this VSELP coding will be explained with reference to FIG. 2 and FIG. 3.
FIG. 2 is a block diagram showing a configuration of a noise codebook used in a coding apparatus for coding an audio signal by way of the VSELP.
The VSELP coding employs a noise codebook 260 consisting of a plurality of predetermined basic vectors. Each of the M basic vectors stored in the noise codebook 260 is multiplied by a factor of +1 or −1 in code additional sections 270-1 to 270-M, according to the index decoded by a decoder 210. The M basic vectors multiplied by the factor +1 or −1 are combined with one another in an adder 280 to create 2^M noise signed vectors.
As a result, by carrying out a convolution calculation for the M basic vectors only, followed by addition and subtraction of the results, it is possible to obtain a convolution calculation result for all the noise signed vectors. Moreover, as only the M basic vectors need to be stored in the noise codebook 260, it is possible to reduce the memory amount. Also, it is possible to enhance the robustness against code errors, because the 2^M noise signed vectors created have a redundant configuration which can be expressed by addition and subtraction of the basic vectors.
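The saving described above follows from the linearity of convolution, and can be checked with the following minimal sketch. The basic vectors, the impulse response `h`, and the function names are illustrative values and names assumed here, with M = 3 for brevity; with M = 9 the same construction would give 512 vectors.

```python
from itertools import product


def filter_fir(vector, impulse_response):
    """Convolve vector with impulse_response, truncated to the length of vector."""
    return [
        sum(
            impulse_response[k] * vector[n - k]
            for k in range(min(n + 1, len(impulse_response)))
        )
        for n in range(len(vector))
    ]


def signed_sum(vectors, signs):
    """Return sum_m signs[m] * vectors[m], element by element."""
    return [
        sum(s * v[i] for s, v in zip(signs, vectors))
        for i in range(len(vectors[0]))
    ]


basis = [
    [1.0, 0.0, 2.0, -1.0],
    [0.0, 1.0, -1.0, 0.5],
    [2.0, -1.0, 0.0, 1.0],
]
h = [1.0, 0.5, 0.25]  # assumed impulse response of the synthesis filter

sign_patterns = list(product((1, -1), repeat=len(basis)))  # 2^M patterns
code_vectors = [signed_sum(basis, s) for s in sign_patterns]

# Direct approach: filter every one of the 2^M code vectors.
filtered_direct = [filter_fir(c, h) for c in code_vectors]

# VSELP approach: filter the M basic vectors once, then only add and subtract.
filtered_basis = [filter_fir(b, h) for b in basis]
filtered_by_sum = [signed_sum(filtered_basis, s) for s in sign_patterns]
```

In this sketch the direct approach performs 8 convolutions while the second performs 3, and both produce the same filtered vectors.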
FIG. 3 is a block diagram showing a configuration of an essential portion of a VSELP coding apparatus having the aforementioned noise codebook. In this VSELP coding apparatus, the number of vectors stored in the noise codebook, which is normally 512 in the ordinary CELP coding apparatus, is reduced to 9, and each of these signed vectors (basic vectors) is given a sign of +1 or −1 by a sign adder 365, so that a linear sum of these is obtained in an adder 370, so as to create 2^9 = 512 noise signed vectors.
As has been described above, the main features of the VSELP coding are that a noise signed vector is formed as a linear sum of basic vectors and that the gain of the adaptive codebook and the gain of the noise codebook are vector-quantized together.
The basic configuration of such a VSELP coding is a coding method of analysis by way of synthesis, i.e., carrying out a linear prediction synthesis of a pitch frequency component and a noise component as the excitation sources. That is, a waveform is selected in vector unit from an adaptive codebook 340 which depends on a pitch frequency of an input audio signal and a noise codebook 360 for carrying out a linear prediction synthesis, so as to select a signed vector and a gain which minimize the difference from the waveform of the input audio signal.
In the VSELP coding, a signed vector from the adaptive codebook expressing the pitch component of an input audio signal and a signed vector from the noise codebook expressing the noise component of the input audio signal are both vector-quantized, so as to simultaneously obtain two optimal parameters in combination.
In this process, as each basic vector has only the degree of freedom of being multiplied by +1 or −1 and the vector of the adaptive codebook is not orthogonal to the basic vectors, the coding efficiency is lowered if the CELP procedure is employed to successively determine the vector of the adaptive codebook and the gain of the noise signed vector. To cope with this, in the VSELP, the signs of the basic vectors are determined according to the following procedure.
First, the pitch frequency of the input audio signal is searched to determine a signed vector of the adaptive codebook. Next, each noise basic vector is projected onto a space orthogonal to the signed vector of the adaptive codebook and an inner product with the input vector is calculated, so as to determine the signed vector of the noise codebook.
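The projection step can be sketched as follows. The present description does not state how the signs are chosen from the inner products, so this sketch assumes an exhaustive search over all sign patterns that maximizes the squared inner product with the input vector divided by the energy of the candidate; that criterion, the function names, and the assumption that all vectors have already passed through the synthesis filter are illustrative only.

```python
from itertools import product


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project_out(vector, reference):
    """Remove from vector its component along reference (one Gram-Schmidt step)."""
    scale = dot(vector, reference) / dot(reference, reference)
    return [v - scale * r for v, r in zip(vector, reference)]


def best_signs(target, basis, adaptive):
    """Return (orthogonalized basis, sign pattern) under the assumed criterion."""
    ortho = [project_out(b, adaptive) for b in basis]
    best = None
    for signs in product((1, -1), repeat=len(ortho)):
        candidate = [
            sum(s * o[i] for s, o in zip(signs, ortho))
            for i in range(len(target))
        ]
        energy = dot(candidate, candidate)
        if energy == 0.0:
            continue
        score = dot(target, candidate) ** 2 / energy
        if best is None or score > best[1]:
            best = (signs, score)
    return ortho, best[0]


adaptive = [1.0, 0.0, 0.0, 0.0]
basis = [[1.0, 1.0, 0.0, 0.0], [2.0, 0.0, 1.0, 0.0]]
target = [5.0, 1.0, -1.0, 0.0]
ortho, signs = best_signs(target, basis, adaptive)
```

A sign pattern and its negation score equally under this criterion, because the sign of the gain absorbs the difference; the sketch keeps whichever is encountered first.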
Next, according to the two signed vectors determined, the gain codebook is searched to determine the combination of a gain β and a gain γ which minimizes the difference between the synthesized vector and the input audio signal. For quantization of the two gains, a pair of two equivalently transformed parameters is used. Here, β corresponds to a long-term prediction gain coefficient and γ corresponds to a scalar gain of the signed vector.
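The joint gain search can be sketched as follows. The sketch searches pairs (β, γ) directly and omits the parameter transformation mentioned above; the function name `search_gain_pair`, the example vectors, and the contents of the gain codebook are assumptions made for illustration.

```python
def search_gain_pair(target, adaptive, noise, gain_codebook):
    """Return (index, error) of the (beta, gamma) pair minimizing
    ||target - beta * adaptive - gamma * noise||^2."""
    best_index, best_error = None, None
    for index, (beta, gamma) in enumerate(gain_codebook):
        error = sum(
            (t - beta * p - gamma * c) ** 2
            for t, p, c in zip(target, adaptive, noise)
        )
        if best_error is None or error < best_error:
            best_index, best_error = index, error
    return best_index, best_error


adaptive = [1.0, 0.0, 1.0]   # assumed synthesized adaptive codebook vector
noise = [0.0, 1.0, -1.0]     # assumed synthesized noise signed vector
target = [0.5, 2.0, -1.5]    # equals 0.5 * adaptive + 2.0 * noise
gain_codebook = [(0.0, 1.0), (0.5, 2.0), (1.0, 0.5)]
gain_index, gain_error = search_gain_pair(target, adaptive, noise, gain_codebook)
```

Because both gains are evaluated as a pair, the interaction between the two vectors, which are not orthogonal in general, is taken into account in a single search.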
Although the calculation amount for the codebook search in the VSELP coding is lower than the calculation amount in the CELP coding, it is desired to further improve the processing speed and thereby further reduce the delay.