Various methods for efficiently coding and decoding audio signals have been proposed. In recent years, the MPEG audio method has been proposed especially for audio signals having a frequency band exceeding 20 kHz, such as music signals. In the coding methods represented by the MPEG method, a digital audio signal on the time axis is transformed into data on the frequency axis using an orthogonal transform such as the cosine transform, and the data on the frequency axis are coded in order of auditive importance, using the auditive sensitivity characteristic of human beings, whereas auditively unimportant data and redundant data are not coded. To express an audio signal with a data quantity considerably smaller than that of the original digital signal, there are coding methods that use vector quantization, such as TC-WVQ. The MPEG audio and the TC-WVQ are described in “ISO/IEC standard IS-11172-3” and “T. Moriya, H. Suga: An 8 Kbits transform coder for noisy channels, Proc. ICASSP 89, pp. 196-199”, respectively.
Hereinafter, the structure of a conventional audio coding apparatus will be explained using FIG. 24. In FIG. 24, reference numeral 1601 denotes an FFT unit which frequency-transforms an input signal; 1602 denotes an adaptive bit allocation calculating unit which calculates an adaptive bit allocation from a minimum audible limit and a masking characteristic so that specific bands of the frequency-transformed input signal are coded; 1603 denotes a sub-band division unit which divides the input signal into plural bands; 1604 denotes a scale factor normalization unit which normalizes each component of the plural divided bands using a scale factor; and 1605 denotes a scalar quantization unit which performs scalar quantization of the normalized output from the scale factor normalization unit 1604, according to the bit allocation from the adaptive bit allocation calculating unit 1602.
The operation will now be described. An input signal is input to the FFT unit 1601 and the sub-band division unit 1603. In the FFT unit 1601, the input signal is subjected to frequency conversion, and the output is input to the adaptive bit allocation calculating unit 1602. In the adaptive bit allocation calculating unit 1602, the data quantity to be given to each band component is calculated on the basis of the masking characteristic and the minimum audible limit, which is defined according to the auditive characteristic of human beings, and the data quantity allocation for each band is coded as an index.
On the other hand, in the sub-band division unit 1603, the input signal is divided into, for example, 32 bands, which are then output. In the scale factor normalization unit 1604, each band component obtained in the sub-band division unit 1603 is normalized with a representative value. The normalized value is quantized as an index. In the scalar quantization unit 1605, on the basis of the bit allocation calculated by the adaptive bit allocation calculating unit 1602, the output from the scale factor normalization unit 1604 is scalar-quantized, and the quantized value is coded as an index IND2.
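The normalization and scalar quantization steps described above can be sketched as follows. This is a minimal illustration only, not the procedure of the MPEG standard: the function names, the use of the peak magnitude as the band representative value, and the uniform quantizer are all assumptions made for the example.

```python
def normalize_band(samples):
    """Normalize one band by its peak magnitude.

    The peak is an assumed stand-in for the band representative value
    (scale factor); a real coder picks it from a quantized table.
    """
    scale = max(abs(s) for s in samples) or 1.0  # avoid dividing by zero
    return scale, [s / scale for s in samples]


def scalar_quantize(normalized, bits):
    """Uniformly quantize values in [-1, 1] to 2**bits - 1 signed levels."""
    if bits < 2:
        return [0] * len(normalized)  # band receives no usable bits
    half = (2 ** bits - 1) // 2  # e.g. bits=3 -> indices -3..3
    return [round(x * half) for x in normalized]


def dequantize(indices, bits, scale):
    """Decoder side: map indices back to sample values."""
    if bits < 2:
        return [0.0] * len(indices)
    half = (2 ** bits - 1) // 2
    return [scale * i / half for i in indices]


# One band, coded with the 3 bits assumed to come from the bit allocation.
band = [3.0, -4.0, 1.0, 0.0]
scale, normalized = normalize_band(band)
indices = scalar_quantize(normalized, bits=3)
decoded = dequantize(indices, bits=3, scale=scale)
```

With fewer bits allocated to a band the quantizer step grows, which is why the bit allocation directly controls the error in each band.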
Meanwhile, various methods of efficiently coding an acoustic signal have been proposed. Especially in recent years, a signal having a frequency band of about 20 kHz, such as a music signal, is coded using the MPEG audio method or the like. In the methods represented by the MPEG method, a digital audio signal on the time axis is transformed to the frequency axis using an orthogonal transform, and data quantities are allocated to the data on the frequency axis with priority given to the auditively important data, in consideration of the auditive sensitivity characteristic of human beings. To express a signal with a data quantity considerably smaller than that of the original digital signal, a coding method using vector quantization, such as TCWVQ (Transform Coding for Weighted Vector Quantization), is employed. The MPEG audio and the TCWVQ are described in “ISO/IEC Standard IS-11172-3” and “T. Moriya, H. Suga: An 8 Kbits Transform Coder for Noisy Channels, Proc. ICASSP 89, pages 196-199”, respectively.
In the conventional audio signal coding apparatus constructed as described above, the MPEG audio method is generally used so that coding is carried out with a data quantity of 64000 bits/sec for each channel. With a data quantity smaller than this, the reproducible frequency bandwidth and the subjective quality of the decoded audio signal are sometimes degraded considerably. The reason is as follows. As in the example shown in FIG. 24, the coded data are divided into three main parts, i.e., the bit allocation obtained by the adaptive bit allocation calculating unit 1602, the band representative value obtained by the scale factor normalization unit 1604, and the quantized value obtained by the scalar quantization unit 1605. Consequently, when the compression ratio is high, a sufficient data quantity is not allocated to the quantized value. Further, in the conventional audio signal coding apparatus, a coder and a decoder are generally constructed so that the data quantity to be coded and the data quantity to be decoded are equal to each other. For example, in a method where a data quantity of 128000 bits/sec is coded, a data quantity of 128000 bits/sec is decoded in the decoder.
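The squeeze on the quantized values can be made concrete with a rough bit budget. The sketch below is only illustrative: the frame length of 384 samples, the 48 kHz sampling rate, and the 4 allocation bits and 6 scale factor bits per band are assumed figures chosen for the example, not values taken from the description above, and header and other overhead bits are ignored.

```python
def bits_left_for_quantized_values(bitrate_bps, sample_rate_hz=48000,
                                   frame_samples=384, bands=32,
                                   alloc_bits_per_band=4,
                                   scale_bits_per_band=6):
    """Split one frame's bits into side information and quantized values.

    All default parameters are hypothetical, chosen only to illustrate
    how fixed side information eats into a shrinking frame budget.
    """
    frame_bits = bitrate_bps * frame_samples // sample_rate_hz
    # Bit allocation index plus band representative value, for every band.
    side_bits = bands * (alloc_bits_per_band + scale_bits_per_band)
    return frame_bits, side_bits, frame_bits - side_bits


for rate in (128000, 64000, 48000):
    frame_bits, side_bits, value_bits = bits_left_for_quantized_values(rate)
    print(rate, frame_bits, side_bits, value_bits)
```

Under these assumptions the side information stays at 320 bits per frame while the bits left for the quantized values fall from 704 at 128000 bits/sec to 192 at 64000 bits/sec and 64 at 48000 bits/sec.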
However, in the conventional audio signal coding and decoding apparatuses, coding and decoding must be carried out with a fixed data quantity to obtain a satisfactory sound quality and, therefore, it is impossible to obtain a high-quality sound at a high compression ratio.
The present invention is made to solve the above-mentioned problems and has for its object to provide audio signal coding and decoding apparatuses and an audio signal coding and decoding method, in which a high quality and a broad reproduction frequency band are obtained even when coding and decoding are carried out with a small data quantity and, further, the data quantity in the coding and decoding can be variable, not fixed.
Furthermore, in the conventional audio signal coding apparatus, quantization is carried out by outputting a code index corresponding to the code that provides the minimum auditive distance between each code possessed by a code book and an audio feature vector. However, when the number of codes possessed by the code book is large, the calculation amount increases significantly when retrieving an optimum code. Further, when the data quantity possessed by the code book is large, a large quantity of memory is required when the coding apparatus is constructed by hardware, and this is uneconomical. Further, on the receiving end, retrieval and a memory quantity corresponding to the code indices are required.
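The retrieval cost described above can be seen in a minimal sketch of an exhaustive code book search. The function name and the weighted squared error, used here as a stand-in for the auditive distance, are assumptions for illustration; the description above does not specify the distance measure.

```python
def nearest_code_index(vector, code_book, weights=None):
    """Return the index of the code closest to `vector`.

    Distance is a weighted squared error, an assumed stand-in for the
    auditive distance. Every code is visited once, so the work grows
    linearly with the number of codes in the code book.
    """
    if weights is None:
        weights = [1.0] * len(vector)
    best_index, best_distance = -1, float("inf")
    for index, code in enumerate(code_book):
        distance = sum(w * (v - c) ** 2
                       for w, v, c in zip(weights, vector, code))
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index


# A toy code book with four two-dimensional codes (hypothetical values).
code_book = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0], [2.0, 0.0]]
index = nearest_code_index([0.9, 0.8], code_book)
```

Only the index is transmitted, so the decoder must hold the same code book in memory, which is the receiving-end cost noted above.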
The present invention is made to solve the above-mentioned problems and has for its object to provide an audio signal coding apparatus that reduces the number of code retrievals and efficiently quantizes an audio signal with a code book having fewer codes, and an audio signal decoding apparatus that can decode the audio signal.
Furthermore, the present invention has for its object to provide audio signal coding and decoding apparatuses, and an audio signal coding and decoding method, that can significantly improve the quantization efficiency.