Various methods for high efficiency coding of an audio or speech signal are available. For example, subband coding (SBC) is a non-blocking frequency band dividing method wherein an audio signal or a like signal on the time base is not blocked but is divided into a plurality of frequency bands and coded. Transform coding is a blocking frequency band dividing method wherein a signal on the time base is transformed (spectrum transformed) into a signal on the frequency base, divided into a plurality of frequency bands, and then coded for each frequency band.
A method for high efficiency coding which combines the subband coding and the transform coding described above is also available. In this method, band division is first performed by subband coding, and then the signals in each band are spectrum transformed into signals on the frequency base, whereafter coding is performed for the spectrum transformed signals for each band. For example, a quadrature mirror filter can be used as a filter for the band division mentioned above. The quadrature mirror filter is disclosed in R. E. Crochiere, "Digital coding of speech in subbands", Bell Syst. Tech. J., Vol. 55, No. 8, 1976.
Meanwhile, a filter dividing method having subbands of equal bandwidths is disclosed in Joseph H. Rothweiler, "Polyphase Quadrature filters--A new subband coding technique", ICASSP 83, Boston. As the spectrum transform mentioned above, the input audio signal is, for example, blocked with a predetermined unit time (frame), and a discrete Fourier transform (DFT), discrete cosine transform (DCT) or modified discrete cosine transform (MDCT) is performed for each block to transform the signal on the time base into a signal on the frequency base. The MDCT is disclosed in J. P. Princen and A. B. Bradley, "Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation", Univ. of Surrey Royal Melbourne Inst. of Tech., ICASSP, 1987.
By quantizing signals divided into bands by a filter or spectrum transform in this manner, the band in which quantization noise is produced can be controlled. In addition, higher efficiency coding in terms of the auditory sense can be achieved by making use of masking effects. Specifically, if each band is normalized before quantization, for example by the maximum of the absolute values of the signal components in the band, still higher efficiency coding can be achieved.
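The normalization and quantization of one band can be sketched as follows. This is a minimal illustration, not the procedure of any cited reference: the function names, the sample band values and the choice of a symmetric uniform quantizer with an odd number of steps are all assumptions made for the example.

```python
def normalize_and_quantize(band, steps):
    """Normalize a band by its peak magnitude, then quantize uniformly.

    `steps` is assumed odd (1, 3, 5, 7, ...), so the quantized levels run
    from -(steps - 1) / 2 to +(steps - 1) / 2. Returns (scale, levels).
    """
    scale = max(abs(v) for v in band)      # normalization coefficient
    half = (steps - 1) // 2                # largest quantized magnitude
    if scale == 0.0 or half == 0:
        return scale, [0] * len(band)      # one step: everything becomes 0
    return scale, [round(v / scale * half) for v in band]


def dequantize(scale, levels, steps):
    """Reconstruct approximate signal values from the quantized levels."""
    half = (steps - 1) // 2
    if half == 0:
        return [0.0] * len(levels)
    return [scale * q / half for q in levels]


# Hypothetical band of four spectral components, quantized with 7 steps.
band = [-0.2, 0.6, 0.0, 0.2]
scale, levels = normalize_and_quantize(band, steps=7)
restored = dequantize(scale, levels, steps=7)
```

Because the scale is taken from the largest component, the full range of quantized levels is used in every band regardless of the absolute signal level, which is why normalization improves efficiency.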
Division of the frequency band is performed taking the auditory sense characteristic of the human being into consideration. In particular, an audio signal is frequency divided into a plurality of bands (for example, 25 bands), normally called critical bands, whose bandwidths increase as the frequency increases. Further, coding is performed with predetermined bit allocation for the individual bands or with adaptive bit allocation for the individual bands.
For example, coefficient data obtained by the MDCT processing mentioned above are coded with an adaptively allocated number of bits. The following two methods are known for the bit allocation.
The first method is disclosed in R. Zelinski and P. Noll, "Adaptive Transform Coding of Speech Signals", IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-25, No. 4, August 1977. Here, bit allocation is performed based on the magnitude of the signal in each band. With this method, a flat quantization noise spectrum is obtained and the noise energy is minimized. However, since the masking effects of the auditory sense are not utilized, the noise actually perceived by a listener is not minimized.
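Magnitude-based allocation of this kind can be sketched with the classical textbook rule, in which each band receives the average number of bits plus half the base-2 logarithm of the ratio of its energy to the geometric mean of all band energies. The rule is offered here as an assumption for illustration and is not quoted from the cited reference; the function name and the sample energies are hypothetical.

```python
import math


def allocate_bits(band_energies, total_bits):
    """Allocate bits per band from band energies (all assumed positive).

    Each band gets total_bits / n plus 0.5 * log2(energy / geometric mean).
    The result may be fractional or negative; a practical coder would
    round and clamp it, which is omitted in this sketch.
    """
    n = len(band_energies)
    log_energies = [math.log2(e) for e in band_energies]
    mean_log = sum(log_energies) / n       # log2 of the geometric mean
    return [total_bits / n + 0.5 * (le - mean_log) for le in log_energies]


# Four hypothetical bands, each 6 dB (a factor of 4 in energy) apart.
bits = allocate_bits([16.0, 4.0, 1.0, 0.25], total_bits=12)
```

Since each additional bit lowers the quantization noise of a band by about 6 dB, giving one extra bit per factor of 4 in energy leaves the same noise level in every band, which is the flat noise spectrum described above.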
The second method is disclosed in M. A. Krasner, "The critical band coder--digital encoding of the perceptual requirements of the auditory system", ICASSP, MIT, 1980. The document describes a method of performing fixed bit allocation wherein auditory sense masking is utilized to obtain a required signal to noise ratio for each band. With this method, however, because the bit allocation is fixed, the measured characteristic is not very good even when it is measured with a sine wave input.
In order to solve these problems, a high efficiency coding apparatus has been proposed wherein all available bits are divided into bits which are used for a fixed bit allocation pattern determined in advance for each subblock and bits which are used for bit allocation which relies upon the magnitude of a signal of each block. The dividing ratio between them is determined based upon the input signal so that the number of bits allocated to the fixed bit pattern is increased as the spectra of the signals mentioned above become smoother.
With the apparatus described above, when energy is concentrated in a particular spectrum such as when a sine wave is inputted, a comparatively large number of bits are allocated to the block which includes the spectrum. Accordingly, the overall signal to noise ratio characteristic can be improved remarkably. Generally, since the auditory sense of the human being is very sensitive to a signal having a steep spectral component, improvement of signal to noise ratio by employment of such a method as described above also improves the sound quality perceived by the auditory sense.
Various other methods have been proposed for bit allocation. As the model of the auditory sense is refined and the capability of the coding apparatus is improved, coding of higher efficiency in terms of the auditory sense can be achieved.
The inventors of the present invention have proposed, in PCT application No. PCT/JP94/00880 filed on May 31, 1994 and corresponding U.S. patent application Ser. No. 08/374,518 (now U.S. Pat. No. 5,717,821), a method wherein a tonal component which is particularly important in terms of the auditory sense is separated from a spectrum signal and is coded separately from the other spectrum components. By this method, an audio signal or a like signal can be coded efficiently at a high compression ratio while causing little perceptible deterioration.
Where the DFT or DCT mentioned above is used as a method of transforming a waveform signal into a spectrum, M independent real number data are obtained by transformation of a time block including M samples. In order to reduce connection distortion between time blocks, M1 samples are usually overlapped with each other between two adjacent blocks. Consequently, according to the DFT or DCT, when averaged, M real number data are quantized and coded for (M-M1) samples.
On the other hand, where the MDCT mentioned above is used as a method of transforming a waveform signal into a spectrum, M independent real number data are obtained from a block of 2M samples, of which M samples are overlapped with each of the two adjacent blocks. Consequently, according to the MDCT, when averaged, M real number data are quantized and coded for M samples. In a decoding apparatus, an inverse transform is performed for each block from the codes obtained using the MDCT. The waveform elements obtained by the inverse transform are overlapped and added to each other so that the aliasing components of adjacent blocks cancel, thereby regenerating the waveform signal.
Generally, by increasing the length of a time block for transformation, the frequency resolution of the spectrum increases and energy is concentrated upon particular spectrum components. Where the MDCT is used, a long block is transformed with each half thereof overlapped with the adjacent block on that side, and the number of spectrum signals obtained does not increase with respect to the original number of time samples. Therefore, coding of a higher efficiency can be performed than where the DFT or DCT is used. Further, by providing a sufficiently large overlap length between adjacent blocks, the inter-block distortion of a waveform signal can be reduced.
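The relationship between 2M-sample blocks, M coefficients per block and overlap-add reconstruction can be sketched as below. This is a direct, unoptimized illustration rather than a practical implementation: the sine window, the 2/M scaling of the inverse transform, the zero padding at both ends and all function names are assumptions chosen so that the example is self-contained.

```python
import math


def sine_window(M):
    """Window of length 2M satisfying w[n]^2 + w[n + M]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * M)) for n in range(2 * M)]


def mdct(block, window):
    """Transform 2M windowed time samples into M coefficients."""
    M = len(block) // 2
    return [sum(window[n] * block[n]
                * math.cos(math.pi / M * (n + 0.5 + M / 2) * (k + 0.5))
                for n in range(2 * M))
            for k in range(M)]


def imdct(coeffs, window):
    """Transform M coefficients back into 2M windowed, aliased samples."""
    M = len(coeffs)
    return [2.0 / M * window[n]
            * sum(coeffs[k]
                  * math.cos(math.pi / M * (n + 0.5 + M / 2) * (k + 0.5))
                  for k in range(M))
            for n in range(2 * M)]


def analyze_synthesize(signal, M):
    """Code a signal block by block and rebuild it by overlap-add.

    len(signal) is assumed to be a multiple of M. Returns the rebuilt
    signal and the total number of coefficients produced.
    """
    padded = [0.0] * M + list(signal) + [0.0] * M
    out = [0.0] * len(padded)
    window = sine_window(M)
    n_coeffs = 0
    for start in range(0, len(padded) - M, M):   # hop of M, block of 2M
        coeffs = mdct(padded[start:start + 2 * M], window)
        n_coeffs += len(coeffs)
        for n, y in enumerate(imdct(coeffs, window)):
            out[start + n] += y                  # aliasing cancels here
    return out[M:-M], n_coeffs


signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(32)]
rebuilt, n_coeffs = analyze_synthesize(signal, M=8)
max_error = max(abs(a - b) for a, b in zip(signal, rebuilt))
```

Each block advances by M samples and yields M coefficients, so apart from the one extra block caused by the padding, the number of coefficients equals the number of input samples even though adjacent blocks overlap by half.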
In order to form an actual code train, quantization accuracy information and normalization coefficient information should be coded with a predetermined bit number for each band. The normalized and quantized spectrum signal should then be coded.
A method of coding a spectrum signal which uses variable length coding such as Huffman coding is known. Huffman coding is disclosed, for example, in David A. Huffman, "A Method for the Construction of Minimum-Redundancy Codes", Proceedings of the I.R.E., September 1952, pp. 1098-1101.
Another method is also known which employs a multidimensional variable length code by which a plurality of spectrum signals are represented collectively by a single code. Generally, in a coding method which employs a multidimensional variable length code, the compression efficiency increases as the order number of the code increases. However, since the scale of the code train table increases progressively as the order number increases, the method gives rise to a problem in practical use. In practice, an optimum order number conforming to the object is selected taking the compression efficiency and the scale of the code train table into consideration.
Generally, in an acoustic waveform signal, energy is frequently concentrated upon a fundamental frequency component and frequency components which are integral multiples of the fundamental frequency, that is, harmonic components. Since spectrum signals around those frequencies have very low levels compared with those of the harmonic components, they are quantized to 0 with a relatively high probability. In order to code such signals efficiently, spectrum signals which are quantized to 0, and which occur with a high probability, should be coded with the least information possible. Where one-dimensional variable length codes are used, even if each spectrum signal is coded with one bit, which is the smallest code length, N bits are required for N spectra. Where a multidimensional variable length code of the order number N is used, since N spectra can be coded with one bit of the smallest code length, efficient coding can be performed for signals having such frequency components as described above.
However, while increasing the order number of the codes is considerably advantageous in terms of the compression efficiency, where practical use is taken into consideration, it is impossible to increase the order number indefinitely.
Normally, a code train table is prepared for the quantization accuracy set for each band. Where the quantization accuracy is low, since the number of values of spectrum signals which can be represented is small, even if the order number is increased, the scale of the code train table is not increased very much. However, where the quantization accuracy is high, the number of values of spectrum signals which can be represented is larger. Therefore, even if the order number is increased by only one, the scale of the code train table is increased remarkably.
This will be described in more detail using a concrete example. Now, it is assumed that an input signal is MDCT transformed to obtain such spectra as seen in FIG. 1. In FIG. 1, absolute values of spectra of the MDCT are converted in level into dB values and indicated on the axis of ordinate. The axis of abscissa indicates the frequency, and an input signal is transformed into 32 spectrum signals taking a predetermined time as a block. The spectrum signals are divided into 6 coded units denoted by [1] to [6] in FIG. 1. Each coded unit includes a plurality of spectra, and normalization and quantization are performed for each coded unit.
By varying the quantization accuracy for each coded unit depending upon the distribution of frequency components, coding which minimizes the deterioration in sound quality and is efficient in terms of the auditory sense can be performed. The quantization accuracy required for each coded unit can be obtained by calculating a minimum audible level or a masking level in a band corresponding to each coded unit based on an auditory sense model. Each normalized and quantized spectrum signal is transformed into a variable length code and is coded together with the quantization accuracy information and normalization information for each coded unit.
Table A illustrates a method of representation when quantization accuracy information is transmitted. Where a quantization accuracy information code is represented with 3 bits, eight different kinds of quantization accuracy information can be set. In the example illustrated, quantization is performed with one of 8 different step numbers of one step, 3 steps, 5 steps, 7 steps, 15 steps, 31 steps, 63 steps and 127 steps.
Here, quantization into one step signifies that spectrum signals in the coded unit are all quantized into the value of 0.
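The 3-bit quantization accuracy information code can be represented as a lookup table such as the one sketched below. Table A is not reproduced in this section, so the assignment of the eight step numbers to the codes in ascending order is an assumption; it agrees with the codes 001, 010 and 011 used for 3, 5 and 7 steps in the examples that follow. The names `QUANT_STEPS` and `level_range` are hypothetical.

```python
# Step numbers listed in the text, assumed to be assigned in ascending
# order to the codes 000 through 111.
STEP_NUMBERS = [1, 3, 5, 7, 15, 31, 63, 127]
QUANT_STEPS = {format(i, "03b"): steps for i, steps in enumerate(STEP_NUMBERS)}


def level_range(code):
    """Return the (lowest, highest) quantized value for a 3-bit code."""
    half = (QUANT_STEPS[code] - 1) // 2
    return -half, half
```

With this mapping, `level_range("000")` gives `(0, 0)`, which corresponds to the one-step case in which every spectrum signal of the coded unit is quantized to 0.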
FIG. 2 is a diagrammatic view illustrating a variable length coding method which has ordinarily been performed. Spectrum signals are quantized with the quantization accuracy information determined for each coded unit to obtain quantized spectra. The quantized spectra are transformed into corresponding code trains by referring to a code train table such as that shown in Table B.
Referring to FIG. 2, for the coded unit [1], for example, a code 011 is selected as the quantization accuracy information. Accordingly, quantization is performed with 7 steps as seen from Table B, and the values of the quantized spectrum signals are, in order from the lower frequency side, -1, 3, 0, 1. Here, it is to be noted that the level of each spectrum signal is represented by an absolute value.
If the spectrum signals are transformed into code trains using the code train table portion of Table B wherein the quantization accuracy information code is 011, then code trains of 101, 1110, 0, 100 are obtained, respectively. Thus, they have the code lengths of 3, 4, 1 and 3, respectively.
Meanwhile, for the coded unit [2], another code 010 is selected as the quantization accuracy information. In this instance, quantization is performed with 5 steps as seen from Table B. Of the code trains which can be used here, the five code trains 111, 101, 0, 100 and 110 have the smallest word lengths. This is because, for example, if the code trains 10, 11, 0, 01 and 101 are used, then when 0101001 is received as a code train, it cannot be distinguished whether it is 0, 101, 0, 01 or 01, 0, 10, 01 or else 01, 01, 0, 01. Accordingly, it should be noted that there are limitations on the code trains which can be used.
Where the quantization accuracy information code is 010, the values of the quantized spectrum signals are, in order from the lower frequency side, 0, -2, 1, 0. If they are transformed into code trains using the code train table portion of Table B wherein the quantization accuracy information code is 010, then the code trains are 0, 111, 100, 0, and they have code lengths of 1, 3, 3, 1, respectively.
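The limitation on usable code trains can be checked mechanically. The sketch below enumerates every way a received bit string can be split into code trains and tests whether any code train is a prefix of another; the function names are hypothetical, and the bit string 01111000 is simply the concatenation 0, 111, 100, 0 formed from the first set of code trains.

```python
def all_parses(bits, codewords):
    """Return every way of splitting `bits` into a sequence of codewords."""
    if not bits:
        return [[]]
    parses = []
    for word in codewords:
        if bits.startswith(word):
            for rest in all_parses(bits[len(word):], codewords):
                parses.append([word] + rest)
    return parses


def is_prefix_free(codewords):
    """True if no codeword is the beginning of a different codeword."""
    return not any(a != b and b.startswith(a)
                   for a in codewords for b in codewords)


usable = ["111", "101", "0", "100", "110"]
ambiguous = ["10", "11", "0", "01", "101"]

ambiguous_parses = all_parses("0101001", ambiguous)   # more than one result
unique_parses = all_parses("01111000", usable)        # exactly one result
```

The second set fails the prefix test because, for example, 0 is the beginning of 01 and 10 is the beginning of 101, which is what allows several decodings of the same bit string. A set that passes the prefix test can always be decoded in only one way.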
As seen from Table B, code trains are prepared by a number equal to the quantization step number. Accordingly, for example, where the quantization accuracy information code is 111, 127 different code trains are prepared. Consequently, a total of 252 different code trains are prepared.
Similarly, for the coded unit [3], a further code 001 is selected as the quantization accuracy information, and quantization is performed with 3 steps. Thus, the quantized spectra are 0, -1, 0, 0, the code trains are 0, 11, 0, 0, and the code lengths are 1, 2, 1, 1.
FIG. 3 illustrates a two-dimensional variable length coding method. The two-dimensional code train table as seen in Table C is used when the quantization accuracy information code is 001. The quantization spectra of each of coded units [1], [2] and [3] shown in FIG. 3 are collected two by two into a group which is transformed into one code train. Accordingly, quantized spectra 0, -1, 0, 0 of the coded unit [3] are collected as (0, -1) and (0, 0) and transformed into two code trains of 101, 0.
Where the quantized spectra 0, -1, 0, 0 of the coded unit [3] are coded into one-dimensional variable length codes as seen in FIG. 2, the required information amount is 1+2+1+1=5 bits. In contrast, if they are coded into two-dimensional variable length codes as seen in FIG. 3, then the information amount is 3+1=4 bits. Thus, it can be seen that use of two-dimensional variable length coding can allow coding with a reduced information amount.
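The bit counts for the coded unit [3] can be reproduced with the following sketch. Only the table entries quoted in the text are included, so the dictionaries are deliberately partial and are not Tables B and C themselves; the function and variable names are hypothetical.

```python
# Entries quoted in the text for quantization accuracy code 001 (3 steps).
ONE_DIM_TABLE = {0: "0", -1: "11"}
TWO_DIM_TABLE = {(0, 0): "0", (0, -1): "101"}


def encode_one_dim(spectra, table):
    """Code each quantized spectrum separately."""
    return [table[value] for value in spectra]


def encode_two_dim(spectra, table):
    """Group the quantized spectra two by two and code each pair."""
    pairs = zip(spectra[0::2], spectra[1::2])
    return [table[pair] for pair in pairs]


def total_bits(code_trains):
    return sum(len(code) for code in code_trains)


unit3 = [0, -1, 0, 0]
codes_1d = encode_one_dim(unit3, ONE_DIM_TABLE)   # ['0', '11', '0', '0']
codes_2d = encode_two_dim(unit3, TWO_DIM_TABLE)   # ['101', '0']
```

The saving comes from the pair (0, 0), which occupies one bit in the two-dimensional table but two bits when its members are coded separately.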
As described above, by coding two quantized spectra into one variable length code in accordance with a code train table, the code length can be decreased compared with that where a one-dimensional variable length code is used.
However, if a plurality of (N) quantized spectra are collected into one group to obtain N-dimensional data and the N-dimensional data are coded into a variable length code, then a very large scale code train table is required. This makes it difficult to put the coding apparatus into practical use.
For example, the four quantized spectra of the coded unit [3] shown in FIG. 3 may be collected into one group and coded as four-dimensional data. Since each of the four spectra can take one of three values, 3^4 (= 81) code trains are required.
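The growth of the code train table can be computed directly: a table for a given quantization step number and order number needs one code train per combination of values, that is, the step number raised to the power of the order number. The sketch below uses the eight step numbers listed earlier; the function name is hypothetical.

```python
STEP_NUMBERS = [1, 3, 5, 7, 15, 31, 63, 127]


def table_entries(steps, order):
    """Number of code trains needed to code `order` spectra collectively."""
    return steps ** order


# One-dimensional tables for all eight quantization accuracies together.
one_dim_total = sum(table_entries(s, 1) for s in STEP_NUMBERS)

# Size of each table as the order number goes from 1 to 4.
growth = {s: [table_entries(s, n) for n in range(1, 5)] for s in STEP_NUMBERS}
```

For 3 steps the table grows from 3 to 81 entries as the order number goes from 1 to 4, whereas for 127 steps a single increase of the order number from 1 to 2 already raises the table from 127 to 16129 entries.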