1. Field of The Invention
The present invention relates to an audio data compression apparatus for use in an audio data compression/decompression system which compresses audio data for transmission or recording and decompresses the transmitted or recorded data for reproduction, and more particularly to a high efficiency encoding apparatus for compressing the audio data at a high compression factor and with high efficiency.
2. Description of The Related Art
Prior art references related to the present invention are:
Document 1: JP-A-4-250722
Document 2: JP-A-5-19798
Document 3: JP-A-5-37395
Document 4: ISO/IEC 11172-3, 1993, Information Technology - Coding of moving picture and associated audio for digital storage media at up to 1.5 Mbit/s, Annex C, pp. 66, 70-72
Various methods for efficiently coding (data compressing) an audio signal are known such as those disclosed in the above Documents 2 and 4. One example is a band division coding system (sub-band coding system) which divides a digital audio signal into a plurality of frequency bands for coding.
In the band division coding system, an input digital audio signal is sampled at a predetermined sampling period and the following band division coding is applied to the audio signal sampled in each sampling period. First, the sampled audio signal is transformed into audio signals of a plurality of frequency bands by a filter bank circuit and the signals contained in the respective frequency bands are subjected to floating by a floating process circuit. The floating process is a process to modify levels of signals contained in each frequency band by using a common coefficient to raise precision in a subsequent quantization process. For example, a process to normalize the signals contained in each frequency band based on a maximum absolute value therein may be used as the floating process. The common coefficient used in the modification in the floating process, or the signal used as a reference of the normalization when the normalization is used as the floating process is referred to as a floating coefficient.
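As a minimal sketch of the normalization variant of the floating process described above, the following Python function scales the samples of one band by their maximum absolute value and returns that value as the floating coefficient. The function name `float_band` and the treatment of an all-zero band are assumptions made for illustration and are not taken from the cited documents.

```python
def float_band(samples):
    """Normalize one band's samples by their maximum absolute value.

    Returns (floating_coefficient, normalized_samples). Returning the
    samples unchanged with a coefficient of 0.0 for an all-zero band is
    an assumption made for this sketch.
    """
    coeff = max((abs(s) for s in samples), default=0.0)
    if coeff == 0.0:
        return 0.0, list(samples)
    # After scaling, every sample lies in the range [-1.0, 1.0].
    return coeff, [s / coeff for s in samples]
```

For the samples 0.5, -2.0 and 1.0, the floating coefficient would be 2.0 and the normalized samples 0.25, -1.0 and 0.5.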
The input audio signal is also applied to a signal characteristic calculation circuit, which determines its signal characteristic. An adaptive bit allocation circuit then determines an allocated bit-number, i.e. the number of bits to be used for representing the audio signals contained in each frequency band, based on the signal characteristic and on a separately inputted bit rate, i.e. a predetermined number of bits per unit time to be used for representing the compressed audio signal.
A quantization circuit provided for each frequency band quantizes the audio signal, after the floating process, contained in the frequency band based on the allocated bit-number as determined for the frequency band thereby to output encoded data. In this manner, the encoded data of the audio signal contained in each frequency band is produced.
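The quantization step can be illustrated, under stated assumptions, by a uniform quantizer that maps normalized samples onto as many levels as the allocated bit-number permits. The quantizer type (uniform, with 2 to the power of `bits` levels), the function name `quantize_band` and the rule that a band with no allocated bits produces no codes are assumptions for this sketch; the cited documents may use a different quantizer.

```python
def quantize_band(normalized, bits):
    """Quantize samples in [-1.0, 1.0] to integer codes 0 .. 2**bits - 1.

    Assumes a uniform quantizer; a band whose allocated bit-number is 0
    carries no data and yields an empty list.
    """
    if bits <= 0:
        return []
    levels = 2 ** bits
    codes = []
    for x in normalized:
        index = int((x + 1.0) / 2.0 * levels)         # map [-1, 1] onto [0, levels]
        codes.append(min(max(index, 0), levels - 1))  # clamp to a valid code
    return codes
```

The more bits a band is allocated, the finer the quantization step and the lower the quantization noise in that band.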
The signal characteristic calculation circuit and the adaptive bit allocation circuit have been known as disclosed in, for example, the above Documents 1 and 3. To fully understand the present invention, some explanation is added below. First, a circuit configuration of a prior art adaptive bit allocation circuit is explained with reference to FIG. 7. The adaptive bit allocation circuit allocates the number of bits to be used to represent the compressed audio signal to each band so as to enhance a signal-to-noise ratio (S/N ratio) of the audio signal contained in each band or to reduce the noise level.
As shown in FIG. 7, the adaptive bit allocation circuit includes a memory circuit 1, a maximum value detection circuit 2, a bit distribution circuit 4 and a signal-to-noise ratio modification circuit 5. The signal characteristic determined by the signal characteristic calculation circuit, or for example, a signal representing a magnitude of a signal energy of the audio signal contained in each frequency band is applied to a terminal 61 and stored in the memory circuit 1.
The maximum value detection circuit 2 detects a maximum of the energy values of the audio signals contained in all the bands stored in the memory circuit 1 to determine the band which contains the maximum. The bit distribution circuit 4 allocates a unit bit to the band containing the maximum. Namely, it increments the number of bits to be used to represent the audio signal contained in the band containing the maximum by the unit bit, for example, one bit. Each band is initially allocated with "0", for example, as the number of bits to represent the audio signal contained therein. Then, the signal-to-noise ratio modification circuit 5 calculates a modified value corresponding to the enhancement of the signal-to-noise ratio by the increment of the unit bit and modifies the energy value, as stored in the memory circuit, of the audio signal contained in the band containing the maximum by the modified value. The modified value corresponding to the enhancement of the signal-to-noise ratio (S/N ratio) is a modified value based on the decrease of a relative noise due to the increment of the number of bits to represent the audio signal by one bit and it is calculated by a predetermined formula. A specific method for determining the modified value is well known and the explanation thereof is omitted.
In the bit distribution circuit 4, the total number of bits distributed to the bands is checked, and if it is within the range permitted by the bit rate indicated by the bit rate signal applied to the input terminal 11, the detection of the band containing the maximum is repeated and the distribution of the unit bit is continued. In this manner, the bit length to be used to represent the audio signal contained in each band is determined by the total number of bits distributed to that band, and it is outputted from the terminal 12.
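The loop formed by the maximum value detection, the bit distribution and the signal-to-noise ratio modification can be sketched as follows. This is a simplified illustration, not the circuit of FIG. 7 itself: the function name `allocate_bits`, the use of decibel values, the figure of about 6.02 dB of S/N improvement per added bit (the usual rule of thumb for uniform quantization) and the assumption that each step consumes exactly one unit bit of the budget are all assumptions, since the text omits the actual formula.

```python
def allocate_bits(band_energy_db, bit_budget, unit_bit=1, gain_db_per_bit=6.02):
    """Greedy bit distribution over frequency bands.

    band_energy_db  : per-band characteristic values (assumed to be in dB)
    bit_budget      : total bits available (stands in for the bit rate)
    gain_db_per_bit : assumed S/N improvement for each added bit
    """
    levels = list(band_energy_db)   # working copy, in the role of the memory circuit
    bits = [0] * len(levels)        # every band initially has 0 bits
    remaining = bit_budget
    while levels and remaining >= unit_bit:
        # Maximum value detection: find the band with the largest value.
        band = max(range(len(levels)), key=levels.__getitem__)
        # Bit distribution: give that band one more unit bit.
        bits[band] += unit_bit
        # S/N modification: lower the stored value by the assumed gain.
        levels[band] -= gain_db_per_bit * unit_bit
        remaining -= unit_bit
    return bits
```

With per-band values of 30, 10 and 20 dB and a budget of four bits, this sketch gives three bits to the first band, none to the second and one to the third.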
The signal characteristic determined by the signal characteristic calculation circuit may be the magnitude of the energy for each band. Alternatively, an allowable noise spectrum for each band may be used by utilizing an audible masking effect. A prior art configuration therefor is explained with reference to FIG. 8.
The masking effect refers to a phenomenon in which, owing to the human auditory characteristic, a certain sound is masked by another sound so that it is not audible to a human listener. The masking effect includes a temporal masking effect, in which the masking is caused by signals which are close on a time axis, and a simultaneous masking effect, in which the masking is caused by signals which are close on a frequency axis.
Even if noise is contained in the masked portion, the noise is inaudible because of the masking effect. Thus, noise within the range which is masked by the actual audio signal is considered permissible.
As shown in FIG. 8, the digital input data is applied through an input terminal 48 to the energy calculation circuit 51 for calculating the energy for each band. In the energy calculation circuit 51, the data is divided into a plurality of frequency bands in the same manner as in the filter bank circuit and the energy for each band is calculated based on the audio signal contained in each band by, for example, calculating the root-mean-square value of the amplitude.
A peak amplitude may be used instead of the energy. Alternatively, the signal representing the floating coefficient 46 may be used for this purpose.
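The two per-band measures mentioned above, the root-mean-square value and the peak amplitude, can be computed as in the following sketch. The function names `band_rms` and `band_peak` and the value returned for an empty band are assumptions for illustration.

```python
import math


def band_rms(samples):
    """Root-mean-square amplitude of one band's samples (0.0 if empty)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def band_peak(samples):
    """Peak absolute amplitude of one band's samples (0.0 if empty)."""
    return max((abs(s) for s in samples), default=0.0)
```

The peak amplitude computed this way is the same quantity that the normalization-type floating process uses as its floating coefficient, which is why the floating coefficient can serve in place of the energy.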
Then, in the subtraction circuit 56, an absolute threshold, which corresponds to the minimum human auditory characteristic and is output from a minimum auditory characteristic table circuit 52, is subtracted from the signal energy of each band outputted from the energy calculation circuit 51.
In a masking effect modification circuit 57 in a stage following the subtraction circuit 56, a modification which accounts for the masking effect is applied by using the permissible noise spectrum. Specifically, the permissible noise spectrum is subtracted from the signal energy. The resulting characteristic signal is outputted to the adaptive bit allocation circuit through an output terminal 61.
FIG. 6 shows an example of the energy of each band, the absolute threshold and the masking threshold. In FIG. 6, the frequency range is divided into 18 bands. The energy of each band at a certain time, as calculated by the energy calculation circuit of FIG. 8, has a distribution pattern as shown by "E" in FIG. 6.
The absolute threshold which represents the human auditory characteristic has a distribution pattern which is high at a high frequency and also at a low frequency as shown by AS. The subtraction circuit 56 produces a difference between the energy E and the absolute threshold AS. The masking threshold by the masking effect is calculated by the masking characteristic calculation circuit 53 and has a distribution pattern as shown by MS in FIG. 6.
The masking effect appears in an area which is close to a peak of the spectrum. By taking its effect into consideration, the masking effect modification circuit 57 of FIG. 8 modifies the permissible noise spectrum AS by MS, and the bit allocation is carried out by utilizing the resulting permissible noise level AS+MS. The circuit parts constituting the signal characteristic calculation circuit of FIG. 8 are known and detailed description thereof is omitted.
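One possible reading of the two subtraction stages of FIG. 8 is sketched below: the absolute threshold AS is first subtracted from the band energy E, and the masking threshold MS is then subtracted from the result, which is equivalent to subtracting the permissible noise level AS+MS. It is assumed here that E, AS and MS are expressed per band on a common logarithmic scale and that they combine by plain addition; the function name `characteristic_signal` is hypothetical.

```python
def characteristic_signal(energy, absolute_threshold, masking_threshold):
    """Per-band energy E minus the permissible noise level AS + MS.

    All three inputs hold one value per band, assumed to share one
    logarithmic scale (for example dB).
    """
    if not (len(energy) == len(absolute_threshold) == len(masking_threshold)):
        raise ValueError("all inputs must have one value per band")
    result = []
    for e, a, m in zip(energy, absolute_threshold, masking_threshold):
        after_absolute = e - a             # role of the subtraction circuit 56
        result.append(after_absolute - m)  # role of the modification circuit 57
    return result
```

A band whose resulting value is large has signal well above the permissible noise level and is therefore a candidate for receiving more bits.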
In the prior art audio data compression apparatus, the amounts of calculation in the filter bank process, the floating process, the quantization process and the signal characteristic calculation process are substantially constant independent of the bit rate.
However, in the distribution of the bits to each band, the number of bits to be handled becomes larger, and hence the amount of calculation becomes larger, as the bit rate becomes higher. As a result, the higher the bit rate is, the longer the processing time required for compression by the entire audio data compression apparatus becomes.
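The dependence of the calculation amount on the bit rate can be illustrated with a rough count of distribution steps. The sketch assumes that the audio is processed in frames, that the bit budget of a frame equals the bit rate multiplied by the frame duration, and that each step of the distribution loop consumes exactly one unit bit; the function name `distribution_steps` and the frame length and sampling rate in the usage note are hypothetical.

```python
def distribution_steps(bit_rate, frame_samples, sample_rate, unit_bit=1):
    """Estimate how many unit-bit distribution steps one frame requires.

    Assumes a frame budget of bit_rate * frame_samples / sample_rate bits
    and that every step consumes exactly unit_bit bits of that budget.
    """
    bits_per_frame = bit_rate * frame_samples // sample_rate
    return bits_per_frame // unit_bit
```

With a hypothetical frame of 1152 samples at a sampling rate of 48000 Hz, a bit rate of 128000 bit/s gives 3072 steps and a bit rate of 256000 bit/s gives 6144 steps, so the number of steps grows in proportion to the bit rate.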
Further, the method of calculating the permissible noise spectrum by using the signal characteristic calculation circuit of FIG. 8 has the problem that, although a high quality of sound is attained by utilizing the human auditory characteristic, the calculation of the permissible noise spectrum requires a large amount of calculation regardless of the bit rate.