This invention relates to an apparatus and method for compressing a digital input signal to provide a compressed signal for storing in a medium or for transmission, and to an apparatus for complementarily expanding the compressed signal to provide a digital output signal.
There are a variety of methods for compressing audio or voice signals. These methods include, for example, sub-band coding (SBC) in which audio signals etc. are divided on the frequency axis into plural frequency bands for quantizing. In adaptive transform coding (ATC), signals on the time axis are converted into signals on the frequency axis by an orthogonal transform to provide plural spectral coefficients that are then quantized. In adaptive bit allocation (APC-AB), SBC is combined with adaptive predictive coding (APC). Signals on the time axis are divided into plural frequency bands, the band signals are converted into base band signals, and plural orders of linear predictive analyses are performed to provide predictive coding.
In sub-band coding, for example, after the signals are divided into plural frequency bands, the signals of each band are converted into signals on the frequency axis by an orthogonal transform, after which quantizing is carried out for each band. In effecting the orthogonal transform, the input audio signals may be grouped into blocks at an interval of a predetermined unit of time, and a discrete cosine transform (DCT) is carried out for each block to convert the signal on the time axis into a signal on the frequency axis. In carrying out the division into plural frequency bands, certain characteristics of the human auditory sense may be taken into account. Thus, the audio signal may be divided into plural frequency bands, for example, into 25 bands having a bandwidth that increases with increasing frequency. Such bands are known as critical bands. In sub-band coding, the number of quantizing bits accorded to each frequency band is dynamically or adaptively changed, to raise the amount of data compression while maintaining the number of bits per unit time, i.e., the bit rate, constant.
For example, when bit allocation is used, the DCT coefficients in each frequency band, resulting from the DCT processing operation carried out on each block, are quantized using a dynamically allocated number of bits.
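The block transform and per-band quantizing described above can be sketched as follows. This is a minimal illustration only, not the method of any particular coder: the function names `dct_ii` and `quantize_band`, the orthonormal scaling, and the choice of scaling each band to its own peak before uniform quantizing are all assumptions made for the example.

```python
import math

def dct_ii(block):
    """Naive O(N^2) orthonormal DCT-II of one time-domain block."""
    n = len(block)
    coeffs = []
    for k in range(n):
        acc = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                  for i, x in enumerate(block))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * acc)
    return coeffs

def quantize_band(coeffs, bits):
    """Quantize and reconstruct one band using `bits` bits per coefficient.

    Assumption: a symmetric uniform quantizer scaled to the band peak.
    Fewer than 2 bits cannot carry a signed value here, so the band is zeroed.
    """
    if bits < 2:
        return [0.0] * len(coeffs)
    peak = max(abs(c) for c in coeffs) or 1.0
    levels = 2 ** (bits - 1) - 1  # largest signed integer code
    return [round(c / peak * levels) * peak / levels for c in coeffs]
```

With this sketch, a band given more bits is reconstructed with a smaller worst-case error (at most half a quantizing step, that is, `peak / (2 * levels)`), which is the effect that dynamic bit allocation trades off between bands.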
To provide a greater degree of compression, techniques are used that take advantage of the masking effect, which takes into account certain characteristics of the human auditory sense. The masking effect is a phenomenon in which certain signals are masked, and hence rendered inaudible, by other signals. Thus noise below the masking level is allowable. The masking effect may be taken into account so that fewer quantizing bits are allocated to signal components below the allowable noise level, which reduces the bit rate.
If, with the above compression techniques, the input audio signal is divided into plural frequency ranges and an orthogonal transform, such as a DCT, is carried out in each frequency range (that is, if a frequency analysis is performed for each frequency range), the signal in each frequency range is divided into frames at an interval of a predetermined unit of time, and an orthogonal transform is effected for each frame in each frequency range.
Alternatively, the spectral coefficients (e.g., DCT coefficients) produced by the orthogonal transform are quantized and the number of quantizing bits is allocated on a frame-by-frame basis.
The input audio signal is not necessarily static and substantially free from fluctuations in level. The signal can behave dynamically in many ways. For example, the signal dynamics may change transiently, such that the peak level of the signal changes significantly within a frame. For example, a signal representing the sound of a percussion instrument can change in this way.
If an audio signal that changes from static to transient or vice versa is processed solely by orthogonally transforming a complete frame, and the resulting spectral coefficients are quantized, the quantizing may not be suited to the signal dynamics, so that the sound quality perceived by the listener after the compressed signal has been expanded and reproduced may not be optimum.
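The kind of transient change described above, in which the peak level changes significantly within a frame, can be illustrated with a simple check. This is a hypothetical sketch: the function names, the division of the frame into four sub-blocks, and the ratio threshold of 4.0 are assumptions chosen for the example and are not taken from the known techniques discussed here.

```python
def sub_block_peaks(frame, num_sub_blocks=4):
    """Peak absolute level of each equal-length sub-block of a frame."""
    if len(frame) < num_sub_blocks:
        raise ValueError("frame is shorter than the number of sub-blocks")
    size = len(frame) // num_sub_blocks
    return [max(abs(s) for s in frame[i * size:(i + 1) * size])
            for i in range(num_sub_blocks)]

def is_transient(frame, num_sub_blocks=4, ratio_threshold=4.0):
    """Flag a frame whose peak level varies strongly between sub-blocks.

    Assumption: the frame is treated as transient when the loudest
    sub-block peak is at least `ratio_threshold` times the quietest one.
    """
    peaks = sub_block_peaks(frame, num_sub_blocks)
    quietest = max(min(peaks), 1e-9)  # avoid dividing by zero on silence
    return max(peaks) / quietest >= ratio_threshold
```

A steady tone yields nearly equal sub-block peaks and is not flagged, whereas a frame that is quiet for most of its length and ends with a percussive attack is flagged.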
When quantizing is performed using the allowable noise level, the number of bits allocated for quantizing is determined based on the ratio (or difference) of the energy in the frequency band and the allowable noise level corresponding to the masking level resulting from the energy in the frequency band.
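The dependence of the bit allocation on the ratio of band energy to allowable noise level can be sketched as follows. This is an illustrative assumption, not the allocation rule of any specific coder: it uses the common rule of thumb that each additional bit of a uniform quantizer improves the signal-to-noise ratio by about 6.02 dB, and the function name `bits_for_band` and the cap `max_bits` are hypothetical.

```python
import math

DB_PER_BIT = 6.02  # approximate SNR gain per quantizing bit (rule of thumb)

def bits_for_band(band_energy, allowable_noise, max_bits=16):
    """Number of quantizing bits for one band (hypothetical rule).

    Allocates just enough bits to push the quantizing noise below the
    allowable noise level; bands already at or below that level get none.
    """
    if allowable_noise <= 0.0:
        raise ValueError("allowable_noise must be positive")
    if band_energy <= 0.0:
        return 0
    ratio_db = 10.0 * math.log10(band_energy / allowable_noise)
    if ratio_db <= 0.0:
        return 0  # band is masked: no bits needed
    return min(max_bits, math.ceil(ratio_db / DB_PER_BIT))
```

For example, a band whose energy is 20 dB above the allowable noise level receives 4 bits under this rule, while a band whose energy lies below the allowable noise level receives none.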
However, among audio signals, there are signals having the character of a single tone. Such signals are said to have high tonality. If a signal has high tonality, quantizing bit allocation based on the energy in the frequency bands cannot be calculated accurately. That is, the energy within a given frequency band may not change between when the signal is highly tonal and when it is not. In such a case, it is not desirable to base the quantizing bit allocation on the band energy despite the fact that the characteristics of the data are different between the frequency bands. Because an accurate bit allocation cannot be made for the high tonality signals, the sound quality is reduced. That is, despite the fact that a large number of bits are required to quantize a high tonality signal, the previously known techniques are unable to allocate the required number of bits to these signals if they calculate the number of bits based on the band energy. This leads to a deterioration in the signal quality.
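The difficulty described above can be illustrated numerically: two bands can carry the same energy while one is highly tonal and the other is not. The sketch below uses spectral flatness (the ratio of the geometric mean to the arithmetic mean of the coefficient powers) as a tonality indicator. The use of this particular measure, the function names, and the small `floor` value are assumptions made for the example; the techniques discussed here do not prescribe them.

```python
import math

def band_energy(coeffs):
    """Total energy of the spectral coefficients in one band."""
    return sum(c * c for c in coeffs)

def spectral_flatness(coeffs, floor=1e-12):
    """Geometric mean over arithmetic mean of the coefficient powers.

    Values near 1 indicate a flat, noise-like band; values near 0
    indicate energy concentrated in few coefficients (high tonality).
    `floor` keeps the logarithm defined for zero-valued coefficients.
    """
    powers = [max(c * c, floor) for c in coeffs]
    log_mean = sum(math.log(p) for p in powers) / len(powers)
    arithmetic_mean = sum(powers) / len(powers)
    return math.exp(log_mean) / arithmetic_mean

tonal_band = [4.0, 0.0, 0.0, 0.0]        # all energy in one coefficient
noise_like_band = [2.0, -2.0, 2.0, -2.0]  # energy spread evenly
```

Both example bands have an energy of 16, so an allocation based on band energy alone treats them identically, yet their flatness values differ by many orders of magnitude.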
In view of the above-described state of the art, it is an object of the present invention to provide an apparatus for compressing a digital input signal in which compression more adaptive to the properties or characteristics of the input audio signal may be achieved, and in which the compressed signal, after expansion and reproduction, is better adapted to the human auditory sense.
It is another object of the present invention to provide an apparatus for compressing a digital input signal in which satisfactory bit allocation may be achieved even with high tonality signals to improve the sound quality.