1. Field of the Invention
The present invention relates to signal processing apparatuses and methods, and to transmission media and recording media. In particular, it relates to a signal processing apparatus and method that take as an input signal digital data encoded by so-called "highly efficient encoding", in which a band is divided into a plurality of bands, and that, in a decoding phase, combine the divided bands into one band before output. The present invention also relates to a transmission medium and a recording medium, both used in the signal processing apparatus and method.
2. Description of the Related Art
There are various techniques for the highly efficient encoding of an audio or sound signal. They include, for example, subband coding (SBC), a non-blocked frequency-band division in which a digital filter divides a time-domain sound signal into a plurality of frequency-band signals without forming blocks of the sound signal, and blocked frequency-band division, namely so-called "transform encoding", in which a time-domain signal is (spectrally) transformed into frequency-domain signals, the result is divided into a plurality of frequency bands, and each frequency band is encoded. In addition, a highly efficient encoding technique that combines the above-described subband coding and transform encoding has been proposed. In this technique, for example, the subband coding is used to divide a band, the signal of each band is spectrally transformed into frequency-domain signals, and each spectrally transformed band is encoded.
The above-described digital filter is, for example, a polyphase quadrature filter (PQF). An equal-bandwidth filter-dividing technique is described in Joseph H. Rothweiler, "Polyphase Quadrature Filters: A New Subband Coding Technique", ICASSP 83, Boston.
The above-described spectral transform is, for example, one in which an input sound signal is divided into blocks of a predetermined unit time (frames), and the discrete Fourier transform (DFT), discrete cosine transform (DCT), modified discrete cosine transform (MDCT), or the like is applied to each block to transform the time axis into a frequency axis. MDCT is described in J. P. Princen and A. B. Bradley (Univ. of Surrey; Royal Melbourne Inst. of Tech.), "Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation", ICASSP 1987.
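The blocked transform described above can be illustrated by the following minimal sketch of a direct-form MDCT with overlap-add resynthesis. The function names, the use of a sine window, and the 50% overlap between blocks are assumptions made for illustration only and are not taken from the cited reference; a practical decoder would use a fast algorithm rather than this O(N^2) form.

```python
import math

def sine_window(n_half):
    """Sine window of length 2*n_half; w[n]^2 + w[n + n_half]^2 == 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * n_half)) for n in range(2 * n_half)]

def mdct(block):
    """Transform 2N time samples into N frequency coefficients."""
    n_half = len(block) // 2
    shift = 0.5 + n_half / 2
    return [
        sum(block[n] * math.cos(math.pi / n_half * (n + shift) * (k + 0.5))
            for n in range(2 * n_half))
        for k in range(n_half)
    ]

def imdct(coeffs):
    """Transform N coefficients back into 2N time-aliased samples."""
    n_half = len(coeffs)
    shift = 0.5 + n_half / 2
    return [
        (2.0 / n_half) * sum(coeffs[k] * math.cos(math.pi / n_half * (n + shift) * (k + 0.5))
                             for k in range(n_half))
        for n in range(2 * n_half)
    ]

def analyze_and_resynthesize(signal, n_half):
    """Window, transform, inverse-transform, window again, and overlap-add."""
    window = sine_window(n_half)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n_half + 1, n_half):
        block = [signal[start + n] * window[n] for n in range(2 * n_half)]
        rebuilt = imdct(mdct(block))
        for n in range(2 * n_half):
            out[start + n] += rebuilt[n] * window[n]
    return out
```

Only samples covered by two overlapping blocks are reconstructed exactly, because the time-domain aliasing of one block is cancelled by that of its neighbour; the first and last half-blocks of the signal remain aliased in this sketch.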
When signals that have been divided into bands by a digital filter and spectral transform are quantized, the bands in which quantization noise arises can be controlled, and exploiting auditory characteristics such as masking enables encoding that is perceptually more efficient. In addition, normalizing the signal components in each band by their maximum absolute value before quantization makes the encoding still more efficient.
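The normalization followed by quantization described above may be sketched as follows. The function names, the uniform symmetric quantizer, and the convention that one bit carries the sign are assumptions for illustration, not details given in this description.

```python
def quantize_band(coeffs, bits):
    """Normalize a band by its maximum absolute value, then quantize uniformly."""
    scale = max(abs(c) for c in coeffs)
    levels = (1 << (bits - 1)) - 1 if bits >= 2 else 0  # assumed: one bit for the sign
    if scale == 0.0 or levels == 0:
        return scale, [0] * len(coeffs)
    codes = [round(c / scale * levels) for c in coeffs]
    return scale, codes

def dequantize_band(scale, codes, bits):
    """Rebuild approximate coefficients from the scale factor and integer codes."""
    levels = (1 << (bits - 1)) - 1 if bits >= 2 else 0
    if levels == 0:
        return [0.0] * len(codes)
    return [scale * q / levels for q in codes]
```

Because every normalized value lies in the range from -1 to 1, the full range of the quantizer is used in each band regardless of the absolute signal level, and the reconstruction error per coefficient is bounded by half a quantization step.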
The bandwidths used when quantizing the frequency components of each divided band can be chosen in consideration of human auditory characteristics. For example, a sound signal may be divided into a plurality of bands (e.g., 25 bands) whose bandwidth widens as the frequency rises. Each band is then encoded using either a predetermined bit allocation or an adaptive bit allocation.
For example, when coefficient data obtained by MDCT are encoded with bit allocation, the MDCT coefficient data obtained for each band are encoded using an adaptively allocated number of bits. The following two bit allocation techniques are known.
A technique for bit allocation based on the magnitude of the signal in each band is disclosed in IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-25, No. 4, August 1977. With this technique, the quantization noise spectrum is flat and the noise energy is minimized. However, because auditory masking is not exploited, the noise actually perceived by a listener is not minimized.
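A magnitude-based allocation of this kind can be sketched with the classical rule that gives each band the average number of bits plus half the base-2 logarithm of its energy relative to the geometric mean of all band energies. The rule is used here as an assumed example of the technique and is not quoted from the cited reference; the function name is hypothetical, and the real-valued results would still need rounding and clamping to non-negative integers in practice.

```python
import math

def magnitude_based_allocation(band_energies, average_bits):
    """Return real-valued bits per band that flatten the quantization noise.

    Assumes every band energy is strictly positive.
    """
    log_energies = [math.log2(e) for e in band_energies]
    mean_log = sum(log_energies) / len(log_energies)
    return [average_bits + 0.5 * (le - mean_log) for le in log_energies]
```

Under the usual model in which each additional bit lowers the noise power in a band by a factor of four, this allocation makes the product of band energy and 2^(-2b) identical in every band, which corresponds to the flat noise spectrum mentioned above.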
A technique for performing fixed bit allocation, in which auditory masking is used to obtain the required signal-to-noise ratio for each band, is disclosed in M. A. Krasner (MIT), "The critical band coder: digital encoding of the perceptual requirements of the auditory system", ICASSP 1980. With this technique, because the bit allocation is fixed, the measured characteristics are not very good when they are measured using a sine-wave input.
To solve these problems, a highly efficient encoder has been proposed in which all bits available for bit allocation are divided between a predetermined fixed bit allocation pattern for each small block and a bit allocation that depends on the magnitude of the signal in each block. The division ratio is made dependent on a signal related to the input signal: the smoother the signal spectrum, the larger the share given to the fixed bit allocation pattern.
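The division of the bit budget described above may be sketched as follows. The smoothness measure (the ratio of the geometric mean to the arithmetic mean of the band energies), the logarithmic weighting of the signal-dependent part, and all function names are assumptions chosen for illustration; the proposed encoder may define them differently.

```python
import math

def spectral_flatness(band_energies):
    """Smoothness in (0, 1]: geometric mean over arithmetic mean (assumed measure)."""
    n = len(band_energies)
    geometric = math.exp(sum(math.log(e) for e in band_energies) / n)
    arithmetic = sum(band_energies) / n
    return geometric / arithmetic

def split_allocation(band_energies, fixed_pattern, total_bits):
    """Blend a fixed allocation pattern with a signal-dependent allocation."""
    fixed_share = spectral_flatness(band_energies)   # smoother spectrum, larger fixed share
    fixed_total = sum(fixed_pattern)
    weights = [math.log2(1.0 + e) for e in band_energies]  # assumed magnitude weighting
    weight_total = sum(weights)
    return [
        total_bits * (fixed_share * p / fixed_total
                      + (1.0 - fixed_share) * w / weight_total)
        for p, w in zip(fixed_pattern, weights)
    ]
```

For a flat spectrum the result follows the fixed pattern, whereas for a spectrum dominated by one band most of the budget moves to that band, while the total number of bits stays the same in both cases.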
With this encoder, when energy is concentrated in a particular spectral component, as with a sine-wave input, allocating many bits to the block containing that component remarkably improves the overall signal-to-noise ratio. In general, human hearing is extremely sensitive to signals having steep spectral components. Therefore, using this encoding technique to improve the signal-to-noise ratio improves not only the measured values but also the perceived sound quality.
In addition to this technique, many other bit allocation techniques have been proposed. As auditory models become more detailed and encoder capability improves, these techniques enable encoding that is perceptually more efficient.
Decoding highly-efficient-encoded data requires a considerable number of operations compared with the case where sampled values of a signal are simply used. In particular, when the signal to be treated includes a high-band component and is distributed over a wide band, the number of spectral components to be treated increases and the sampling frequency increases, so that the number of operations required per unit time increases. Accordingly, for real-time reproduction, in which decoded waveform signals are reproduced while a predetermined amount of sound signal is being decoded, the processor used for decoding needs an extremely high arithmetic capability. If a processor having a low arithmetic capability is used, decoding cannot keep up with real-time reproduction, and the continuity of the reproduced sound cannot be maintained.
In addition, when the above-described decoding is performed in a multitasking personal computer or the like, the processor also spends time on tasks other than the decoding. The time available to the processor for decoding per unit time is therefore reduced, which hinders real-time reproduction in the same way as in the above-described case.
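The two cases above amount to comparing the operations required per second with the operations the processor can actually devote to decoding. The following is a minimal sketch of that comparison; the function name, its parameters, and any figures passed to it are hypothetical and are not measurements of any particular decoder or processor.

```python
def realtime_load(ops_per_frame, samples_per_frame, sample_rate_hz,
                  processor_ops_per_sec, decoder_cpu_share=1.0):
    """Return required/available operations per second; above 1.0 means dropouts.

    decoder_cpu_share is the assumed fraction of processor time left for decoding
    after other tasks on a multitasking system have been served.
    """
    frames_per_sec = sample_rate_hz / samples_per_frame
    required = ops_per_frame * frames_per_sec
    available = processor_ops_per_sec * decoder_cpu_share
    return required / available
```

A higher sampling frequency raises the number of frames per second, and a smaller share of processor time lowers the available operations; either change can push the ratio above 1.0, at which point decoding no longer keeps up with reproduction.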