Many audio encoding technologies for encoding an audio signal to a small data size and then reproducing the audio signal from the coded bitstream are known. The international ISO/IEC 13818-7 (MPEG-2 AAC) standard in particular is known as a superior method enabling high audio quality playback with a small code size. This AAC coding method is also used in the more recent ISO/IEC 14496-3 (MPEG-4 Audio) system.
Audio coding methods such as AAC obtain a discrete audio signal by sampling the time-domain signal at specific time intervals, convert this discrete signal from the time domain to a signal in the frequency domain, split the converted frequency information into plural frequency bands, and then encode the signal by quantizing each of the frequency bands based on an appropriate allocation of data. For decoding, the frequency information is recreated from the code stream, and the playback sound is obtained by converting the frequency information to a time domain signal. If the amount of information supplied for encoding is small (such as in low bitrate encoding), the data size allocated to each of the segmented frequency bands in the coding process decreases, and some frequency bands may as a result contain no information. In this case the decoding process produces playback audio with no sound in the frequency component of the frequency band containing no information.
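The effect of a band receiving no information can be illustrated with a minimal sketch. The function name `quantize_bands`, its parameters, and the uniform quantizer are assumptions made for illustration only; this is not the quantizer used by AAC, and the coefficient values are invented.

```python
def quantize_bands(coeffs, band_edges, bits_per_band):
    """Quantize and reconstruct each frequency band with its own bit budget.

    coeffs        -- frequency-domain coefficients (list of floats)
    band_edges    -- list of (start, stop) index pairs, one per band
    bits_per_band -- bits allocated to each band (0 means nothing is sent)
    """
    decoded = []
    for (start, stop), bits in zip(band_edges, bits_per_band):
        band = coeffs[start:stop]
        peak = max((abs(x) for x in band), default=0.0)
        if bits == 0 or peak == 0.0:
            # No information for this band: the decoder reproduces silence.
            decoded.extend([0.0] * len(band))
            continue
        # Hypothetical uniform quantizer spanning [-peak, +peak].
        step = 2.0 * peak / (2 ** bits - 1)
        decoded.extend(round(x / step) * step for x in band)
    return decoded


# Illustrative values: the highest band is allocated zero bits.
spectrum = [1.0, -0.5, 0.25, 0.8, 0.3, -0.2]
bands = [(0, 2), (2, 4), (4, 6)]
reconstructed = quantize_bands(spectrum, bands, [8, 4, 0])
```

In this sketch the last two coefficients are reconstructed as zero, which corresponds to the missing frequency component described above.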
Because human hearing is less sensitive to sound with a frequency above approximately 10 kHz than to sound at lower frequencies, an audio coding scheme that distributes information by a process based on human auditory perception generally drops the high frequency component data, resulting in narrowband audio playback.
If data is supplied at a bitrate of approximately 96 kbps, the AAC method can code a 44.1 kHz stereo signal up to a bandwidth of approximately 16 kHz, but if data is supplied at half this rate, i.e., 48 kbps, the bandwidth that can be quantized and coded while maintaining sound quality is reduced to at most approximately 10 kHz. In addition to being narrowband, playback sound coded at a low bitrate of 48 kbps also sounds muffled.
A method enabling wideband playback by adding a small amount of additional information to a code stream for narrowband audio playback is described, for example, in the Digital Radio Mondiale (DRM) System Specification (ETSI TS 101 980) published by the European Telecommunications Standards Institute (ETSI). Similar technology known as SBR (spectral band replication) is described, for example, in AES (Audio Engineering Society) convention papers 5553, 5559, 5560 (112th Convention, 2002 May 10–13, Munich, Germany).
FIG. 2 is a schematic block diagram of an example of a decoder for band expansion using SBR. Input bitstream 206 is separated by the bitstream demultiplexer 201 into low frequency component information 207, high frequency component information 208, and sine wave-adding information 209. The low frequency component information 207 is, for example, information encoded using the MPEG-4 AAC or other coding method, and is decoded by the low-band decoder 202 whereby a time signal representing the low frequency component is generated. This time signal representing the low frequency component is separated into multiple (M) subbands by analysis filter bank 203 and input to high frequency signal generator 204.
The high frequency signal generator 204 compensates for the high frequency component lost due to bandwidth limiting by copying the low frequency subband signal representing the low frequency component to a high frequency subband. The high frequency component information 208 input to the high frequency signal generator 204 contains gain information for the compensated high frequency subband so that gain is adjusted for each generated high frequency subband.
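The copy-and-gain operation of the high frequency signal generator can be sketched as follows. This is a simplified illustration under stated assumptions: the function name `generate_high_band`, the cyclic mapping from low subbands to high subbands, and the sample values are hypothetical and are not taken from the SBR specification.

```python
def generate_high_band(low_subbands, num_high, gains):
    """Create high frequency subbands by copying low subbands and scaling them.

    low_subbands -- list of M subband signals, each a list of (complex) samples
    num_high     -- number of high frequency subbands to generate
    gains        -- one gain value per generated high frequency subband
    """
    m = len(low_subbands)
    high_subbands = []
    for k in range(num_high):
        # Assumed mapping: reuse the low subbands cyclically as copy sources.
        source = low_subbands[k % m]
        # Gain adjustment using the transmitted high frequency information.
        high_subbands.append([gains[k] * sample for sample in source])
    return high_subbands


# Two low subbands expanded into three high subbands (invented values).
low = [[1 + 1j, 2], [3, 4]]
high = generate_high_band(low, 3, [0.5, 0.25, 0.1])
```

Only the gain values need to be transmitted for the high band in this sketch; the spectral detail is taken from the decoded low band.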
An additional signal generator 211 generates injection signal 212 whereby a gain-controlled sine wave is added to each high frequency subband. The high frequency subband signal generated by the high frequency signal generator 204 is then input with the low frequency subband signal to the synthesis filter bank 205 for band synthesis, and output signal 210 is generated. The subband count on the synthesis filter bank side does not need to be the same as the number of subbands on the analysis filter bank side. For example, if the synthesis filter bank in FIG. 2 has N subbands and N=2M, the sampling frequency of the output signal will be twice the sampling frequency of the time signal input to the analysis filter bank.
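The relationship between the subband counts and the output sampling frequency can be written out as a short calculation. The function name `output_sample_rate` is hypothetical, the sketch assumes that every subband signal runs at the same rate in both filter banks, and the numeric values are chosen only for illustration.

```python
def output_sample_rate(input_rate_hz, analysis_bands, synthesis_bands):
    """Return the sampling frequency of the synthesis filter bank output.

    Assumes the analysis bank splits the input into `analysis_bands` subbands
    (M) and the synthesis bank combines `synthesis_bands` subbands (N), with
    every subband signal running at the same rate.
    """
    subband_rate_hz = input_rate_hz / analysis_bands
    return subband_rate_hz * synthesis_bands


# Illustrative case N = 2M: the output rate is twice the input rate.
rate = output_sample_rate(22050, 32, 64)
```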
In this configuration the information contained in the high frequency component information 208 or sine wave-adding information 209 relates only to gain control, and the amount of required information is therefore very small compared with the low frequency component information 207, which also contains spectral information. This method is therefore suited to encoding a wideband signal at a low bitrate.
The synthesis filter bank 205 in FIG. 2 is composed of filters that take both real number input and imaginary number input for each subband, and perform complex-valued calculations.
The decoder configured as above for band expansion has two filter banks, the analysis filter bank and the synthesis filter bank, that perform complex-valued calculations, and decoding therefore requires many calculations. A problem when the decoder is built as an LSI device, for example, is that power consumption increases and the playback time that is possible with a given power supply capacity decreases. Because the signals that we hear in the output from the synthesis filter bank are real-number signals, the synthesis filter bank may be configured with real-valued filter banks in order to reduce the calculations. While this reduces the number of calculations, if a sine wave is added using the same method as when the synthesis filter bank performs complex-valued calculations, a pure sine wave is not actually added and the intended result is not achieved in the reproduced audio.
The present invention is therefore directed to solving these problems of the prior art, and provides a decoding apparatus and method for a band expansion system that operates with few calculations by using a real-valued calculation filter bank, whereby the intended audio playback is achieved by making a slight change to the added sine wave generation signal that would be inserted into a complex-valued calculation filter bank.