In recent years, a variety of audio compression methods have been developed. MPEG-2 Advanced Audio Coding (AAC) is one such compression method, and is defined in detail in “ISO/IEC 13818-7 (MPEG-2 Advanced Audio Coding, AAC)”.
First, the conventional encoding and decoding procedures will be described below using FIG. 1. FIG. 1 is a block diagram showing a configuration of an encoding device 300 and a decoding device 400 according to the conventional MPEG-2 AAC method. The encoding device 300 is a device that compresses and encodes an inputted audio signal based on MPEG-2 AAC, and includes an audio signal input unit 310, a transforming unit 320, a quantizing unit 331, an encoding unit 332 and a stream output unit 340.
The audio signal input unit 310 divides the digital audio data of an input signal into sets of 1,024 contiguous samples at a sampling frequency of 44.1 kHz, for instance. This encoding unit of 1,024 samples is called a “frame”.
The transforming unit 320 performs a Modified Discrete Cosine Transform (MDCT) that converts the time-domain sample data divided by the audio signal input unit 310 into spectral data in the frequency domain. The 1,024 samples of spectral data obtained by this transform are then divided into a plurality of groups, each of which is set so as to include the spectral data of one or more samples. Each group approximates a critical band of human hearing, and is called a “scale factor band”.
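The MDCT performed by the transforming unit 320 can be illustrated with a minimal Python sketch of the textbook MDCT definition, which maps a 2N-sample frame to N spectral coefficients. The function name is illustrative; practical encoders apply a window to overlapped frames and use an FFT-based fast algorithm rather than this direct O(N²) evaluation.

```python
import numpy as np

N = 1024  # spectral lines per frame, as in the MPEG-2 AAC long window


def mdct(frame):
    """MDCT of a 2*N-sample frame -> N spectral coefficients.

    Direct O(N^2) evaluation of the MDCT definition:
    X[k] = sum_n frame[n] * cos(pi/N * (n + 1/2 + N/2) * (k + 1/2)).
    Real codecs window the frame and use an FFT-based fast transform.
    """
    n = np.arange(2 * N)
    k = np.arange(N)[:, None]
    return np.sum(frame * np.cos(np.pi / N * (n + 0.5 + N / 2) * (k + 0.5)),
                  axis=1)
```

Because the MDCT is linear and critically sampled, 2,048 overlapped time samples yield exactly 1,024 spectral values per frame.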
The quantizing unit 331 quantizes the spectral data produced by the transforming unit 320 into a predetermined number of bits. According to MPEG-2 AAC, the quantizing unit 331 quantizes the spectral data in each scale factor band using one normalizing factor per scale factor band. This normalizing factor is called a “scale factor”, and the result of quantizing each piece of spectral data with its scale factor is called a “quantized value”. The encoding unit 332 encodes both the scale factors determined by the quantizing unit 331 and the quantized values of the spectral data in accordance with Huffman coding. For the scale factors, the encoding unit 332 first calculates the differential between the scale factors of every two contiguous scale factor bands in one frame, and then encodes these differentials, together with the scale factor of the first scale factor band, in accordance with Huffman coding.
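The quantization and scale factor handling described above can be sketched in Python. The function names are illustrative assumptions; the quantizer follows the non-uniform 3/4-power shape of the MPEG-2 AAC quantizer, and the differential routine shows only the difference computation, omitting the actual Huffman tables.

```python
import numpy as np


def quantize(spectrum, scale_factor):
    """Quantize spectral data in one scale factor band (simplified).

    AAC-style non-uniform quantizer: the magnitude is scaled by
    2^(-scale_factor/4) and raised to the 3/4 power; floor(m + 0.4054)
    matches the standard's round(m - 0.0946) rounding rule.
    """
    magnitude = np.abs(spectrum) * 2.0 ** (-scale_factor / 4)
    return np.sign(spectrum) * np.floor(magnitude ** 0.75 + 0.4054)


def scale_factor_differentials(scale_factors):
    """Differences between scale factors of contiguous bands.

    The first scale factor is sent as-is; each following one is coded
    as the difference to the previous band (Huffman tables omitted).
    """
    first = scale_factors[0]
    diffs = [b - a for a, b in zip(scale_factors, scale_factors[1:])]
    return first, diffs
```

Coding differences rather than absolute values pays off because the scale factors of neighboring bands tend to be close, so small differentials receive short Huffman codes.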
The stream output unit 340 transforms the encoded signal produced by the encoding unit 332 into an MPEG-2 AAC bit stream and outputs it. The bit stream outputted from the encoding device 300 is transmitted to the decoding device 400 via a transmission medium, or recorded on a recording medium such as an optical disc including a compact disc (CD) and a digital versatile disc (DVD), a semiconductor memory, or a hard disk.
The decoding device 400 is a device that decodes the bit stream encoded by the encoding device 300, and includes a stream input unit 410, a decoding unit 421, a dequantizing unit 422, an inverse-transforming unit 430 and an audio signal output unit 440.
The stream input unit 410 receives the bit stream encoded by the encoding device 300 via a transmission medium or via a recording medium, and reads out the encoded signal from the received bit stream. The decoding unit 421 then decodes the read-out encoded signal to produce a quantized value.
The dequantizing unit 422 dequantizes the quantized values decoded by the decoding unit 421; in MPEG-2 AAC, the decoding unit 421 decodes data that has been encoded in accordance with Huffman coding. The inverse-transforming unit 430 transforms the frequency-domain spectral data produced by the dequantizing unit 422 into time-domain sample data, which in MPEG-2 AAC is performed by the Inverse Modified Discrete Cosine Transform (IMDCT). The audio signal output unit 440 combines the sets of time-domain sample data produced by the inverse-transforming unit 430 in sequence, and outputs them as digital audio data.
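The dequantization performed by the dequantizing unit 422 inverts the encoder's non-uniform quantizer. A minimal sketch, assuming the AAC-style 3/4-power quantizer (function name illustrative):

```python
import numpy as np


def dequantize(q, scale_factor):
    """Invert the AAC-style non-uniform quantizer (illustrative).

    The magnitude is raised to the 4/3 power and rescaled by
    2^(scale_factor/4), restoring an approximation of the original
    spectral value; the rounding applied at the encoder is lost.
    """
    return np.sign(q) * np.abs(q) ** (4.0 / 3.0) * 2.0 ** (scale_factor / 4)
```

The reconstruction is only approximate: quantization rounds the 3/4-power magnitude to an integer, and that rounding error is the quantization noise referred to later in this description.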
In actual MPEG-2 AAC encoding, other techniques are additionally used, including gain control, Temporal Noise Shaping (TNS), a psychoacoustic model, M/S (Mid/Side) stereo, intensity stereo, prediction, and a bit reservoir.
The quality of the audio data encoded according to the above-mentioned method can be measured, for instance, by the reproduction band of the audio data after encoding. When an input signal is sampled at a 44.1-kHz sampling frequency, for instance, the reproduction band of this signal is 22.05 kHz. When an audio signal with the 22.05-kHz reproduction band, or a reproduction band close to 22.05 kHz, is encoded into encoded audio data without degradation, and the data amount fits the available transmission rate, then this audio data can be reproduced as high-quality sound. The width of the reproduction band, however, affects the number of spectral data values, which in turn affects the data amount for transmission. For instance, when an input signal is sampled at the sampling frequency of 44.1 kHz, the spectral data generated from this signal is composed of 1,024 samples covering the 22.05-kHz reproduction band. In order to secure the 22.05-kHz reproduction band, all 1,024 samples of the spectral data need to be transmitted.
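The relationship between sampling frequency, reproduction band, and spectral line count stated above can be confirmed with simple arithmetic; the per-line frequency spacing of roughly 21.5 Hz is derived here, not stated in the text.

```python
sampling_rate = 44_100                 # Hz, sampling frequency of the input
spectral_lines = 1_024                 # MDCT coefficients per frame
reproduction_band = sampling_rate / 2  # Nyquist limit: 22,050 Hz
hz_per_line = reproduction_band / spectral_lines  # frequency span of one line
```

Dropping spectral lines therefore narrows the reproduction band in steps of about 21.5 Hz per line, which is why transmitting fewer lines directly trades bandwidth for bit rate.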
It is not realistic, however, to transmit as many as 1,024 samples of spectral data over a low-rate transmission channel such as that of cell phones. That is to say, when all the spectral data covering a wide reproduction band is transmitted at such a low transmission rate, with the size of the entire spectral data adjusted to fit that rate, the data size assigned to each frequency band becomes extremely small. This intensifies the effect of quantization noise, so that sound quality deteriorates through encoding.
In order to prevent such degradation, many audio signal encoding methods, including MPEG-2 AAC, achieve efficient audio signal transmission by assigning weights to the values of the spectral data and not transmitting low-weighted values. With this method, a sufficient data size is assigned to the spectral data in a lower frequency band, which is important to human hearing, so as to enhance its encoding accuracy, while the spectral data in a higher frequency band is regarded as less important and is unlikely to be transmitted.
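The idea of dropping the less important high-frequency spectral data can be illustrated with a short sketch. The function name and the hard frequency cutoff are illustrative assumptions; real encoders weight scale factor bands with a psychoacoustic model rather than applying a fixed cutoff.

```python
import numpy as np


def band_limit(spectrum, sampling_rate, cutoff_hz):
    """Zero out spectral lines above a cutoff frequency (illustrative).

    With len(spectrum) lines spanning 0 .. sampling_rate/2 Hz, line k
    covers roughly k * (sampling_rate/2) / len(spectrum) Hz; lines
    above the cutoff are dropped (not transmitted), leaving more bits
    for the perceptually important low-frequency lines.
    """
    n = len(spectrum)
    keep = int(cutoff_hz / (sampling_rate / 2) * n)
    limited = spectrum.copy()
    limited[keep:] = 0.0
    return limited
```

For example, limiting a 44.1-kHz-sampled frame to 11.025 kHz keeps only the lower half of its 1,024 spectral lines, halving the data to be quantized and transmitted at the cost of the reproduction band.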
Although such techniques are used in MPEG-2 AAC, audio encoding technology that achieves higher-quality reproduction and more efficient compression is now required. In other words, there is an increasing demand for technology capable of transmitting an audio signal in a higher frequency band, as well as a lower frequency band, at a low transmission rate.
The object of the present invention is to provide an encoding device and a decoding device that can realize encoding and decoding of an audio signal so as to reproduce high-quality sound without substantially increasing the amount of encoded data.