1. Field of the Invention
The present invention relates to an audio decoder for expanding audio data compressed by using a data compressing technique and, more particularly, to an audio decoder for expanding, for example, compressed audio data that is transmitted through a transmission path or reproduced from a recording medium.
2. Description of the Related Art
Hitherto, various methods for highly efficient encoding of an audio signal have been known. For example, there is a method whereby a time-base to frequency-base conversion converts the signal from the time domain into the frequency domain, and a data compression adapted to the human auditory sense is executed in each frequency band. As such a method for the time-base to frequency-base conversion, for example, a method using a sub-band filter or an MDCT (Modified Discrete Cosine Transform) can be mentioned.
Outlines of the sub-band filter encoding method and MDCT encoding method have been disclosed in, for example, "Advances in Speech Signal Processing", edited by Furui & Sondhi, published by Marcel Dekker, Inc. (New York), pages 109-140, 1991. An audio encoding method using the time-base to frequency-base conversion based on the MDCT encoding method will now be described hereinbelow.
FIG. 1 shows an example of a construction of an encoder using the MDCT encoding method. Encoding processes in the encoder will be described hereinbelow.
A digital audio signal, inputted through an input terminal 51, is converted from a signal in a time domain to a signal in a frequency domain at every certain time interval by an MDCT circuit 41. A data length corresponding to the time interval is called a conversion block length.
Audio data in the frequency domain outputted from the MDCT circuit 41 is quantized (variable length encoded) by a quantizing circuit 42. After that, header information such as sampling frequency or the like is added to the quantized audio data by a multiplexing circuit 43, and the resultant data is outputted as encoded data from an output terminal 52.
A time-base to frequency-base converting process which is executed in the MDCT circuit 41 is described by the following equation (1):

y(m) = Σ(k=0 to N-1) w(k)·x(k)·cos{π(2k+1+n)(2m+1)/(2N)},  m = 0, 1, . . . , N/2-1   (1)

where,
x(k): input signal
w(k): window function
y(m): signal subjected to the MDCT process
N: conversion block length (samples)
n=N/2: phase term
The window function w(k) is used to prevent the generation of a discontinuous signal at a boundary portion of two adjacent conversion blocks. An example of the shape of such a window function is shown in FIG. 2. In the equation (1), since the number of input signals x(k) to the MDCT circuit 41 is equal to N and m takes a value within a range from 0 to (N/2-1), the number of signals y(m) subjected to the MDCT is equal to N/2.
In the MDCT process, when the next block is converted after the present block, the conversion is executed at a point shifted by N/2 samples from the present block. Namely, the MDCT is continuously executed on the conversion blocks in such a form that two adjacent blocks overlap by N/2 samples. This is intended to prevent the occurrence of a discontinuous signal at a boundary portion of the conversion blocks. Such a situation is shown in FIG. 3. FIG. 3 shows an example in which the conversion block length N is equal to 512 samples.
In the example of FIG. 3, the audio data has been divided into sub-blocks of 256 samples for explanation. First, the window function is applied so as to cover a sub-block 0 and a sub-block 1 and the MDCT is executed. Subsequently, the window is shifted by 256 samples, the window function is applied so as to cover the sub-block 1 and a sub-block 2, and the MDCT is executed. As mentioned above, in the example of FIG. 3, the MDCT of each block of 512 samples is continuously performed with an overlap of 256 samples.
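The sliding of the conversion window by 256 samples can be sketched as follows (a minimal Python illustration of the framing of FIG. 3; the variable names are not part of the encoder of FIG. 1):

```python
import numpy as np

N = 512          # conversion block length
HOP = N // 2     # adjacent blocks overlap by N/2 = 256 samples

# audio data divided into sub-blocks of 256 samples, as in FIG. 3
x = np.arange(4 * HOP)          # sub-blocks 0..3, 1024 samples in all

# starting positions of the overlapping conversion blocks
starts = list(range(0, len(x) - N + 1, HOP))
blocks = [x[s:s + N] for s in starts]

# block 0 covers sub-blocks 0 and 1, block 1 covers sub-blocks 1 and 2, ...
```

Each conversion block shares its latter 256 samples with the former 256 samples of the next block, which is the overlap added back on the decoding side.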
When the conversion block length in the MDCT circuit 41 is constant, the above-mentioned process is executed. However, a method of improving the encoding efficiency by changing the conversion block length in accordance with the characteristics of the input signal is also known. As an example of such a variable-block-length MDCT encoding method, MPEG audio layer III, standardized as ISO/IEC 11172-3 by the International Organization for Standardization, can be mentioned.
According to such an MDCT method of variable block length, the conversion block length is changed with the lapse of time in accordance with the characteristics of the input signal. Namely, when the characteristics of the input signal are stationary, the encoding process is executed by using a long conversion block length. On the other hand, in the case where the characteristics of the input signal suddenly change when, for example, a pulse-like signal is inputted, the encoding process is performed by using a short conversion block length.
Although there are various methods of changing the conversion block length, a method is often used in which, when the long conversion block length is equal to an integer N, the short conversion block length is set to N/L (L=2, 3, . . . ).
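One simple way to realize such a decision is to compare the short-term energy of the two halves of a candidate block; the following sketch is purely illustrative (the threshold and the energy measure are assumptions, not taken from any particular encoder or standard):

```python
import numpy as np

def decide_block_length(x, n_long=512, l=2, threshold=8.0):
    """Choose the long length N for a stationary input and the short
    length N/L when the signal energy changes suddenly (illustrative)."""
    half = len(x) // 2
    e0 = np.sum(x[:half] ** 2) + 1e-12   # energy of the former half
    e1 = np.sum(x[half:] ** 2) + 1e-12   # energy of the latter half
    ratio = max(e0, e1) / min(e0, e1)
    return n_long // l if ratio > threshold else n_long

stationary = np.sin(0.1 * np.arange(512))   # stationary input -> long block
pulse = np.zeros(512)
pulse[400] = 1.0                            # pulse-like input -> short block
```

A stationary signal has nearly equal energy in both halves and keeps the long block length, while a pulse-like signal triggers the short block length.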
As to the encoder of FIG. 1, in a conversion block length deciding circuit 44, the conversion block length is decided in accordance with the characteristics of the input audio signal and the decided conversion block length information is supplied to the MDCT circuit 41 and multiplexing circuit 43. The MDCT circuit 41 executes the MDCT process while changing the conversion block length with the lapse of time in accordance with the conversion block length information that is inputted from the conversion block length deciding circuit 44.
When the conversion block length information decided by the conversion block length deciding circuit 44 is sent to the multiplexing circuit 43, the conversion block length information and the quantization data are multiplexed by the multiplexing circuit 43 and the resultant data is outputted as encoded data from the output terminal 52.
As a digital audio signal which is inputted from the input terminal 51, audio signals of a plurality of channels can be treated. For example, when the user inputs audio signals of five channels which are used in a movie, the processes in the MDCT circuit 41, quantizing circuit 42, and conversion block length deciding circuit 44, are respectively independently executed for the five channels. After the data of five channels is multiplexed by the multiplexing circuit 43, the resultant data is outputted as encoded data.
In this case, by performing the encoding process by using a correlation among the channels, the sound quality at the same bit rate can be improved. As such processing methods, a differential method and a coupling method are known. The differential method is mainly used for sub-band information (data in each conversion block) of a low frequency band. By obtaining the sum and the difference of the sub-band information of two channels, the information is concentrated into one channel upon encoding, thereby enabling the information amount to be reduced.
On the other hand, the coupling method is mainly used for sub-band information of a high frequency band, and by sharing a value of a real sample among a plurality of channels, an information amount can be reduced. In the high frequency band, power or sound pressure is relatively small and even if one sub-band information is shared among a plurality of channels, no problem occurs with the sense of hearing. Therefore, if there is a similar portion in the sub-band information of the high frequency band in each channel, by sharing the information of such a similar portion, the information amount is reduced.
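The sum/difference idea of the differential method can be sketched as follows (a minimal illustration; the function names are hypothetical). When two channels are similar, the sum carries almost all of the energy and the difference is nearly zero, so fewer bits suffice for the difference:

```python
import numpy as np

def encode_diff(left, right):
    """Sum/difference encoding of the sub-band information of two channels."""
    return left + right, left - right

def decode_diff(s, d):
    """Recover the original two channels from the sum and the difference."""
    return (s + d) / 2, (s - d) / 2

l = np.array([1.0, 2.0, 3.0])
r = np.array([1.1, 2.1, 2.9])   # similar to l, so the difference is small
s, d = encode_diff(l, r)
l2, r2 = decode_diff(s, d)
```

The decoding side recovers the two channels exactly, while the difference signal d stays much smaller than the sum signal s.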
FIG. 4 shows an example of an audio decoder according to the conventional MDCT method.
A conventional decoding technique for decoding audio data of a plurality of channels will now be described hereinbelow with reference to FIG. 4.
In FIG. 4, encoded audio data is inputted to a demultiplexing circuit 31 through an input terminal 21. In the demultiplexing circuit 31, the input encoded data is separated into multiplexed audio data of a plurality of channels and conversion block length information.
The audio data of each channel outputted by the demultiplexing circuit 31 is subjected to an inverse quantizing process for every channel by an inverse quantizing circuit 32. The processing result is inputted to an IMDCT (Inverse MDCT) circuit 33. In the inverse quantizing process, the bit length of each sample data which was variable length encoded is obtained and each sample value is identified. The conversion block length information separated by the demultiplexing circuit 31 is also inputted to the IMDCT circuit 33. The IMDCT circuit 33 executes an IMDCT process for every channel on the basis of the inputted conversion block length information.
A frequency-base to time-base converting process which is executed by the IMDCT circuit 33 is described by the following equation (2):

x(k) = (4/N)·Σ(m=0 to N/2-1) y(m)·cos{π(2k+1+n)(2m+1)/(2N)},  k = 0, 1, . . . , N-1   (2)

where,
x(k): signal subjected to the IMDCT process
y(m): signal subjected to the MDCT process
N: conversion block length
n=N/2: phase term
The number of signals x(k) subjected to the IMDCT process is equal to N and the number of signals y(m) subjected to the MDCT process is equal to N/2.
The signals subjected to the IMDCT process on the basis of the equation (2) are temporarily stored into a delay buffer 34, and a window applying arithmetic operation is then performed by a window applying operating circuit 35. The window applying operating circuit 35 applies a window function of the same shape as that used in the MDCT process (an example is shown in FIG. 2), and further adds the data in the overlap portions between the former half portion of the present block and the latter half portion of the previous block, thereby reconstructing the audio signal. This is because the data was converted with an overlap of N/2 samples when the MDCT process was executed, so that aliasing occurs unless the addition is performed.
FIG. 5 shows a state of the overlap at that time. In this example, first, the portions where 256 samples overlap between a block 0 and a block 1, each having 512 samples, are added and the audio signal of 256 samples is reconstructed. Subsequently, the portions where 256 samples overlap between the block 1 and a block 2 are added and the audio signal of the next 256 samples is reconstructed. In a manner similar to the above, the audio signal is reconstructed 256 samples at a time.
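The cancellation of the aliasing by the overlap addition can be checked numerically. The following Python sketch implements the MDCT and IMDCT directly with the phase term n = N/2; the sine window and the scaling factor 4/N are one common choice satisfying the perfect-reconstruction condition and are assumptions here, not taken from this decoder:

```python
import numpy as np

def mdct(x, w):
    """N windowed samples -> N/2 coefficients, phase term n = N/2."""
    N = len(x); M = N // 2
    k = np.arange(N); m = np.arange(M)
    c = np.cos(np.pi * (2 * k[None, :] + 1 + M) * (2 * m[:, None] + 1) / (2 * N))
    return c @ (w * x)

def imdct(y, w):
    """N/2 coefficients -> N windowed samples, scaling factor 4/N."""
    M = len(y); N = 2 * M
    k = np.arange(N); m = np.arange(M)
    c = np.cos(np.pi * (2 * k[:, None] + 1 + M) * (2 * m[None, :] + 1) / (2 * N))
    return (4.0 / N) * w * (c @ y)

N, M = 512, 256
w = np.sin(np.pi * (np.arange(N) + 0.5) / N)   # satisfies w[k]^2 + w[k+M]^2 = 1

rng = np.random.default_rng(0)
x = rng.standard_normal(4 * M)

out = np.zeros(len(x))
delay = np.zeros(M)                 # plays the role of the delay buffer 34
for start in range(0, len(x) - N + 1, M):
    block = imdct(mdct(x[start:start + N], w), w)
    out[start:start + M] = delay + block[:M]   # overlap addition
    delay = block[M:]
# every interior sample is reconstructed exactly; the first and last
# half-blocks lack an overlap partner and are not fully reconstructed
```

Without the overlap addition each IMDCT output contains a time-domain aliased component; it is only the sum of the two overlapping windowed blocks that restores the original samples.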
When the audio data of a plurality of channels has been encoded and the speaker system on the decoding side has fewer channels than the number of encoded channels, a down mixing process is executed in some cases. The down mixing process produces audio data of a smaller number of channels from the audio data of the plurality of channels. An example of such a process is described by the following equation (3):

y[n] = Σ(ch=0 to M-1) α[ch]·x[ch][n]   (3)

where,
x[ch][n]: input signal corresponding to the channel ch
y[n]: signal of one channel subjected to the down mixing
α[ch]: coefficient for the down mixing corresponding to the channel ch
M: the number of target channels to be subjected to the down mixing
For example, although an audio signal used in a movie or the like is constructed in some cases by encoding audio data of five channels, an audio apparatus for home use can usually generate audio signals of only two channels. In such a case, the down mixing process shown in the equation (3) is executed twice (once for each output channel) by a down mixing circuit 36, thereby producing the audio signals of two channels from the audio signals of five channels.
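Applying the equation (3) once per output channel folds the five channels into two. In the following sketch the coefficients (1 for the corresponding front channel, about 0.707 for the shared channels) are only illustrative assumptions:

```python
import numpy as np

def downmix(channels, alpha):
    """Equation (3): weighted sum of M input channels into one output channel."""
    return sum(a * x for a, x in zip(alpha, channels))

n = 256
lch, cch, rch, lsch, rsch = (np.full(n, v) for v in (1.0, 2.0, 3.0, 4.0, 5.0))

k = 0.707
left_out = downmix([lch, cch, lsch], [1.0, k, k])    # Lch + 0.707(Cch + LSch)
right_out = downmix([rch, cch, rsch], [1.0, k, k])   # Rch + 0.707(Cch + RSch)
```

Each call to downmix is one evaluation of the equation (3); executing it twice yields the two-channel output.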
When the encoded audio data of all channels can be generated on the decoding side, there is no need to perform the down mixing process. In such a case, no process is executed in the down mixing circuit 36 and the audio data which was windowed by the window applying operating circuit 35 is outputted as it is from an output terminal 22.
FIG. 6 is a hardware constructional diagram showing in further detail the conventional audio decoder of FIG. 4 with consideration of the memory capacities.
FIG. 6 shows an example in which audio data of five channels is treated. A memory capacity when the conversion block length is set to 512 points is also shown. It is now assumed that the five channels are a left channel Lch, a center channel Cch, a right channel Rch, a rear left channel LSch, and a rear right channel RSch.
In FIG. 6, the audio data of each channel which was inversely quantized through the processes in the demultiplexing circuit 31 and inverse quantizing circuit 32 is stored into an inverse quantizing buffer 37. The arithmetic operation of the equation (2) is executed for every channel by the IMDCT circuit (frequency base to time base converting circuit) 33 on the inversely quantized data of each channel stored in the inverse quantizing buffer 37. The arithmetic operation result is stored into a time base information buffer 38.
The audio data of each channel stored in the time base information buffer 38 is supplied to the window applying operating circuit (adding and window applying circuit) 35. After the window applying arithmetic operation is executed in the window applying operating circuit 35, the data of the former half portion of the present block and the data of the latter half portion of the previous block stored in the delay buffer 34 are added so as to overlap.
The resultant data of the overlap addition from the window applying operating circuit 35 is stored into a PCM buffer 39. The data of the latter half portion of the present block is stored into the delay buffer 34 after completion of the window applying operation and is used for the overlap addition to the next block.
When the down mixing process is necessary, the audio data of each channel is read out from the PCM buffer 39 and the down mixing process shown in the equation (3) is executed by the down mixing circuit 36. The resultant data of the down mixing process is outputted through the output terminal 22.
As shown in FIG. 6, in the conventional audio decoder, it is necessary to provide the buffer memories such as the inverse quantizing buffer 37, time base information buffer 38, delay buffer 34, and PCM buffer 39. A memory capacity of at least (256 × 5) words is necessary for each of the inverse quantizing buffer 37, time base information buffer 38, and delay buffer 34. A memory capacity of at least (256 × 10) words is necessary for the PCM buffer 39.
The reason why the memory capacity of (256 × 10) words is necessary for the PCM buffer 39 is as follows. Generally, in audio equipment, it is required to output the PCM data at a constant rate. To satisfy such a requirement, it is necessary to use double buffers, one of which stores the data just after completion of the arithmetic operation by the window applying operating circuit 35 and the other of which outputs the data at a constant rate, and to operate the double buffers in a pipeline manner. For this purpose, the memory capacity of (256 × 5 × 2) words is needed for the PCM buffer 39.
Namely, in the conventional audio decoder, the memory capacity of a total of 6400 words is necessary for the buffer memories of the inverse quantizing buffer 37, time base information buffer 38, delay buffer 34, and PCM buffer 39, and therefore, there is a problem such that a fairly large memory capacity is necessary.
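The total of 6400 words follows directly from the buffer sizes above; this small computation simply restates the arithmetic:

```python
WORDS_PER_BLOCK = 256   # conversion block length 512 -> N/2 = 256 words
CHANNELS = 5

inverse_quantizing_buffer = WORDS_PER_BLOCK * CHANNELS       # 1280 words
time_base_information_buffer = WORDS_PER_BLOCK * CHANNELS    # 1280 words
delay_buffer = WORDS_PER_BLOCK * CHANNELS                    # 1280 words
pcm_buffer = WORDS_PER_BLOCK * CHANNELS * 2                  # double buffered

total = (inverse_quantizing_buffer + time_base_information_buffer
         + delay_buffer + pcm_buffer)
# total == 6400 words
```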