1. Field of the Invention
The present invention relates to an audio signal coding apparatus and an audio signal decoding apparatus capable of reducing the number of bits contained in a coded wideband audio signal.
2. Description of the Related Art
A speech signal compression coding method such as AMR (Adaptive Multi-Rate) allows the coding bit rate to be changed frame by frame based on the detected signal activity.
In the AMR method, in order to reduce transmission power, voice activity detection (VAD control) determines, for each unit of coding, that is, frame by frame, whether the input signal to be coded is voice. When the input signal is determined to be voice, it is transmitted in the form of a normal audio coded frame. When the input signal is determined not to be voice, only the basic information of the frame is transmitted discontinuously (DTX (Discontinuous Transmission) control) in the form of a comfort noise frame. However, because the DTX control is executed frame by frame, when this method is applied to a wideband signal such as an audio signal, the presence or absence of activity in the input signal is determined for the whole band.
FIGS. 8A and 8B are views showing the transition of the output bit rate when, for example, the DTX control of the AMR method is applied to a wideband audio signal. FIG. 8A indicates the power of an audio signal in each frequency band in units of frames on the time axis. The frequency bands without activity are illustrated by hatching. For instance, a frame F1 contains a plurality of frequency bands, all of which have activity. A frame F2 contains a plurality of frequency bands, none of which has activity. A frame F3 and a frame F4 each contain a plurality of frequency bands, only some of which have no activity. In this case, only the frame F2 has no activity in any frequency band across the whole bandwidth and is recognized as a frame to be subject to the DTX control. Thus, the output bit rate of the frame F2 can be reduced to a low rate through a discontinuous transmission (DTX control) as a comfort noise frame. However, since the frames F3 and F4 contain frequency bands with activity, the frames F3 and F4 are not recognized as being subject to the DTX control. That is, although the frames F3 and F4 contain frequency bands without activity, they are not treated as non-voice frames under the AMR method, and the discontinuous transmission (DTX control) is not performed for them.
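By way of illustration only, the frame-level decision described above may be sketched as follows. The sketch assumes four frequency bands per frame and hypothetical activity patterns for the frames F3 and F4. The function name frame_is_dtx and the patterns are chosen for illustration and are not taken from the AMR specification or from FIG. 8A.

```python
from typing import Dict, List


def frame_is_dtx(band_active: List[bool]) -> bool:
    """Frame-level decision: a frame is subject to DTX only if no band has activity."""
    return not any(band_active)


# Hypothetical per-band activity flags (True = activity present), four bands per frame.
frames: Dict[str, List[bool]] = {
    "F1": [True, True, True, True],      # activity in every band
    "F2": [False, False, False, False],  # no activity in any band
    "F3": [True, False, True, False],    # activity in only some bands (assumed pattern)
    "F4": [False, True, True, False],    # activity in only some bands (assumed pattern)
}

# Frames that a whole-band decision would send as comfort noise frames.
dtx_frames = [name for name, bands in frames.items() if frame_is_dtx(bands)]

# Inactive bands that the whole-band decision cannot exploit in non-DTX frames.
wasted_inactive_bands = {
    name: bands.count(False)
    for name, bands in frames.items()
    if not frame_is_dtx(bands)
}

print(dtx_frames)
print(wasted_inactive_bands)
```

Under these assumptions, only the frame F2 is selected for the DTX control, while the inactive bands in the frames F3 and F4 are still coded as part of normal frames.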
In addition, the MPEG-2 audio standards use the AAC (Advanced Audio Coding) method, which adopts time-to-frequency transform coding.
FIGS. 9A and 9B are views used to describe the bit rate in the AAC method. FIG. 9A is the same as FIG. 8A. Although the AAC method does not incorporate a function for performing a discontinuous transmission, it is a variable length frame method in which the number of bits per frame can be changed according to the signal characteristic of each frame, so that the instantaneous coding rate of each frame is variable (corresponding to a solid line in FIG. 9B). The number of bits per frame is determined with reference to the number of bits based on the target rate set from the outside (corresponding to a dotted line in FIG. 9B), taking into account the characteristic of the signal and the buffer model (a bit reservoir serving as a buffer to manage the cumulative difference between the number of bits used in past frames and the average number of bits based on the target rate), and the coding rate is controlled so as to reach the target rate on average.
For example, in the case of the frame F2, which contains only frequency bands without activity (so that only a small number of bits is required), even when the number of bits is reduced for this frame, the surplus bits are used for another frame, as is indicated by a hollow arrow. Also, in the case of the frames F3 and F4, in which some of the frequency bands have no activity, even when the number of bits is reduced for such a frequency band or for the frame containing it, the bits are allocated to the other frequency bands or to another frame, as is indicated by a hollow arrow. Hence, as is shown in FIG. 9B, even when there are many signals that require only a small number of bits (with less activity), the resulting number of bits is the number of bits based on the pre-set target rate, and the total coding rate is not reduced. This method is therefore by no means efficient.
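The effect described above may be illustrated by the following simplified sketch of a bit reservoir. It is not the rate control of the AAC standard. The function name allocate_bits, the per-frame bit demands, the target of 1000 bits per frame and the reservoir size are all assumptions chosen for illustration.

```python
from typing import List


def allocate_bits(demands: List[int], target_bits: int, reservoir_max: int) -> List[int]:
    """Return the bits used per frame under a simplified bit-reservoir model.

    Each frame may use the per-frame target plus whatever earlier frames saved.
    """
    reservoir = 0  # bits saved so far relative to the target rate
    used = []
    for demand in demands:
        budget = target_bits + reservoir  # bits available to this frame
        bits = min(demand, budget)        # the frame takes what it asks for, if available
        # Unused bits are carried forward, up to the reservoir capacity.
        reservoir = min(reservoir + target_bits - bits, reservoir_max)
        used.append(bits)
    return used


# Hypothetical demands for frames F1..F4: F2 needs very few bits,
# F3 and F4 would use more than the target if bits are available.
demands = [1000, 100, 1500, 1500]
used = allocate_bits(demands, target_bits=1000, reservoir_max=2000)

print(used)       # bits actually spent per frame
print(sum(used))  # total over the four frames
```

In this example the bits saved in the frame F2 are consumed by the frames F3 and F4, so the total over the four frames equals four times the assumed target, and the overall coding rate is not reduced.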
A variable rate coding method for controlling the coding bit rate frame by frame is disclosed in Jpn. Pat. Appln. KOKAI Publication No. 3-191618. In this coding method, variable rate control is performed so that the SNR, which represents sound quality, is kept constant. In addition, a signal sequence, such as an audio signal, is divided into plural frequency bands, and the number of bits is controlled for each frequency band on the basis of the signal power in each frequency band. It should be noted, however, that because the presence or absence of an audio signal is determined over the whole frequency band and the sum of the coding quantities of the entire frame is controlled, the rate control itself is not performed for each frequency band. This method is therefore the same kind of technique as the AMR method.
The coding methods in the related art thus have the problem that the rate control cannot be performed finely and the bands cannot be utilized efficiently.