1. Field of the Invention
The present invention relates to audio signal processing, and more particularly, to a method and apparatus for generating a bitstream of an audio signal that can be easily extended to a multichannel audio signal, that allows the processing speed of the audio signal to be improved, and that allows the channel signals of the audio signal to be processed simultaneously, and to an audio encoding/decoding method and apparatus using the bitstream generating method and apparatus.
2. Description of Related Art
FIG. 1 is a block diagram of a conventional audio encoder. Referring to FIG. 1, the conventional audio encoder includes a time/frequency mapping unit 100, a psychoacoustic modeling unit 110, a data processing unit 120, a quantizing unit 130, and a bitstream generating unit 140.
The time/frequency mapping unit 100 converts an audio signal in the time domain into signals in the frequency domain. Differences between the characteristics of signal components are not easily perceived by humans in the time domain, but in the frequency domain the converted signals range from perceivable to unperceivable in each frequency band according to a human psychoacoustic model. Thus, compression efficiency can be improved by varying the number of bits assigned to each frequency band.
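As a rough illustration of this time/frequency mapping stage, the following Python sketch computes a naive DFT magnitude spectrum of one time-domain frame and groups the spectral lines into frequency bands. The function names and the equal-width band split are illustrative assumptions; practical codecs use a fast transform such as the MDCT and perceptually spaced (e.g. Bark-scale) bands.

```python
import math

def dft_magnitudes(frame):
    """Naive DFT magnitude spectrum of one time-domain frame (illustrative;
    real encoders use a fast transform such as the MDCT)."""
    n = len(frame)
    mags = []
    for k in range(n // 2):
        re = sum(frame[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(frame[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        mags.append(math.hypot(re, im))
    return mags

def group_into_bands(mags, num_bands):
    """Group spectral lines into equal-width frequency bands; real codecs
    use perceptually spaced bands instead of equal widths."""
    size = len(mags) // num_bands
    return [mags[i * size:(i + 1) * size] for i in range(num_bands)]
```

After this split, a different number of bits can be assigned to each band, which is where the compression gain described above comes from.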
The psychoacoustic modeling unit 110 calculates a masking threshold for each frequency band using a masking phenomenon of the converted signals in the frequency domain.
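The band-wise threshold computation can be caricatured as follows. The fixed signal-to-mask ratio is a hypothetical simplification: a real psychoacoustic model also accounts for masking spread between neighboring bands and for the tonality of the signal.

```python
def masking_thresholds(band_energies, smr_db=13.0):
    """Per-band masking threshold: each band's energy reduced by an assumed
    signal-to-mask ratio in dB (simplified sketch; real models add spreading
    across bands and tonality estimation)."""
    factor = 10 ** (-smr_db / 10)
    return [e * factor for e in band_energies]
```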
By using the masking threshold for each frequency band input from the psychoacoustic modeling unit 110, the data processing unit 120 performs signal processing to improve encoding efficiency while minimizing sound quality changes that can be perceived by humans. The data processing unit 120 uses a signal processing method for improving encoding efficiency, such as temporal noise shaping, intensity stereo processing, perceptual noise substitution, or mid/side (M/S) stereo processing.
The quantizing unit 130 performs scalar quantization on the frequency signals in each frequency band so that the magnitude of the quantization noise in each frequency band is less than the corresponding masking threshold. Thus, humans cannot perceive the quantization noise even though it is included in the audio signal. The bitstream generating unit 140 combines the quantized audio signal and information about the encoding into a bitstream having a predetermined data structure.
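The relationship between the masking threshold and the quantizer step size described above can be sketched as follows. This is a simplified uniform quantizer: the step-size rule assumes the classic step²/12 noise-power model for a uniform quantizer, so it keeps the band's total noise below the threshold on average rather than guaranteeing it per frame.

```python
import math

def quantize_band(coeffs, threshold):
    """Scalar-quantize one band's coefficients with a uniform step chosen so
    that the band's total quantization noise power stays below the masking
    threshold. A uniform quantizer's noise power per coefficient is roughly
    step**2 / 12, so step = sqrt(12 * threshold / n) (simplified sketch)."""
    n = max(len(coeffs), 1)
    step = math.sqrt(12.0 * threshold / n)
    indices = [round(c / step) for c in coeffs]
    return indices, step

def dequantize_band(indices, step):
    """Inverse of quantize_band: reconstruct coefficients from indices."""
    return [i * step for i in indices]
```

Because the noise stays under the band's masking threshold, a listener should not perceive the reconstruction error, which is the goal stated above.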
When the audio signal to be encoded is a multichannel audio signal, it is generally encoded in predetermined units of encoding instead of in channel units. A predetermined unit of encoding is a set of one or more channel signals that are encoded simultaneously.
For example, when an audio signal includes five channel signals, i.e., a stereo channel signal, a mono channel signal, a center channel signal, a surround left channel signal, and a surround right channel signal, the predetermined units of encoding are the stereo channel signal and the mono channel signal, which are encoded together, the center channel signal, and the surround left channel signal and the surround right channel signal, which are encoded together. Since the two channel signals in such a pair generally have high redundancy, encoding efficiency can be improved by encoding them at the same time.
Conventional audio devices are classified into stereo players and multichannel players. Stereo players are typically developed to also provide a mono playback function, and multichannel players are typically developed to also provide a stereo playback function. A bitstream extension method for applying the data structure used to generate bitstreams of mono/stereo audio signals to multichannel audio signals is provided in ISO/IEC 13818-3.
FIG. 2 illustrates a first example of a data structure of an extensible bitstream for a multichannel audio signal used in ISO/IEC 13818-3. As illustrated in FIG. 2, to support compatibility with ISO/IEC 11172-3, multichannel audio data is inserted into the ancillary data 1 field of an ISO/IEC 11172-3 bitstream. Thus, when a bitstream of a multichannel audio signal is generated using the data structure illustrated in FIG. 2, a decoder must parse and analyze the mono/stereo data and then determine whether multichannel audio data exists based on whether a syncword for multichannel extension is included in the ancillary data portion.
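The syncword check described above can be sketched as a byte scan over the ancillary data portion. The syncword value used here is hypothetical, not the actual ISO/IEC 13818-3 value:

```python
MC_SYNCWORD = b"\xff\xee"  # hypothetical multichannel-extension syncword

def has_multichannel_extension(ancillary_data: bytes) -> bool:
    """Scan the ancillary data portion for the multichannel syncword to
    decide whether multichannel audio data is present (sketch of the
    detection step described above)."""
    return MC_SYNCWORD in ancillary_data
```

This illustrates the drawback noted above: the decoder cannot know in advance whether the bitstream is multichannel; it must decompose the mono/stereo frame and scan the ancillary data first.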
FIG. 3 illustrates a second example of a data structure of an extensible bitstream for a multichannel audio signal used in ISO/IEC 13818-3. The data structure illustrated in FIG. 3 is configured to include additional multichannel data appended to a bitstream having a size compatible with MPEG-1. Thus, to check whether the frame length of a bitstream is extended, it is first determined whether multichannel audio data exists, based on whether a syncword is included in the ancillary data portion of the MPEG-1 part, and it is then determined whether an additional bitstream exists as an extension part, using an ancillary data pointer.
When a multichannel audio signal is encoded/decoded using the conventional bitstream data structures, it is difficult to determine whether an audio signal included in a bitstream is a multichannel signal including other channel signals in addition to stereo/mono channel signals. As a result, the audio signal cannot be efficiently processed according to a user's demand or the performance of an audio player. Moreover, since the maximum frame length is predetermined, the total frame length cannot be efficiently used.