1. Field of the Invention
One or more embodiments of the present invention relate to audio decoding, and more particularly, to a surround audio decoding method, medium, and system for selectively decoding an audio signal to a stereo signal or a multi-channel signal.
2. Description of the Related Art
In general, multi-channel audio coding is classified into waveform multi-channel audio coding and parametric multi-channel audio coding. Waveform multi-channel audio coding includes MPEG-2 MC audio coding, AAC MC audio coding, BSAC/AVS MC audio coding, etc.; a corresponding decoder typically receives 5 encoded channel signals and outputs 5 decoded channel signals. Parametric multi-channel audio coding typically includes MPEG surround coding, in which a decoding terminal receives 1 or 2 encoded input channel signals and outputs 6 or 8 decoded multi-channel signals.
According to the MPEG surround specification, an input encoded signal can be decoded as a multi-channel signal through either a first 5-1-5 tree structure, illustrated in FIG. 1A, or a second 5-1-5 tree structure, illustrated in FIG. 1B. Here, each tree structure receives a down-mixed mono signal, i.e., a signal that has been encoded from multi-channel signals and output as a mono signal, and up-mixes the mono signal to multi-channel signals of a Front Left (FL) channel, a Front Right (FR) channel, a Center (C) channel, a Low Frequency Enhancement (LFE) channel, a Back Left (BL) channel, and a Back Right (BR) channel, using combinations of 1-to-2 (OTT) modules. The up-mixing of the mono signal through the stages of OTT modules can be accomplished with previously generated spatial information of Channel Level Differences (CLDs) and/or Inter-Channel Correlations (ICCs), with the CLD being information about an energy ratio or difference between predetermined channels among the multiple channels, and with the ICC being information about correlation or coherence corresponding to a time/frequency tile of input signals. With its respective CLD and ICC, each staged OTT module can up-mix a single input signal to two respective output signals.
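As an illustration of how a CLD can steer a single OTT stage, the following is a minimal sketch rather than the decoder defined by the MPEG surround specification. It assumes that the CLD is expressed in decibels as 10*log10(E1/E2), that the split preserves total energy, and that ICC-driven decorrelation is omitted; the function names cld_gains and ott_upmix are hypothetical.

```python
import math

def cld_gains(cld_db):
    """Return (g1, g2) for an energy-preserving 1-to-2 split.

    Assumption: cld_db = 10*log10(E1/E2), the energy ratio of the two outputs.
    """
    ratio = 10.0 ** (cld_db / 10.0)          # linear energy ratio E1/E2
    g1 = math.sqrt(ratio / (1.0 + ratio))    # gain for the first output
    g2 = math.sqrt(1.0 / (1.0 + ratio))      # gain for the second output
    return g1, g2

def ott_upmix(samples, cld_db):
    """Split one input signal into two outputs using only the CLD.

    Simplification: no ICC/decorrelator path, so both outputs are
    scaled copies of the input.
    """
    g1, g2 = cld_gains(cld_db)
    return [g1 * s for s in samples], [g2 * s for s in samples]
```

Under these assumptions, a CLD of 0 dB yields equal gains, and g1*g1 + g2*g2 equals 1 for any CLD, so the combined energy of the two outputs matches the energy of the input.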
However, due to the increasing use of mobile applications, a stereo channel structure is used more frequently than a multi-channel structure. Thus, there is a problem in that the conventional tree structures do not provide a computationally simple technique for generating just the stereo channels, i.e., all channels must typically be decoded by performing the entire staged decoding of the input down-mixed mono signal. For example, referring to FIG. 1A, in the first 5-1-5 tree structure, the corresponding OTT0 module outputs a signal that includes information for an FL channel signal, an FR channel signal, a C channel signal, and an LFE channel signal, and a signal that includes information for a BL channel signal and a BR channel signal. Meanwhile, referring to FIG. 1B, in the second 5-1-5 tree structure, the corresponding OTT0 module outputs a signal that includes information for the FL channel signal, the BL channel signal, the FR channel signal, and the BR channel signal, and a signal that includes information for the C channel signal and the LFE channel signal.
For this reason, in these 5-1-5 tree structures, the signals output from the corresponding OTT0 modules cannot be suitably used for generation of a left and right channel stereo signal. Rather, additional decoding through the remaining OTT module stages must be performed to ultimately decode the left and right channels, requiring additional computations and resources.
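The limitation described above can be checked with a small sketch that encodes only the OTT0 output groupings stated for FIG. 1A and FIG. 1B. The assignment of FL and BL to a left side and FR and BR to a right side, with C and LFE treated as shared, is an assumption made for illustration, and the helper usable_as_stereo is hypothetical.

```python
# Assumed side assignment (illustration only): C and LFE belong to neither side.
LEFT = {"FL", "BL"}
RIGHT = {"FR", "BR"}

# Channel content of the two OTT0 outputs, as described for each tree structure.
OTT0_OUTPUTS = {
    "first 5-1-5 (FIG. 1A)": [{"FL", "FR", "C", "LFE"}, {"BL", "BR"}],
    "second 5-1-5 (FIG. 1B)": [{"FL", "BL", "FR", "BR"}, {"C", "LFE"}],
}

def side(group):
    """Return 'L' or 'R' if the group holds channels of one side only, else None."""
    has_left = bool(group & LEFT)
    has_right = bool(group & RIGHT)
    if has_left and not has_right:
        return "L"
    if has_right and not has_left:
        return "R"
    return None

def usable_as_stereo(outputs):
    """True only if one output is purely left-side and the other purely right-side."""
    first, second = outputs
    return {side(first), side(second)} == {"L", "R"}

for name, outputs in OTT0_OUTPUTS.items():
    print(name, "->", usable_as_stereo(outputs))
```

For both tree structures the check reports False, since each OTT0 output either mixes left-side and right-side channels or carries only C and LFE, which is consistent with the need to run further OTT stages before left and right channels are available.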