Spatial Audio Coding (SAC) is a technology for efficiently compressing multichannel audio signals while maintaining compatibility with a conventional mono or stereo audio system. The SAC technology represents multichannel signals or independent audio object signals as a downmixed mono or stereo signal together with side information, also called spatial parameters, and then transmits and recovers the multichannel signals or independent audio object signals from that representation. The SAC technology can transmit a high-quality multichannel signal at a very low bit rate.
The main strategy of the SAC technology is to analyze the multichannel signal sub-band by sub-band, estimate a spatial parameter for each band, and recover the original multichannel signal from the spatial parameters and a downmix signal. The spatial parameter therefore plays an important role in recovering the original signal and is a primary factor in the sound quality of audio played back by the SAC technology. Binaural cue coding (BCC) has been introduced as a representative SAC technology. The spatial parameters of the BCC include the inter-channel level difference (ICLD), the inter-channel time difference (ICTD) and the inter-channel coherence (ICC).
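The per-band estimation described above can be illustrated with a minimal sketch for two channels. It assumes the signal has already been split into sub-bands by some filter bank (not shown), and it simplifies the ICC to a normalized cross-correlation at zero lag, whereas BCC descriptions generally take the maximum over a range of lags. The function names and the `EPS` guard are hypothetical and not taken from any standard.

```python
import math

EPS = 1e-12  # assumption: small guard against division by zero in silent bands


def band_power(samples):
    """Sum of squared samples in one sub-band."""
    return sum(s * s for s in samples)


def icld_db(ch1, ch2):
    """Inter-channel level difference of ch1 relative to ch2, in dB."""
    return 10.0 * math.log10((band_power(ch1) + EPS) / (band_power(ch2) + EPS))


def icc(ch1, ch2):
    """Simplified inter-channel coherence: normalized cross-correlation at zero lag."""
    cross = sum(a * b for a, b in zip(ch1, ch2))
    return abs(cross) / math.sqrt((band_power(ch1) + EPS) * (band_power(ch2) + EPS))


def estimate_parameters(left_bands, right_bands):
    """Return one {icld_db, icc} entry per sub-band."""
    return [
        {"icld_db": icld_db(l, r), "icc": icc(l, r)}
        for l, r in zip(left_bands, right_bands)
    ]


# Band 0: right channel is a half-amplitude copy of the left channel.
# Band 1: the two channels have equal power but do not overlap in time.
left = [[1.0, -1.0, 1.0, -1.0], [1.0, 0.0, 1.0, 0.0]]
right = [[0.5, -0.5, 0.5, -0.5], [0.0, 1.0, 0.0, 1.0]]
params = estimate_parameters(left, right)
```

In this example the first band yields a level difference of about 6 dB with a coherence near 1, and the second band yields a level difference of 0 dB with a coherence of 0.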
In the Moving Picture Experts Group (MPEG), standardization has been in progress for a technology that maintains the magnitude of multichannel audio signals and compresses them at a low bit rate while providing compatibility with a conventional stereo audio compression standard such as advanced audio coding (AAC) or MP3. Specifically, standardization of the SAC technology based on the BCC has been in progress under the title “MPEG Surround”. In MPEG Surround, the channel level difference (CLD), which has the same definition as the ICLD, is used as a spatial parameter, and of the remaining BCC parameters only the ICC is additionally used; the ICTD is excluded.
MPEG Surround is a parametric multichannel audio compression technology that represents M audio signals by N audio signals (M>N) together with side information consisting of spatial parameters, that is, the cues by which a human being determines the position of a sound source. An MPEG Surround encoder downmixes the multichannel audio signal into a mono or stereo channel, compresses the downmixed audio signal with a conventional MPEG-4 audio tool such as MPEG-4 AAC or MPEG-4 HE-AAC, extracts a spatial parameter from the multichannel audio signal, and multiplexes the spatial parameter with the encoded downmix audio signal. An MPEG Surround decoder separates the downmix audio signal from the spatial parameter by using a demultiplexer and synthesizes the multichannel audio signal by applying the spatial parameter to the downmix audio signal.
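The encoder and decoder flow described above can be sketched for the simplest case of two channels downmixed to one. This is a toy illustration under stated assumptions, not the MPEG Surround algorithm: the core codec step (for example AAC) is omitted, a dictionary stands in for the multiplexed bitstream, only the CLD is transmitted, and the decoder restores the level ratio of each band without any decorrelation. All function and field names are hypothetical.

```python
import math

EPS = 1e-12  # assumption: small guard against division by zero in silent bands


def power(samples):
    """Sum of squared samples in one sub-band."""
    return sum(s * s for s in samples)


def encode(left_bands, right_bands):
    """Downmix to mono and attach one CLD value (in dB) per sub-band."""
    downmix = [
        [0.5 * (l + r) for l, r in zip(lb, rb)]
        for lb, rb in zip(left_bands, right_bands)
    ]
    cld_db = [
        10.0 * math.log10((power(lb) + EPS) / (power(rb) + EPS))
        for lb, rb in zip(left_bands, right_bands)
    ]
    # The dict stands in for multiplexing side information with the downmix.
    return {"downmix": downmix, "cld_db": cld_db}


def decode(frame):
    """Split the frame into downmix and CLD, then apply per-band gains."""
    left_out, right_out = [], []
    for band, cld in zip(frame["downmix"], frame["cld_db"]):
        ratio = 10.0 ** (cld / 10.0)  # linear power ratio, left over right
        g_left = math.sqrt(ratio / (1.0 + ratio))
        g_right = math.sqrt(1.0 / (1.0 + ratio))
        left_out.append([g_left * s for s in band])
        right_out.append([g_right * s for s in band])
    return left_out, right_out


left = [[1.0, -1.0, 1.0, -1.0]]
right = [[0.5, -0.5, 0.5, -0.5]]
frame = encode(left, right)
left_rec, right_rec = decode(frame)
```

Because the two gains satisfy g_left² / g_right² = ratio, the power ratio between the two synthesized channels in each band matches the transmitted CLD. The absolute level and the coherence between the channels are not restored in this sketch.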
A graphic equalizer using a frequency analyzer is the method mainly applied for listening to and visualizing typical mono or stereo-based contents at the same time.
In the case of multichannel contents, visualization using only the graphic equalizer based on the frequency analyzer is limited in its ability to represent a dynamic sound scene to a user. Moreover, the multichannel visualization method applies only the basic approach of visualizing the level of each channel signal. Although a multichannel audio signal can place diverse sound images at different positions in space, there is a problem in that the position of a sound image created by the current multichannel signal is recognized and played back by the decoder as the only possible position.