Conventional technologies for encoding and decoding an audio signal cannot combine different types of audio objects, such as a mono-channel audio object, a stereo-channel audio object, and a multi-channel audio object. That is, they do not allow a user to consume the same audio content in diverse ways. Accordingly, users have been limited to consuming audio content passively.
A spatial audio coding (SAC) technology encodes a multi-channel audio signal into a down-mixed mono-channel signal or a down-mixed stereo-channel signal together with spatial cue information, and thereby transmits a high-quality multi-channel signal even at a low bit rate. The SAC technology analyzes an audio signal in each sub-band and restores the original multi-channel audio signal from the down-mixed mono-channel or stereo-channel signal based on the spatial cue information corresponding to each sub-band. The spatial cue information includes the information needed to restore the original signal in the decoding process, and it determines the quality of the audio signal reproduced by a SAC decoding apparatus. MPEG has been standardizing the SAC technology as MPEG Surround (MPS), which uses the channel level difference (CLD) as its main spatial cue.
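The per-sub-band analysis and restoration described above can be sketched as follows. This is a minimal illustration only, not the MPEG Surround algorithm: it assumes a two-channel input that has already been split into sub-bands, it uses the channel level difference (CLD) as the sole spatial cue, and the function names (`band_power`, `encode`, `decode`) and the gain formula are hypothetical choices made for this example.

```python
import math


def band_power(samples):
    """Mean power of the samples in one sub-band."""
    return sum(s * s for s in samples) / len(samples)


def encode(left_bands, right_bands, eps=1e-12):
    """Down-mix two channels to mono and compute one CLD (in dB) per sub-band.

    Each argument is a list of sub-bands; each sub-band is a list of samples.
    `eps` is an assumed guard against division by zero in silent bands.
    """
    downmix, cld_db = [], []
    for left, right in zip(left_bands, right_bands):
        # Down-mixed mono signal for this sub-band.
        downmix.append([(l + r) / 2.0 for l, r in zip(left, right)])
        # Spatial cue: power ratio between the channels, expressed in dB.
        ratio = (band_power(left) + eps) / (band_power(right) + eps)
        cld_db.append(10.0 * math.log10(ratio))
    return downmix, cld_db


def decode(downmix, cld_db):
    """Redistribute the mono down-mix to two channels using per-band CLD gains."""
    left_bands, right_bands = [], []
    for mono, cld in zip(downmix, cld_db):
        ratio = 10.0 ** (cld / 10.0)  # power ratio P_left / P_right
        # Assumed gain rule: g_left^2 / g_right^2 == ratio, g_left^2 + g_right^2 == 2.
        g_left = math.sqrt(2.0 * ratio / (1.0 + ratio))
        g_right = math.sqrt(2.0 / (1.0 + ratio))
        left_bands.append([g_left * m for m in mono])
        right_bands.append([g_right * m for m in mono])
    return left_bands, right_bands


if __name__ == "__main__":
    # One sub-band in which the left channel is twice as loud as the right.
    dm, cld = encode([[1.0, -1.0]], [[0.5, -0.5]])
    left, right = decode(dm, cld)
    print(cld, left, right)
```

In this sketch only the level ratio between the channels is recovered in each sub-band; a waveform-exact reconstruction is not expected from a single cue, which is why the choice and precision of the spatial cues govern the quality of the reproduced signal.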
Because the SAC technology can encode and decode only a multi-channel audio signal formed of a single audio object type, it cannot be used to encode or decode an audio signal having various types of audio objects, such as a mono-channel audio object, a stereo-channel audio object, or a multi-channel audio object (for example, a 5.1-channel audio object).
A binaural cue coding (BCC) technology according to the prior art was introduced to encode and decode a multi-object audio signal formed of mono-channel audio objects. However, a multi-object audio signal formed of multi-channel audio objects cannot be encoded or decoded using the BCC technology.
As described above, the conventional audio encoding and decoding technologies can encode or decode a single-object audio signal formed of multi-channel audio objects, or a multi-object audio signal formed of mono-channel audio objects, but they cannot encode or decode a multi-object audio signal having multi-channel audio objects. Therefore, a plurality of audio objects having different channel configurations cannot be combined using the conventional audio encoding and decoding technologies. That is, a user cannot consume the same audio content in various ways, and can only consume it passively.