In recent years, the demand for storage and transmission of audio content has been steadily increasing. Moreover, the quality requirements for the storage and transmission of audio content have also been increasing steadily. Accordingly, the concepts for the encoding and decoding of audio content have been enhanced. For example, the so-called “advanced audio coding” (AAC) has been developed, which is described in the international standard ISO/IEC 13818-7:2003. Moreover, some spatial extensions have been created, such as the so-called “MPEG Surround” concept, which is described in the international standard ISO/IEC 23003-1:2007. Additional improvements for the encoding and decoding of spatial information of audio signals are described in the international standard ISO/IEC 23003-2:2010, which relates to the so-called “spatial audio object coding” (SAOC).
Moreover, a flexible audio encoding/decoding concept, which provides the possibility to encode both general audio signals and speech signals with good coding efficiency and to handle multi-channel audio signals, is defined in the international standard ISO/IEC 23003-3:2012, which describes the so-called “unified speech and audio coding” (USAC) concept.
In MPEG USAC [1], joint stereo coding of two channels is performed using complex prediction, MPS 2-1-1 or unified stereo with band-limited or full-band residual signals.
MPEG Surround [2] hierarchically combines one-to-two (OTT) and two-to-three (TTT) boxes for the joint coding of multi-channel audio, with or without transmission of residual signals.
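The hierarchical combination of OTT boxes mentioned above may be illustrated by the following minimal sketch. It is a simplified, assumption-laden illustration and not the normative MPEG Surround processing: it operates on plain time-domain sample lists rather than on subband signals, it derives only a single broadband channel level difference per box, it omits TTT boxes, coherence cues and residual signals, and the names `ott_downmix` and `tree_downmix` are hypothetical.

```python
import math


def ott_downmix(a, b, eps=1e-12):
    """Simplified encoder-side OTT box (assumption: broadband, time domain).

    Takes two channels and returns one downmix channel plus a channel
    level difference in dB as the only spatial cue.
    """
    downmix = [0.5 * (x + y) for x, y in zip(a, b)]
    energy_a = sum(x * x for x in a)
    energy_b = sum(y * y for y in b)
    # eps avoids log(0) and division by zero for silent channels.
    cld_db = 10.0 * math.log10((energy_a + eps) / (energy_b + eps))
    return downmix, cld_db


def tree_downmix(channels):
    """Combine channels pairwise, level by level, until one downmix remains.

    Returns the final downmix and the list of level differences, one per
    OTT box, in the order in which the boxes were applied.
    """
    level = list(channels)
    cues = []
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level) - 1, 2):
            dmx, cld = ott_downmix(level[i], level[i + 1])
            next_level.append(dmx)
            cues.append(cld)
        if len(level) % 2 == 1:
            # An unpaired channel is passed on to the next tree level.
            next_level.append(level[-1])
        level = next_level
    return level[0], cues


if __name__ == "__main__":
    # Four hypothetical input channels of two samples each.
    mono, cues = tree_downmix([[1.0, 1.0], [1.0, 1.0], [2.0, 2.0], [2.0, 2.0]])
    print(mono, cues)
```

In this sketch, four input channels pass through three OTT boxes arranged in two tree levels, so that one downmix channel and three level cues are obtained. A corresponding decoder would traverse the tree in the opposite direction, using the cues to redistribute the downmix energy.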
However, there is a desire to provide an even more advanced concept for an efficient encoding and decoding of three-dimensional audio scenes.