Audio coding is the domain of compression that deals with exploiting redundancy and irrelevancy in audio signals. In MPEG USAC [ISO/IEC 23003-3:2012 - Information technology - MPEG audio technologies Part 3: Unified speech and audio coding], joint stereo coding of two channels is performed using complex prediction, MPS 2-1-2, or unified stereo with band-limited or full-band residual signals. MPEG Surround [ISO/IEC 23003-1:2007 - Information technology - MPEG audio technologies Part 1: MPEG Surround] hierarchically combines OTT (one-to-two) and TTT (two-to-three) boxes for joint coding of multi-channel audio with or without transmission of residual signals. MPEG-H Quad Channel Elements hierarchically apply MPS 2-1-2 stereo boxes followed by complex prediction/MS stereo boxes, building a fixed 4×4 remixing tree. AC-4 [ETSI TS 103 190 V1.1.1 (2014-04) - Digital Audio Compression (AC-4) Standard] introduces new 3-, 4- and 5-channel elements that allow for remixing transmitted channels via a transmitted mix matrix and subsequent joint stereo coding information. Further, prior publications suggest using orthogonal transforms such as the Karhunen-Loeve Transform (KLT) for enhanced multi-channel audio coding [Dai Yang, Hongmei Ai, Chris Kyriakakis and C.-C. Jay Kuo, 2001: Adaptive Karhunen-Loeve Transform for Enhanced Multichannel Audio Coding, http://ict.usc.edu/pubs/Adaptive%20Karhunen-Loeve%20Transform%20for%20Enhanced%20Multichannel%20Audio%20Coding.pdf].
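For illustration only, the kind of two-channel redundancy removal that the stereo boxes named above perform can be sketched as a mid/side (MS) transform followed by a real-valued prediction of the side signal from the mid signal. This is a simplified stand-in and not the standardized USAC complex prediction tool: the function names, the unquantized least-squares coefficient `alpha`, and the restriction to a real-valued coefficient on a single band are all assumptions made for this sketch.

```python
def ms_predict_encode(left, right):
    """Encode one band of real-valued spectral coefficients (hypothetical helper).

    Returns the mid (downmix) signal, the prediction residual of the side
    signal, and the prediction coefficient alpha.
    """
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    energy = sum(m * m for m in mid)
    # Least-squares predictor of side from mid; fall back to 0 for a silent mid.
    if energy > 0.0:
        alpha = sum(m * s for m, s in zip(mid, side)) / energy
    else:
        alpha = 0.0
    residual = [s - alpha * m for m, s in zip(mid, side)]
    return mid, residual, alpha


def ms_predict_decode(mid, residual, alpha):
    """Invert ms_predict_encode and return the (left, right) channels."""
    side = [res + alpha * m for m, res in zip(mid, residual)]
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


if __name__ == "__main__":
    # The right channel is a scaled copy of the left, so the residual vanishes.
    left = [1.0, 2.0, -1.0, 0.5]
    right = [0.5 * x for x in left]
    mid, residual, alpha = ms_predict_encode(left, right)
    print(alpha, residual)
    print(ms_predict_decode(mid, residual, alpha))
```

When the two channels are strongly correlated, most of the energy ends up in the mid signal and the residual becomes small, which is the redundancy a joint stereo tool exploits. The sketch operates on exactly one channel pair, which is the scope of the two-channel tools described above.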
In the 3D audio context, loudspeaker channels are distributed over several height layers, resulting in horizontal and vertical channel pairs. Joint coding of only two channels, as defined in USAC, is not sufficient to account for the spatial and perceptual relations between channels. MPEG Surround is applied in an additional pre-/post-processing step, and residual signals are transmitted individually without the possibility of joint stereo coding, e.g. to exploit dependencies between left and right vertical residual signals. AC-4 introduces dedicated N-channel elements that allow for efficient encoding of joint coding parameters, but these fail for generic speaker setups with more channels, as proposed for new immersive playback scenarios (7.1+4, 22.2). The MPEG-H Quad Channel Element is likewise restricted to only 4 channels and cannot be applied dynamically to arbitrary channels, but only to a pre-configured, fixed number of channels.