The perceptual coding of audio signals for the purpose of data reduction for efficient storage or transmission of these signals is a widely used practice. In particular, when the highest efficiency is to be achieved, codecs that are closely adapted to the input signal characteristics are used. One example is the MPEG-D USAC core codec, which can be configured to predominantly use ACELP (Algebraic Code-Excited Linear Prediction) coding on speech signals, TCX (Transform Coded Excitation) on background noise and mixed signals, and AAC (Advanced Audio Coding) on music content. The codec can switch instantly among all three internal configurations in response to the signal content.
Moreover, joint multichannel coding techniques (Mid/Side coding, etc.) or, for highest efficiency, parametric coding techniques are employed. Parametric coding techniques basically aim at recreating a perceptually equivalent audio signal rather than faithfully reconstructing a given waveform. Examples include noise filling, bandwidth extension and spatial audio coding.
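As an illustration of the Mid/Side coding mentioned above, the following minimal sketch converts a left/right pair into a mid/side pair and back. The function names `ms_encode` and `ms_decode` and the 1/2 normalization are assumptions chosen for this example; they are not taken from any particular standard.

```python
def ms_encode(left, right):
    """Convert left/right samples to mid/side (hypothetical 1/2 normalization)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Invert ms_encode: recover left/right from mid/side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


if __name__ == "__main__":
    left = [1.0, 0.5, -0.25]
    right = [1.0, 0.25, 0.25]
    mid, side = ms_encode(left, right)
    # The side channel is small wherever left and right are similar.
    print(mid, side)
    print(ms_decode(mid, side) == (left, right))
```

When the two channels are strongly correlated, most of the energy ends up in the mid channel, which is why such a representation can be cheaper to code than the left/right pair.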
When a signal-adaptive core coder is combined with either joint multichannel coding or parametric coding techniques in state-of-the-art codecs, the core codec is switched to match the signal characteristics, but the choice of multichannel coding technique, such as M/S stereo, spatial audio coding or parametric stereo, remains fixed and independent of the signal characteristics. These techniques are usually applied around the core codec as a pre-processor to the core encoder and a post-processor to the core decoder, both being unaware of the actual choice of core codec.
On the other hand, the choice of the parametric coding technique for the bandwidth extension is sometimes made signal dependent. For example, techniques applied in the time domain are more efficient for speech signals, while frequency domain processing is more relevant for other signals. In such a case, the adopted multichannel coding techniques need to be compatible with both types of bandwidth extension techniques.
Relevant topics in the state of the art comprise:
PS (Parametric Stereo) and MPS (MPEG Surround) as a pre-/post-processor to the MPEG-D USAC core codec
MPEG-D USAC Standard
MPEG-H 3D Audio Standard
In MPEG-D USAC, a switchable core coder is described. However, in USAC, the multichannel coding technique is defined as a fixed choice that is common to the entire core coder, independent of its internal switching between the coding principles ACELP or TCX (“LPD”) and AAC (“FD”). Therefore, if a switched core codec configuration is desired, the codec is limited to using parametric multichannel coding (PS) throughout the entire signal. However, for coding e.g. music signals, it would be more appropriate to use joint stereo coding, which can switch dynamically between an L/R (left/right) and an M/S (mid/side) scheme per frequency band and per frame.
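The per-band, per-frame switching between L/R and M/S described above can be sketched as follows. This is only an illustrative heuristic under stated assumptions, not the decision rule of USAC or of any encoder: the function name `choose_stereo_mode_per_band`, the `band_edges` layout and the cost proxy (sum of absolute coefficient values after an orthonormal M/S rotation) are all hypothetical.

```python
import math


def choose_stereo_mode_per_band(left, right, band_edges):
    """Pick 'MS' or 'LR' for each frequency band of one frame.

    left, right: spectral coefficients of one frame (equal length).
    band_edges:  hypothetical band boundaries, e.g. [0, 4, 8].
    The cost is a crude stand-in for bit demand, not a real rate estimate.
    """
    scale = 1.0 / math.sqrt(2.0)  # orthonormal rotation keeps total energy equal
    modes = []
    for start, stop in zip(band_edges[:-1], band_edges[1:]):
        l = left[start:stop]
        r = right[start:stop]
        mid = [(a + b) * scale for a, b in zip(l, r)]
        side = [(a - b) * scale for a, b in zip(l, r)]
        cost_lr = sum(abs(v) for v in l) + sum(abs(v) for v in r)
        cost_ms = sum(abs(v) for v in mid) + sum(abs(v) for v in side)
        modes.append("MS" if cost_ms < cost_lr else "LR")
    return modes


if __name__ == "__main__":
    # Band 0: identical channels. Band 1: content in the left channel only.
    left = [1.0, -0.5, 0.25, 0.5, 0.5, 0.25, -0.5, 1.0]
    right = [1.0, -0.5, 0.25, 0.5, 0.0, 0.0, 0.0, 0.0]
    print(choose_stereo_mode_per_band(left, right, [0, 4, 8]))
```

Calling the function once per frame yields a decision that can differ both across bands and over time, which is the flexibility a fixed parametric scheme applied to the whole signal does not offer.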