An audio decoder conforming to the High-Efficiency Advanced Audio Coding (HE-AAC) standard is typically designed to decode and output up to N channels of audio data which are to be reproduced by individual speakers at predefined positions. An HE-AAC encoded bitstream typically comprises data relating to N low band signals corresponding to the N audio channels, as well as encoded SBR (Spectral Band Replication) parameters for the reconstruction of N high band signals corresponding to the respective low band signals.
In certain situations it may be desirable for an HE-AAC decoder to reduce the number of output channels to M channels (M being smaller than N) while preserving audio events from all N channels. One exemplary use case of such channel reduction is a mobile device which can play back N channels when connected to a multi-channel home theater system but which is limited to its built-in mono or stereo output when used standalone.
A possible way of producing M output or target channels from N input or source channels is a time-domain downmix of the decoded N-channel signal. In such systems, the encoded bitstream representing the N channels is first decoded to yield N time-domain audio signals, which are subsequently downmixed in the time domain to M audio signals corresponding to M channels. The downside of this approach is the amount of computational and memory resources needed to first decode all N audio signals corresponding to the N channels and then downmix the N decoded audio signals to M downmixed audio signals.
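The time-domain approach described above can be illustrated with a minimal sketch. It assumes that the N channels have already been fully decoded into equal-length sample lists and that the downmix is a fixed linear combination given by an M x N gain matrix. The function name `time_domain_downmix` and the gain values are hypothetical and chosen for illustration only; they are not taken from the HE-AAC standard.

```python
from typing import List, Sequence


def time_domain_downmix(
    decoded: Sequence[Sequence[float]],
    gains: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Downmix N fully decoded channels to M channels.

    decoded: N channels, each a sequence of time-domain samples.
    gains:   M rows of N coefficients (hypothetical downmix matrix).
    """
    n_channels = len(decoded)
    if n_channels == 0:
        raise ValueError("at least one decoded channel is required")
    n_samples = len(decoded[0])
    if any(len(channel) != n_samples for channel in decoded):
        raise ValueError("all decoded channels must have the same length")
    if any(len(row) != n_channels for row in gains):
        raise ValueError("each gain row needs one coefficient per source channel")

    # Each output sample is a weighted sum over all N source channels,
    # so every source channel must be decoded in full before this step.
    return [
        [sum(g * channel[t] for g, channel in zip(row, decoded))
         for t in range(n_samples)]
        for row in gains
    ]


# Illustrative stereo-to-mono case (N = 2, M = 1) with assumed equal gains.
stereo = [[1.0, 0.0, -1.0], [1.0, 2.0, 1.0]]
mono = time_domain_downmix(stereo, [[0.5, 0.5]])
```

The sketch makes the cost visible: the per-sample work and the buffered data both scale with N, even though only M channels are ultimately output.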
The ETSI technical specification (TS) 126 402 (3GPP TS 26.402) describes in section 6 a method called “SBR stereo parameter to mono parameter downmix”. This document is incorporated by reference. The ETSI technical specification describes an SBR parameter merging process to derive a mono SBR channel from an SBR channel pair. The specified method is, however, limited to a stereo-to-mono downmix in which the channels are represented as a channel pair element (CPE).
In view of the above, there is a need for a low-complexity downmixing scheme from an arbitrary number N of channels to an arbitrary number M of channels. In particular, there is a need for a scheme that downmixes the SBR parameters associated with the N channels to SBR parameters associated with the M channels while preserving the relevant high frequency information of the different channels.