MPEG Surround is one of the major advances in audio coding recently standardized by MPEG (see ISO/IEC 23003-1, MPEG Surround). MPEG Surround is a multi-channel audio coding tool that allows existing mono- and stereo-based coders to be extended to multi-channel operation. The MPEG Surround encoder typically creates a mono or stereo downmix from the multi-channel input signal and derives spatial parameters from that same input signal. The downmix and the spatial parameters are encoded in separate streams, although the spatial parameter stream can be embedded in the downmix stream. The MPEG Surround decoder decodes the spatial parameters and uses them to upmix the decoded downmix in order to obtain the multi-channel output signal. Since the spatial image of the multi-channel input signal is parameterized, MPEG Surround also allows the encoded stereo downmix to be decoded for other rendering devices, such as headphones. This particular mode of operation is referred to as the MPEG Surround binaural decoding process, in which the spatial parameters are combined with Head Related Transfer Function (HRTF) data (J. Breebaart, Analysis and Synthesis of Binaural Parameters for Efficient 3D Audio Rendering in MPEG Surround, ICME 07) to produce the so-called binaural output. In this mode a realistic surround experience can be provided using regular headphones. HRTF data is traditionally described as a set of pairs of impulse responses, one pair per speaker, going from that speaker to both ears.
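The traditional description of HRTF data as one pair of impulse responses per speaker can be illustrated with a minimal time-domain sketch. It is not the MPEG Surround binaural decoder, which works with spatial parameters rather than speaker signals. The function names `convolve` and `render_binaural` and the speaker labels are hypothetical and chosen only for illustration.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def render_binaural(speaker_signals, hrir_pairs):
    """Sum every speaker signal filtered by its (left ear, right ear) pair.

    speaker_signals: dict mapping a speaker label to a list of samples.
    hrir_pairs: dict mapping the same labels to (h_left, h_right) lists.
    Both structures are assumptions of this sketch.
    """
    length = max(
        len(sig) + max(len(hrir_pairs[name][0]), len(hrir_pairs[name][1])) - 1
        for name, sig in speaker_signals.items()
    )
    left = [0.0] * length
    right = [0.0] * length
    for name, sig in speaker_signals.items():
        h_left, h_right = hrir_pairs[name]
        for i, v in enumerate(convolve(sig, h_left)):
            left[i] += v
        for i, v in enumerate(convolve(sig, h_right)):
            right[i] += v
    return left, right


# Toy data: two speakers with very short, made-up impulse responses.
signals = {"C": [1.0, 0.0], "L": [0.0, 1.0]}
pairs = {"C": ([0.5], [0.5]), "L": ([1.0, 0.25], [0.5])}
left_ear, right_ear = render_binaural(signals, pairs)
```

Each speaker contributes to both ears, so the cost of this direct approach grows with the number of speakers and with the length of the impulse responses.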
When the MPEG Surround binaural decoder is operated in a Low Power (LP) mode, it can be implemented in mobile devices. In this mode the raw HRTF data has been converted, in an offline process, to a parametric domain, which allows processing at low computational complexity. However, a disadvantage of the LP mode is that the parametric HRTF data typically represents only the anechoic portion of the raw HRTF data, i.e. it covers only the part of the complete time-domain responses that is primarily associated with directional cues. In practice, this means that the binaural decoder output signal will contain directional information but will not sound very natural, since there is hardly any externalization, which is primarily associated with the echoic part of the HRTF data. In order to compensate for this lack of externalization, the MPEG Surround standard allows the use of reverberation, as prescribed in ISO/IEC 23003-1 MPEG Surround Annex D. In that case, the MPEG Surround binaural decoder is extended with a parallel reverberation process. The input stereo downmix is fed to the reverberation process, and the output of this process is added directly to the MPEG Surround binaural output. Such a parallel reverberation signal is typically omni-directional, i.e. independent of direction; it supplies the echoic part and thus creates a more realistic surround experience.
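The parallel signal flow described above (downmix into the reverberation process, reverberation output added to the binaural output) can be sketched as follows. The reverberation of Annex D is not reproduced here. A toy feedback comb filter stands in for it, and the names `comb_reverb` and `add_parallel_reverb` as well as all parameter values are assumptions made for illustration.

```python
def comb_reverb(signal, delay, feedback, tail):
    """Toy wet-only feedback comb: y[n] = x[n - delay] + feedback * y[n - delay].

    A stand-in for a real reverberator, not the Annex D algorithm.
    """
    out = [0.0] * (len(signal) + tail)
    for n in range(delay, len(out)):
        dry = signal[n - delay] if n - delay < len(signal) else 0.0
        out[n] = dry + feedback * out[n - delay]
    return out


def add_parallel_reverb(binaural_lr, downmix_lr, delay=2, feedback=0.5,
                        gain=0.5, tail=4):
    """Feed the stereo downmix to the reverb and add the result to the
    binaural output, channel by channel (left, then right)."""
    mixed = []
    for binaural, downmix in zip(binaural_lr, downmix_lr):
        wet = comb_reverb(downmix, delay, feedback, tail)
        n = max(len(binaural), len(wet))
        dry_padded = list(binaural) + [0.0] * (n - len(binaural))
        wet_padded = wet + [0.0] * (n - len(wet))
        mixed.append([d + gain * w for d, w in zip(dry_padded, wet_padded)])
    return mixed


# Toy input: a unit impulse in the left channel, silence in the right.
downmix = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
binaural = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
output = add_parallel_reverb(binaural, downmix)
```

In this sketch the reverberation is computed from the downmix alone, so its level does not depend on the spatial parameters or on the direction from which the content is rendered.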
However, subjective tests in which reverberation, which is a type of so-called send effect, is added to the binaural output signal do not show satisfactory performance. One of the prominent artifacts in such binaural output is that the signal sounds too reverberant when the original multi-channel content at the encoder is primarily present in the center channel.
A similar disadvantage holds for other send effects, such as chorus, vocal doubler, fuzz, and space expander.