Parametric multi-channel audio coding is described in Faller, C., Baumgarte, F.: “Efficient representation of spatial audio using perceptual parametrization”, Proc. IEEE Workshop on Appl. of Sig. Proc. to Audio and Acoust., October 2001, pp. 199-202. Downmixed audio signals may be upmixed to synthesize multi-channel audio signals, using spatial cues to generate more output audio channels than downmixed audio signals. Usually, the downmixed audio signals are generated by superposition of a plurality of audio channel signals of a multi-channel audio signal, for example a stereo audio signal. The downmixed audio signals are waveform coded and put into an audio bitstream together with auxiliary data relating to the spatial cues. The decoder uses the auxiliary data to synthesize the multi-channel audio signals based on the waveform coded audio channels.
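As an illustration of the downmix step, the following is a minimal sketch of generating one downmixed signal by superposition of the two channels of a stereo signal. The function name `downmix_stereo` and the 0.5 scaling factor are assumptions made for this example only; the text does not prescribe a particular downmix rule.

```python
def downmix_stereo(left, right):
    """Passive downmix: superpose two channels sample by sample.

    The 0.5 factor (plain averaging) is an illustrative assumption that
    keeps the downmix within the amplitude range of the inputs.
    """
    if len(left) != len(right):
        raise ValueError("channels must have equal length")
    return [0.5 * (l + r) for l, r in zip(left, right)]


# Hypothetical three-sample stereo frame.
mono = downmix_stereo([1.0, 0.5, -0.25], [0.0, 0.5, 0.25])
```

In a complete codec, the resulting `mono` signal would then be waveform coded, and the spatial cues would be carried alongside it as auxiliary data.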
There are several spatial cues or parameters that may be used for synthesizing multi-channel audio signals. First, the inter-channel level difference (ILD) indicates a difference between the levels of audio signals on two channels to be compared. Second, the inter-channel time difference (ITD) indicates the difference in arrival time of sound between the ears of a human listener. The ITD value is important for the localization of sound, as it provides a cue to identify the direction or angle of incidence of the sound source relative to the ears of the listener. Third, the inter-channel phase difference (IPD) specifies the relative phase difference between the two channels to be compared. A subband IPD value may be used as an estimate of the subband ITD value. Finally, inter-channel coherence (ICC) is defined as the normalized inter-channel cross-correlation after a phase alignment according to the ITD or IPD. The ICC value may be used to estimate the width of a sound source.
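The cue definitions above can be sketched for a single subband as follows. This is a hedged illustration, not a method taken from the cited references: the function name `subband_cues`, the decibel scale for ILD, and the small regularization constant `eps` are assumptions. The inputs are taken to be the complex frequency coefficients of the two channels within one subband.

```python
import cmath
import math


def subband_cues(X1, X2, eps=1e-12):
    """Return (ild_db, ipd_rad, icc) for one subband.

    X1, X2: equal-length sequences of complex frequency coefficients
    of channel 1 and channel 2 (assumed input layout).
    """
    e1 = sum(abs(a) ** 2 for a in X1)          # energy of channel 1
    e2 = sum(abs(b) ** 2 for b in X2)          # energy of channel 2
    cross = sum(a * b.conjugate() for a, b in zip(X1, X2))

    # ILD: level difference, expressed here in dB (assumed scale).
    ild_db = 10.0 * math.log10((e1 + eps) / (e2 + eps))
    # IPD: phase of the cross-spectrum, in (-pi, pi].
    ipd_rad = cmath.phase(cross)
    # ICC: normalized cross-correlation; taking the magnitude is
    # equivalent to aligning the phase by the IPD first.
    icc = abs(cross) / math.sqrt((e1 + eps) * (e2 + eps))
    return ild_db, ipd_rad, icc


# Hypothetical subband: channel 2 is channel 1 at half amplitude,
# rotated by -pi/2.
X1 = [2 + 0j, 0 + 2j]
X2 = [0.5 * a * cmath.exp(-1j * math.pi / 2) for a in X1]
ild_db, ipd_rad, icc = subband_cues(X1, X2)
```

Because the second channel in this example is a scaled and phase-rotated copy of the first, the ICC comes out close to 1, the IPD close to π/2, and the ILD close to 6 dB.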
ILD, ITD, IPD and ICC are important parameters for spatial multi-channel coding/decoding, in particular for stereo audio signals and especially binaural audio signals. ITD may, for example, cover the range of audible delays between −1.5 milliseconds (ms) and 1.5 ms. IPD may cover the full range of phase differences between −π and π. ICC may cover the range of correlation and may be specified as a fractional value between 0 and 1 or as another correlation factor between −1 and +1. In current parametric stereo coding schemes, ILD, ITD, IPD and ICC are usually estimated in the frequency domain. For every subband, ILD, ITD, IPD and ICC are calculated, quantized, included in the parameter section of an audio bitstream and transmitted.
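The quantization step mentioned above can be illustrated with a uniform scalar quantizer applied over the stated parameter ranges. This is only a sketch under assumptions: the text does not specify the quantizer type or resolution, so the function names and the choice of 31 levels for ITD are hypothetical.

```python
def quantize_uniform(value, lo, hi, levels):
    """Map a value to an integer index in 0..levels-1.

    Values outside [lo, hi] are clamped to the range first.
    """
    clamped = min(max(value, lo), hi)
    step = (hi - lo) / (levels - 1)
    return int(round((clamped - lo) / step))


def dequantize_uniform(index, lo, hi, levels):
    """Reconstruct the parameter value represented by an index."""
    step = (hi - lo) / (levels - 1)
    return lo + index * step


# ITD in milliseconds over the range -1.5 ms to 1.5 ms,
# with an assumed resolution of 31 levels (0.1 ms steps).
itd_index = quantize_uniform(0.4, -1.5, 1.5, 31)
itd_decoded = dequantize_uniform(itd_index, -1.5, 1.5, 31)
```

The same pair of functions could be applied per subband to IPD over −π to π and to ICC over 0 to 1, with the resulting indices written to the parameter section of the bitstream. A finer resolution costs more bits per subband, which is the trade-off that bitrate limits make relevant.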
Due to bitrate restrictions in parametric audio coding schemes, there are sometimes not enough bits in the parameter section of the audio bitstream to transmit all of the values of the spatial coding parameters. For example, U.S. Patent Application Publication 2006/0153408 A1 discloses an audio encoder wherein combined cue codes are generated for a plurality of audio channels to be included as side information in a downmixed audio bitstream. U.S. Pat. No. 8,054,981 B2 discloses a method for spatial audio coding using a quantization rule associated with the relation of levels of an energy measure of an audio channel and the energy measure of a plurality of audio channels.