Many consumer audio devices (e.g., stereos, media players, mobile phones, game consoles) allow users to modify stereo audio signals using controls for equalization (e.g., bass, treble), volume and acoustic room effects. These modifications, however, are applied to the entire audio signal and not to the individual audio objects (e.g., instruments) that make up the audio signal. For example, a user cannot individually modify the stereo panning or gain of guitars, drums or vocals in a song without affecting the entire song.
Techniques have been proposed that provide mixing flexibility at a decoder. These techniques rely on a Binaural Cue Coding (BCC), parametric or spatial audio decoder for generating a mixed decoder output signal. None of these techniques, however, directly encodes stereo mixes (e.g., professionally mixed music) to allow backwards compatibility without compromising sound quality.
Spatial audio coding techniques have been proposed for representing stereo or multi-channel audio signals using inter-channel cues (e.g., level difference, time difference, phase difference, coherence). The inter-channel cues are transmitted as “side information” to a decoder for use in generating a multi-channel output signal. These conventional spatial audio coding techniques, however, have several deficiencies. For example, at least some of these techniques require a separate signal for each audio object to be transmitted to the decoder, even if the audio object will not be modified at the decoder. Such a requirement results in unnecessary processing at the encoder and decoder. Another deficiency is that encoder input is limited to either a stereo (or multi-channel) audio signal or an audio source signal, which reduces flexibility for remixing at the decoder. Finally, at least some of these conventional techniques require complex de-correlation processing at the decoder, making such techniques unsuitable for some applications or devices.
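As an illustration only, two of the inter-channel cues named above (level difference and coherence) might be estimated for a single frame of a stereo signal as sketched below. This is not taken from any particular codec: the function name inter_channel_cues, the full-band (rather than per-subband) analysis, the 1024-sample frame and the 440 Hz test tone are all assumptions chosen for the example, and coherence is approximated here by the zero-lag normalized cross-correlation.

```python
import math


def inter_channel_cues(left, right, eps=1e-12):
    """Estimate two inter-channel cues for one frame of a stereo signal.

    Hypothetical helper for illustration. Returns a tuple of
    (level difference in dB, zero-lag normalized cross-correlation).
    """
    if len(left) != len(right):
        raise ValueError("channels must have the same number of samples")

    # Frame energies of each channel and their cross term.
    p_left = sum(x * x for x in left)
    p_right = sum(x * x for x in right)
    cross = sum(l * r for l, r in zip(left, right))

    # Level difference: energy ratio expressed in decibels.
    level_diff_db = 10.0 * math.log10((p_left + eps) / (p_right + eps))

    # Coherence proxy: cross term normalized by both channel energies.
    coherence = cross / math.sqrt((p_left + eps) * (p_right + eps))
    return level_diff_db, coherence


# Assumed test frame: the right channel is the left channel at half amplitude.
frame_left = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(1024)]
frame_right = [0.5 * x for x in frame_left]

icld, icc = inter_channel_cues(frame_left, frame_right)
print(f"level difference: {icld:.2f} dB, coherence: {icc:.3f}")
```

Because the right channel in this assumed frame is an attenuated copy of the left, the level difference comes out near 6 dB and the coherence near 1. A practical coder would typically compute such cues per subband and per time frame before transmitting them as side information.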