Efforts are underway to research and develop new approaches to perceptual coding of multi-channel audio, commonly referred to as Spatial Audio Coding (SAC). SAC allows transmission of multi-channel audio at low bit rates, making SAC suitable for many popular audio applications (e.g., Internet streaming, music downloads).
Rather than performing a discrete coding of individual audio input channels, SAC captures the spatial image of a multi-channel audio signal in a compact set of parameters. The parameters can be transmitted to a decoder, where they are used to synthesize or reconstruct the spatial properties of the audio signal.
In some SAC applications, the spatial parameters are transmitted to a decoder as part of a bitstream. The bitstream includes spatial frames that contain ordered sets of time slots to which spatial parameter sets can be applied. The bitstream also includes position information that can be used by a decoder to identify the correct time slot to which a given parameter set is applied.
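The relationship between time slots, parameter sets, and position information can be illustrated with a minimal sketch. The names SpatialFrame, ParameterSet, and assign_parameter_sets, as well as the field layout, are hypothetical and chosen only for illustration; they do not represent the syntax of any actual bitstream. The sketch assumes that each parameter set carries the index of the time slot to which it applies and that parameter sets arrive in increasing slot order.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ParameterSet:
    slot: int                 # position information: index of the target time slot
    values: Dict[str, float]  # spatial parameters (keys are illustrative only)


@dataclass
class SpatialFrame:
    num_time_slots: int
    parameter_sets: List[ParameterSet]


def assign_parameter_sets(frame: SpatialFrame) -> List[Optional[ParameterSet]]:
    """Return one entry per time slot: the parameter set positioned there, or None."""
    slots: List[Optional[ParameterSet]] = [None] * frame.num_time_slots
    previous = -1
    for ps in frame.parameter_sets:
        # The position must fall inside the frame.
        if not 0 <= ps.slot < frame.num_time_slots:
            raise ValueError(f"slot {ps.slot} is outside the frame")
        # Assumption: positions are strictly increasing within a frame.
        if ps.slot <= previous:
            raise ValueError("parameter set positions must be strictly increasing")
        slots[ps.slot] = ps
        previous = ps.slot
    return slots


frame = SpatialFrame(
    num_time_slots=8,
    parameter_sets=[
        ParameterSet(slot=3, values={"level_difference_db": 2.0}),
        ParameterSet(slot=7, values={"level_difference_db": -1.5}),
    ],
)
layout = assign_parameter_sets(frame)
```

In this sketch, slots that carry no parameter set are left empty. How a decoder derives parameters for such slots (for example, by interpolation between neighboring parameter sets) is outside the scope of the illustration.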
Some SAC applications make use of conceptual elements in the encoding/decoding paths. One element is commonly referred to as One-To-Two (OTT) and another element is commonly referred to as Two-To-Three (TTT), where each name gives the number of input and output channels, respectively, of the corresponding decoder element. The OTT encoder element extracts two spatial parameters and creates a downmix signal and a residual signal. The TTT encoder element mixes down three audio signals into a stereo downmix signal plus a residual signal. These elements can be combined to provide a variety of configurations of a spatial audio sound environment (e.g., surround sound).
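The operation of an OTT encoder element can be sketched as follows. The sketch assumes that the two spatial parameters are a channel level difference (in dB) and an inter-channel correlation, which is the usual choice in spatial audio coding but is not stated above. The sum/difference downmix and residual used here are a deliberate simplification chosen so that the two channels can be recovered exactly; a practical codec would typically operate per frequency band and derive the residual differently. The function names ott_encode and ott_decode_with_residual are hypothetical.

```python
import math
from typing import List, Tuple


def ott_encode(
    ch1: List[float], ch2: List[float], eps: float = 1e-12
) -> Tuple[float, float, List[float], List[float]]:
    """Reduce two channels to one downmix, one residual, and two parameters."""
    # Channel energies; eps guards against division by zero on silent input.
    e1 = sum(a * a for a in ch1) + eps
    e2 = sum(b * b for b in ch2) + eps
    cross = sum(a * b for a, b in zip(ch1, ch2))

    # Assumed parameter 1: channel level difference in dB.
    level_difference_db = 10.0 * math.log10(e1 / e2)
    # Assumed parameter 2: normalized inter-channel correlation.
    correlation = cross / math.sqrt(e1 * e2)

    # Simplified downmix and residual (half-sum and half-difference).
    downmix = [(a + b) / 2.0 for a, b in zip(ch1, ch2)]
    residual = [(a - b) / 2.0 for a, b in zip(ch1, ch2)]
    return level_difference_db, correlation, downmix, residual


def ott_decode_with_residual(
    downmix: List[float], residual: List[float]
) -> Tuple[List[float], List[float]]:
    """Invert the simplified downmix: one input signal (plus residual), two outputs."""
    ch1 = [m + r for m, r in zip(downmix, residual)]
    ch2 = [m - r for m, r in zip(downmix, residual)]
    return ch1, ch2


left = [1.0, 0.5, -0.25]
right = [0.5, 0.25, -0.125]  # same waveform at half the amplitude
cld_db, icc, mix, res = ott_encode(left, right)
rec_left, rec_right = ott_decode_with_residual(mix, res)
```

For this input the second channel is a scaled copy of the first, so the level difference is about 6 dB and the correlation is close to 1. When the residual is not transmitted, a decoder would instead rely on the two parameters to approximate the spatial properties of the original pair.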
Some SAC applications can operate in a non-guided operation mode, where only a stereo downmix signal is transmitted from an encoder to a decoder without a need for spatial parameter transmission. The decoder synthesizes spatial parameters from the downmix signal and uses those parameters to produce a multi-channel audio signal.