The modern communications era has brought about a tremendous expansion of wireline and wireless networks. Computer networks, television networks, and telephony networks are experiencing an unprecedented technological expansion, fueled by consumer demand. Wireless and mobile networking technologies have addressed related consumer demands, while providing more flexibility and immediacy of information transfer.
Current and future networking technologies continue to facilitate ease of information transfer and convenience to users. One area in which there is a demand to increase ease of information transfer relates to provision of devices capable of delivering a quality audio representation of audible content or audible communications. Multi-channel audio coding, which involves the coding of two or more audio channels together, is one example of a mechanism aimed at improving device capabilities with respect to providing quality audio signals. In particular, since in many usage scenarios the channels of the input signal may have relatively similar content, joint coding of the channels may enable relatively efficient coding at a lower bit-rate than would otherwise be needed to code each channel separately.
A recent multi-channel coding method is known as parametric stereo (or parametric multi-channel) coding. Parametric multi-channel coding generally computes one or more mono signals, often referred to as down-mix signals, as a linear combination of a set of input signals. Each of the mono signals may be coded using a conventional mono audio coder. In addition to creating and coding the mono signals, the parametric multi-channel audio coder may extract a parametric representation of the channels of the input signal. The parameters may comprise information on level, phase, time, or coherence differences, or the like, between input channels. At the decoder side, the parametric information may be utilized to create a multi-channel output signal from the received decoded mono signals.
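The encoder and decoder roles described above can be sketched for a single stereo frame as follows. This is only an illustrative sketch under stated assumptions, not a description of any particular codec: the function names `encode` and `decode`, the equal down-mix weights, and the use of a single level-difference parameter for the whole frame are all assumptions made for the example, and the mono coding step is omitted.

```python
import math

def encode(left, right):
    """Down-mix a stereo frame to mono and extract one level difference (dB).

    Hypothetical helper: equal down-mix weights and a single full-band
    parameter are assumptions of this sketch.
    """
    # Down-mix: a linear combination of the input channels.
    mono = [0.5 * (l + r) for l, r in zip(left, right)]
    eps = 1e-12  # avoids taking the log of zero for a silent channel
    e_left = sum(l * l for l in left) + eps
    e_right = sum(r * r for r in right) + eps
    level_diff_db = 10.0 * math.log10(e_left / e_right)
    return mono, level_diff_db

def decode(mono, level_diff_db):
    """Re-create two channels from the mono signal and the level difference."""
    ratio = 10.0 ** (level_diff_db / 20.0)  # amplitude ratio left/right
    # Gains satisfy g_left / g_right == ratio and g_left + g_right == 2,
    # so down-mixing the output again reproduces the mono signal.
    g_right = 2.0 / (1.0 + ratio)
    g_left = ratio * g_right
    return [g_left * m for m in mono], [g_right * m for m in mono]

left = [1.0, -1.0, 0.5, -0.5]
right = [0.5 * x for x in left]   # same content at half the amplitude
mono, level_diff_db = encode(left, right)
left_out, right_out = decode(mono, level_diff_db)
```

In this toy case the two input channels differ only in level, so the decoder output matches the input up to rounding. For real signals a level difference alone would not capture time or coherence differences between the channels.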
Parametric multi-channel coding methods such as Binaural Cue Coding (BCC) enable high-quality stereo or multi-channel reproduction at a reasonable bit-rate. The compression of a spatial image is based on generating and transmitting one or several down-mixed signals derived from a set of input signals, together with a set of spatial cues. Consequently, the decoder may use the received down-mixed signal(s) and spatial cues for synthesizing a set of channels, whose number is not necessarily the same as the number of channels in the input signal, with spatial properties as described by the received spatial cues.
The spatial cues typically comprise Inter-Channel Level Difference (ICLD), Inter-Channel Time Difference (ICTD) and Inter-Channel Coherence/Correlation (ICC). ICLD and ICTD typically describe the signal(s) from the actual audio source(s), whereas the ICC is typically directed to enhancing the spatial sensation by introducing the diffuse component of the audio image, such as reverberations, ambience, etc. Spatial cues are typically provided for each frequency band separately. Furthermore, the spatial cues can be computed or provided between an arbitrary channel pair, e.g. between a chosen reference channel and each “sub-channel”.
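As a rough illustration of the three cues, the sketch below computes them for one pair of time-domain frames. It rests on several assumptions: the function name `spatial_cues` is hypothetical, the frames are taken to be already restricted to one frequency band (the filterbank is omitted), ICLD is taken as the energy ratio in dB, ICTD as the lag that maximizes the normalized cross-correlation, and ICC as the value of that maximum. These are common textbook-style definitions, not necessarily the ones used by a given codec.

```python
import math

def spatial_cues(ref, sub, max_lag):
    """Return (icld_db, ictd_samples, icc) for two equal-length frames.

    A positive ICTD means `sub` is delayed relative to `ref`.
    """
    n = len(ref)
    eps = 1e-12  # guards against division by zero for silent frames
    e_ref = sum(x * x for x in ref) + eps
    e_sub = sum(x * x for x in sub) + eps
    icld_db = 10.0 * math.log10(e_ref / e_sub)

    def xcorr(lag):
        # Normalized cross-correlation of ref[i] with sub[i + lag].
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += ref[i] * sub[j]
        return acc / math.sqrt(e_ref * e_sub)

    best_lag = max(range(-max_lag, max_lag + 1), key=xcorr)
    return icld_db, best_lag, xcorr(best_lag)

# Reference channel, and a sub-channel that is a half-amplitude copy
# delayed by two samples.
ref = [0.0, 0.0, 1.0, 0.5, -0.3, 0.0, 0.0, 0.0, 0.0, 0.0]
sub = [0.0, 0.0, 0.0, 0.0, 0.5, 0.25, -0.15, 0.0, 0.0, 0.0]
icld_db, ictd, icc = spatial_cues(ref, sub, max_lag=4)
```

Here the sub-channel is a scaled, delayed copy of the reference, so the coherence is close to 1. A diffuse component such as reverberation would lower it.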
Binaural signals are a special case of stereo signals that represent a three-dimensional audio image. Such signals model the time difference between the channels and the “head shadow effect”, which may be accomplished, e.g., via reduction of volume in certain frequency bands. Binaural audio signals can be created either by using a dummy head or other similar arrangement for recording the audio signal, or from pre-recorded audio signals by applying special filtering that implements a head-related transfer function (HRTF), which aims to model the “head shadow effect” and provide suitably modified signals to both ears.
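The filtering approach can be sketched as convolving a pre-recorded mono signal with one impulse response per ear. Everything specific in this sketch is an assumption made for illustration: the helper names `convolve` and `render_binaural` are hypothetical, and the two impulse responses are toy values standing in for measured head-related impulse responses. The far-ear response is simply delayed, attenuated, and slightly smoothed to mimic the time difference and the head shadow effect.

```python
def convolve(signal, impulse_response):
    """Direct-form FIR convolution of two sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[i + k] += s * h
    return out

def render_binaural(mono, hrir_left, hrir_right):
    """Filter one mono signal with a separate impulse response for each ear."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)

# Toy responses for a source on the listener's left (illustrative values,
# not measured data).
HRIR_NEAR = [1.0]                      # near ear: direct, unattenuated
HRIR_FAR = [0.0, 0.0, 0.0, 0.3, 0.2]   # far ear: 3-sample delay, quieter, smoothed

ear_left, ear_right = render_binaural([1.0, 0.0, 0.0], HRIR_NEAR, HRIR_FAR)
```

A practical renderer would use measured responses that depend on the direction of the source, and would typically apply the filtering in the frequency domain for efficiency.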
Since the correct representation of the time and amplitude differences between the channels of the encoded audio signal is an important factor in the resulting perceived audio quality, in multi-channel audio coding in general and in binaural coding in particular, it may be desirable to introduce a mechanism that pays special attention to these aspects.