1. Field of the Invention
The present invention relates to the encoding of audio signals and the subsequent synthesis of auditory scenes from the encoded audio data.
2. Description of the Related Art
Multi-channel surround audio systems have been standard in movie theaters for years. As technology has advanced, it has become affordable to produce multi-channel surround systems for home use. Today, such systems are mostly sold as “home theater systems.” Conforming to an ITU-R recommendation, the vast majority of these systems provide five regular audio channels and one low-frequency sub-woofer channel (denoted the low-frequency effects or LFE channel). Such a multi-channel system is denoted a 5.1 surround system. There are other surround systems, such as 7.1 (seven regular channels and one LFE channel) and 10.2 (ten regular channels and two LFE channels).
C. Faller and F. Baumgarte, “Efficient representation of spatial audio coding using perceptual parameterization,” IEEE Workshop on Appl. of Sig. Proc. to Audio and Acoust., October 2001, and C. Faller and F. Baumgarte, “Binaural Cue Coding Applied to Stereo and Multi-Channel Audio Compression,” Preprint 112th Conv. Aud. Eng. Soc., May 2002, (collectively, “the BCC papers”) the teachings of both of which are incorporated herein by reference, describe a parametric multi-channel audio coding technique (referred to as BCC coding).
FIG. 1 shows a block diagram of an audio processing system 100 that performs binaural cue coding (BCC) according to the BCC papers. BCC system 100 has a BCC encoder 102 that receives C audio input channels 108, for example, one from each of C different microphones 106. BCC encoder 102 has a downmixer 110, which converts the C audio input channels into a mono audio sum signal 112.
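The downmixing step can be illustrated with a minimal sketch, assuming the sum signal is a plain sample-by-sample sum of equal-length time-domain channels. The function name `downmix_to_mono` and the optional `gain` parameter are hypothetical and are not taken from the BCC papers, which may normalize or weight the sum differently.

```python
def downmix_to_mono(channels, gain=1.0):
    """Sum C equal-length input channels sample by sample into one mono signal.

    `channels` is a list of C lists of samples. `gain` is a hypothetical
    scaling factor (an assumption of this sketch) that could be used to
    keep the sum from clipping.
    """
    if not channels:
        raise ValueError("need at least one input channel")
    length = len(channels[0])
    if any(len(ch) != length for ch in channels):
        raise ValueError("all channels must have the same length")
    # zip(*channels) yields, for each time index, the C samples to be summed.
    return [gain * sum(samples) for samples in zip(*channels)]


if __name__ == "__main__":
    left = [0.1, 0.2, 0.3]
    right = [0.3, 0.2, 0.1]
    print(downmix_to_mono([left, right]))
```

A scaled sum (for example, `gain=1.0 / C`) is one possible way to limit the peak level of the mono signal; the source does not specify which convention downmixer 110 uses.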
In addition, BCC encoder 102 has a BCC analyzer 114, which generates BCC cue code data stream 116 for the C input channels. The BCC cue codes (also referred to as auditory scene parameters) include inter-channel level difference (ICLD) and inter-channel time difference (ICTD) data for each input channel. BCC analyzer 114 performs band-based processing to generate ICLD and ICTD data for each of one or more different frequency sub-bands (e.g., different critical bands) of the audio input channels.
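As a rough illustration of the two cue types, the following sketch estimates a level difference and a time difference between a reference channel and one other channel. It is a simplification under stated assumptions: the estimates are full-band rather than per sub-band, the ICLD is taken as a power ratio in decibels, and the ICTD is taken as the lag that maximizes the cross-correlation. The function names `estimate_icld_db` and `estimate_ictd_samples` are hypothetical, and BCC analyzer 114 may compute its cues differently.

```python
import math


def estimate_icld_db(ref, other, eps=1e-12):
    """Level difference of `other` relative to `ref`, in dB (power ratio).

    `eps` is a small constant (an assumption of this sketch) that avoids
    taking the logarithm of zero for silent channels.
    """
    power_ref = sum(x * x for x in ref)
    power_other = sum(x * x for x in other)
    return 10.0 * math.log10((power_other + eps) / (power_ref + eps))


def estimate_ictd_samples(ref, other, max_lag):
    """Lag (in samples) at which `other` best matches `ref`.

    A positive result means `other` is delayed relative to `ref`.
    The lag is found by brute-force cross-correlation over
    -max_lag..+max_lag.
    """
    best_lag, best_value = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i, sample in enumerate(ref):
            j = i + lag
            if 0 <= j < len(other):
                acc += sample * other[j]
        if acc > best_value:
            best_value, best_lag = acc, lag
    return best_lag


if __name__ == "__main__":
    ref = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
    other = [0.0, 0.0, 0.0, 0.5, 0.0, 0.0]  # half amplitude, 2 samples late
    print(estimate_icld_db(ref, other), estimate_ictd_samples(ref, other, 3))
```

A band-based analyzer would apply the same two estimates separately within each frequency sub-band, for example after a filterbank or a short-time transform.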
BCC encoder 102 transmits sum signal 112 and the BCC cue code data stream 116 (e.g., as either in-band or out-of-band side information with respect to the sum signal) to a BCC decoder 104 of BCC system 100. BCC decoder 104 has a side-information processor 118, which processes data stream 116 to recover the BCC cue codes 120 (e.g., ICLD and ICTD data). BCC decoder 104 also has a BCC synthesizer 122, which uses the recovered BCC cue codes 120 to synthesize C audio output channels 124 from sum signal 112 for rendering by C loudspeakers 126, respectively.
Audio processing system 100 can be implemented in the context of multi-channel audio signals, such as 5.1 surround sound. In particular, downmixer 110 of BCC encoder 102 would convert the six input channels of conventional 5.1 surround sound (i.e., five regular channels + one LFE channel) into sum signal 112. In addition, BCC analyzer 114 of encoder 102 would transform the six input channels into the frequency domain to generate the corresponding BCC cue code data stream 116. Analogously, side-information processor 118 of BCC decoder 104 would recover the BCC cue codes 120 from the received side-information data stream 116, and BCC synthesizer 122 of decoder 104 would (1) transform the received sum signal 112 into the frequency domain, (2) apply the recovered BCC cue codes 120 to the sum signal in the frequency domain to generate six frequency-domain signals, and (3) transform those frequency-domain signals into six time-domain channels of synthesized 5.1 surround sound (i.e., five synthesized regular channels + one synthesized LFE channel) for rendering by loudspeakers 126.
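The three synthesis steps can be sketched as follows for a single frame. This is a minimal illustration under stated assumptions, not the synthesizer of the BCC papers: each output channel receives one full-band level cue (in dB) and one full-band time cue (in samples) relative to the sum signal, the level cue is applied as a gain, the time cue is applied as a linear phase shift (which acts as a circular delay within the frame), and a direct DFT stands in for whatever transform an actual implementation would use. The names `synthesize_channels`, `level_diffs_db`, and `time_diffs_samples` are hypothetical.

```python
import cmath
import math


def dft(x):
    """Direct discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft_real(spectrum):
    """Inverse DFT, keeping only the real part of each output sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def synthesize_channels(sum_signal, level_diffs_db, time_diffs_samples):
    """Generate one output channel per (level, time) cue pair."""
    n = len(sum_signal)
    spectrum = dft(sum_signal)                      # step (1)
    outputs = []
    for level_db, delay in zip(level_diffs_db, time_diffs_samples):
        gain = 10.0 ** (level_db / 20.0)            # dB to amplitude gain
        shaped = []
        for k, value in enumerate(spectrum):        # step (2)
            # Map bin index to a signed frequency so the phase ramp
            # treats positive and negative frequencies consistently.
            signed_k = k if k <= n // 2 else k - n
            phase = cmath.exp(-2j * math.pi * signed_k * delay / n)
            shaped.append(gain * value * phase)
        outputs.append(idft_real(shaped))           # step (3)
    return outputs


if __name__ == "__main__":
    frame = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    for channel in synthesize_channels(frame, [0.0, -6.0206], [0, 2]):
        print([round(v, 3) for v in channel])
```

For 5.1 surround sound, the two cue lists would each hold six entries, one per synthesized channel. A band-based synthesizer would instead apply a different gain and delay to each group of frequency bins.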