1. Field
The present invention generally relates to digital signal processing and, more specifically, to a method and system for providing stereo-channel based multi-channel audio coding.
2. Background
Multi-channel audio transmission techniques are increasingly used in modern multi-media and communication systems. However, delivering multi-channel audio content efficiently in mobile multi-media systems, such as handheld devices, remains difficult. This is because multi-channel coding systems require a much higher bit rate and are more complex than stereo-channel or mono-channel systems. To handle this problem, a spatial audio coding method has recently been proposed by ISO/MPEG. This coding method can deliver a low-bit-rate representation of multi-channel signals by transmitting a downmix signal along with some compact surround information, such as binaural cues and spatial information, which describes the most salient properties of the multi-channel signals. Furthermore, the spatial audio coding method produces signals that are backward compatible with existing transmission systems.
FIG. 1 is a simplified schematic diagram illustrating a spatial surround coding system 10 recently developed by ISO/MPEG. The surround coding system 10 includes an encoder side 12 and a decoder side 14. The encoder side 12 further includes a downmix operation unit 16, a stereo-channel encoder 18 and a side information processing unit 20. The decoder side 14 further includes a stereo-channel decoder 22 and a surround synthesis processing unit 24.
The downmix operation unit 16 performs a linear mapping from the N-channel signals to stereo-channel signals using a 2×N coefficient matrix. After this mapping, the stereo-channel signals can be coded by the stereo-channel encoder 18, such as an AAC encoder or an MP3 encoder. The stereo-channel encoder 18 then generates data in stereo-compressed (two-channel) format. The side information processing unit 20 extracts and codes side information, including the most important binaural cues and sound spatial information, such as inter-channel level difference (ICLD), inter-channel time difference (ICTD) and inter-channel coherence (ICC) among these N channels. The side information can be represented and transmitted at a rate of only a few kb/s. As a result, the total data transmitted to the decoder side 14 includes the data in stereo-compressed format and the side information.
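By way of illustration only, the 2×N downmix and two of the cues named above can be sketched in Python as follows. The coefficient values, the five-channel ordering and the function names are assumptions made for this example and are not taken from the system of FIG. 1. The ICC is simplified here to a zero-lag normalized correlation over the whole signal, whereas practical coders typically evaluate such cues per frequency band and per frame. The ICTD is not estimated in this sketch.

```python
import math


def downmix(channels, matrix):
    """Map N channel signals to 2 channels with a 2xN coefficient matrix.

    channels: list of N equal-length lists of samples.
    matrix:   two rows of N coefficients each (illustrative values only).
    """
    n_samples = len(channels[0])
    return [
        [sum(coeff * ch[t] for coeff, ch in zip(row, channels))
         for t in range(n_samples)]
        for row in matrix
    ]


def icld_db(a, b, eps=1e-12):
    """Inter-channel level difference: power of a relative to b, in dB."""
    power_a = sum(x * x for x in a)
    power_b = sum(x * x for x in b)
    return 10.0 * math.log10((power_a + eps) / (power_b + eps))


def icc(a, b, eps=1e-12):
    """Simplified inter-channel coherence: zero-lag normalized correlation."""
    cross = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return cross / (norm + eps)


# Hypothetical 5-channel input, assumed order: L, R, C, Ls, Rs.
g = 1.0 / math.sqrt(2.0)  # assumed gain for center and surround channels
matrix = [
    [1.0, 0.0, g, g, 0.0],  # left downmix row
    [0.0, 1.0, g, 0.0, g],  # right downmix row
]
channels = [
    [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [0.0, 0.0],
]
stereo = downmix(channels, matrix)
```

In this sketch, only the two rows of `stereo` would be passed on to the stereo-channel encoder, while values such as `icld_db` and `icc` computed between pairs of the original N channels would form part of the side information.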
On the decoder side 14, the stereo-channel decoder 22 first decodes the stereo-compressed data. The decoded (decompressed) data is forwarded to the surround synthesis processing unit 24. The surround synthesis processing unit 24 then uses signal synthesis (the inverse of the extraction processing performed on the encoder side 12) to combine the side information (such as ICTD, ICLD and ICC) with the decompressed data to derive the N-channel signals for playback.
For headphone playback, or where only two speakers are available on the playback side, two options are available on the decoder side 14 to handle the stereo-channel signals. One option is for the stereo-channel decoder 22 to output the stereo-channel signals, x̂_l(n) and x̂_r(n), directly to the headphone or the two speakers. Such direct output, however, will not produce any significant surround effect, since binaural and spatial information are not included in these stereo-channel signals. The other option, as shown in FIG. 2, is to use a virtual surround mapping unit 26 to map the synthesized N-channel signals to two channels, ŝ_l(n) and ŝ_r(n). This can deliver a multi-channel surround effect for the headphone or for listeners in the sweet spot of the two speakers. Using the virtual surround mapping unit 26, however, requires additional processing resources on the decoder side 14.
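The manner in which the virtual surround mapping unit 26 performs its mapping is not detailed here. As an assumption made purely for illustration, one common approach filters each of the N synthesized channels with a pair of impulse responses (one per ear) and sums the results, which can be sketched in Python as follows. The function names and the impulse-response values are hypothetical.

```python
def convolve(signal, kernel):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def virtual_surround(channels, impulse_pairs):
    """Map N channels to two output channels.

    channels:      list of N equal-length lists of samples.
    impulse_pairs: list of N (left_ir, right_ir) tuples; all impulse
                   responses are assumed to have the same length.
    """
    length = len(channels[0]) + len(impulse_pairs[0][0]) - 1
    left = [0.0] * length
    right = [0.0] * length
    for ch, (ir_left, ir_right) in zip(channels, impulse_pairs):
        # Each input channel contributes to both output channels.
        for t, v in enumerate(convolve(ch, ir_left)):
            left[t] += v
        for t, v in enumerate(convolve(ch, ir_right)):
            right[t] += v
    return left, right
```

Under this assumption the mapping requires 2N convolutions per block of samples, in addition to the surround synthesis that precedes it, which gives a rough sense of the additional processing involved.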
The surround synthesis processing unit 24 and the virtual surround mapping unit 26 both perform computationally intensive processing. As a result, it is difficult and costly to implement these units 24, 26 in portable devices, which prevents portable devices from delivering a multi-channel surround effect in many mobile multi-media systems.
Hence, it would be desirable to provide a coding system which, amongst other things, allows portable devices with existing stereo-channel decoders to deliver multi-channel contents for headphones without adding any processing resources.