Audio coding systems are known from the state of the art. They are used in particular for transmitting or storing audio signals.
FIG. 1 shows the basic structure of an audio coding system, which is employed for transmission of audio signals. The audio coding system comprises an encoder 10 at a transmitting side and a decoder 11 at a receiving side. An audio signal that is to be transmitted is provided to the encoder 10. The encoder is responsible for adapting the incoming audio data rate to a bitrate level at which the bandwidth conditions in the transmission channel are not violated. Ideally, the encoder 10 discards only irrelevant information from the audio signal in this encoding process. The encoded audio signal is then transmitted by the transmitting side of the audio coding system and received at the receiving side of the audio coding system. The decoder 11 at the receiving side reverses the encoding process to obtain a decoded audio signal with little or no audible degradation.
Alternatively, the audio coding system of FIG. 1 could be employed for archiving audio data. In that case, the encoded audio data provided by the encoder 10 is stored in some storage unit, and the decoder 11 decodes audio data retrieved from this storage unit. In this alternative, it is the target that the encoder achieves a bitrate which is as low as possible, in order to save storage space.
The original audio signal which is to be processed can be a mono audio signal or a multichannel audio signal containing at least a first and a second channel signal. An example of a multichannel audio signal is a stereo audio signal, which is composed of a left channel signal and a right channel signal.
Depending on the allowed bitrate, different encoding schemes can be applied to a stereo audio signal. The left and right channel signals can be encoded for instance independently from each other. But typically, a correlation exists between the left and the right channel signals, and the most advanced coding schemes exploit this correlation to achieve a further reduction in the bitrate.
Particularly suited for reducing the bitrate are low bitrate stereo extension methods. In a stereo extension method, the stereo audio signal is encoded as a high bitrate mono signal, which is provided by the encoder together with some side information reserved for a stereo extension. In the decoder, the stereo audio signal is then reconstructed from the high bitrate mono signal in a stereo extension making use of the side information. The side information typically takes only a few kbps of the total bitrate.
If a stereo extension scheme aims at operating at low bitrates, an exact replica of the original stereo audio signal cannot be obtained in the decoding process. For the thus required approximation of the original stereo audio signal, an efficient coding model is necessary.
The most commonly used stereo audio coding schemes are Mid Side (MS) stereo and Intensity Stereo (IS).
In MS stereo, the left and right channel signals are transformed into sum and difference signals, as described for example by J. D. Johnston and A. J. Ferreira in “Sum-difference stereo transform coding”, ICASSP-92 Conference Record, 1992, pp. 569-572. For a maximum coding efficiency, this transformation is done in both a frequency and a time dependent manner. MS stereo is especially useful for high quality, high bitrate stereophonic coding.
In the attempt to achieve lower bitrates, IS has been used in combination with this MS coding, where IS constitutes a stereo extension scheme. In IS coding, a portion of the spectrum is coded only in mono mode, and the stereo audio signal is reconstructed by providing in addition different scaling factors for the left and right channels, as described for instance in documents U.S. Pat. No. 5,539,829 and U.S. Pat. No. 5,606,618.
Two further, very low bitrate stereo extension schemes have been proposed with Binaural Cue Coding (BCC) and Bandwidth Extension (BWE). In BCC, described by F. Baumgarte and C. Faller in “Why Binaural Cue Coding is Better than Intensity Stereo Coding, AES112th Convention, May 10-13, 2002, Preprint 5575, the whole spectrum is coded with IS. In BWE coding, described in ISO/IEC JTC1/SC29/WG11 (MPEG-4), “Text of ISO/IEC 14496-3:2001/FPDAM 1, Bandwidth Extension”, N5203 (output document from MPEG 62nd meeting), October 2002, a bandwidth extension is used to extend the mono signal to a stereo signal.
Moreover, document U.S. Pat. No. 6,016,473 proposes a low bit-rate spatial coding system for coding a plurality of audio streams representing a soundfield. On the encoder side, the audio streams are divided into a plurality of subband signals, representing a respective frequency subband. Then, a composite signal representing the combination of these subband signals is generated. In addition, a steering control signal is generated, which indicates the principal direction of the soundfield in the subbands, e.g. in the form of weighted vectors. On the decoder side, an audio stream in up to two channels is generated based on the composite signal and the associated steering control signal.