The present invention relates to postprocessing a decoded multi-channel audio signal and to postprocessing a decoded stereo audio signal, the postprocessing of the decoded stereo audio signal representing a specific case of postprocessing a decoded multi-channel audio signal.
In a conventional speech codec, classification of the speech signals is often performed to improve the coding efficiency of the speech signals. At the decoder side, different types of signal processing tools are used depending on the transmitted classification of the speech signals.
One classification distinguishes between normal speech signals and transient speech signals. Transient signals are short-duration signals characterized by a fast change in signal power and amplitude. They are distinguished from “normal” or non-transient signals, e.g. signals with a longer duration and/or only minor changes in signal power and amplitude. This kind of classification is not limited to speech signals but is applicable to audio signals in general.
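By way of illustration only, the distinction between transient and non-transient signals can be sketched as a comparison of the power of adjacent frames. The sketch below is an assumption made for explanatory purposes and is not taken from any particular codec: the function names `frame_energies` and `is_transient`, the frame length and the ratio threshold are all hypothetical, and only a sharp rise in power (an onset) is checked.

```python
def frame_energies(signal, frame_len):
    """Return the mean power of each full frame of `signal`."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def is_transient(signal, frame_len=4, ratio_threshold=8.0, floor=1e-12):
    """Flag the signal as transient if the power rises sharply between
    two adjacent frames. The threshold value is an illustrative assumption."""
    energies = frame_energies(signal, frame_len)
    for prev, cur in zip(energies, energies[1:]):
        # `floor` avoids a division by zero for silent frames.
        if cur / (prev + floor) > ratio_threshold:
            return True
    return False


# A steady tone-like signal and a signal with a sudden onset.
steady = [0.5, -0.5] * 8
attack = [0.01, -0.01] * 4 + [0.9, -0.9] * 4

print(is_transient(steady))
print(is_transient(attack))
```

A practical classifier would typically also consider falling power, longer observation windows and frequency-dependent measures; the sketch only shows the basic idea of a fast change in signal power.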
For transient signals, a common method is to extract the time envelope of the input signal in the encoder, transmit it, and apply it in the decoder as a postprocessing step.
For stereo signals, this kind of postprocessing is often necessary, but there are conventionally not enough bits to encode the time envelope of both channels.
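The extraction and application of a time envelope can be sketched as follows for a single channel. This is a minimal illustration under stated assumptions rather than the method of any specific codec: the envelope is assumed to be the RMS value per subframe, it is assumed to reach the decoder unquantized, and the names `time_envelope` and `apply_envelope` as well as the subframe length are hypothetical.

```python
import math


def time_envelope(signal, sub_len):
    """Encoder side: RMS value of each full subframe of `signal`."""
    env = []
    for start in range(0, len(signal) - sub_len + 1, sub_len):
        sub = signal[start:start + sub_len]
        env.append(math.sqrt(sum(x * x for x in sub) / sub_len))
    return env


def apply_envelope(decoded, envelope, sub_len, floor=1e-9):
    """Decoder side: rescale each subframe of `decoded` so that its RMS
    matches the transmitted envelope value for that subframe."""
    out = list(decoded)
    for i, target in enumerate(envelope):
        start = i * sub_len
        sub = decoded[start:start + sub_len]
        rms = math.sqrt(sum(x * x for x in sub) / sub_len)
        gain = target / max(rms, floor)  # floor guards against silent subframes
        for j in range(start, start + sub_len):
            out[j] = decoded[j] * gain
    return out


# Original signal: silence followed by a sudden onset.
original = [0.0, 0.0, 0.0, 0.0, 1.0, -1.0, 1.0, -1.0]
# Hypothetical decoder output with energy smeared ahead of the onset (pre-echo).
decoded = [0.2, -0.2, 0.2, -0.2, 0.8, -0.8, 0.8, -0.8]

envelope = time_envelope(original, sub_len=4)        # computed in the encoder
shaped = apply_envelope(decoded, envelope, sub_len=4)  # applied in the decoder
print(shaped)
```

In this toy example the subframe preceding the onset is scaled back to silence, which is the effect sought when the envelope is applied as a postprocessing step. The envelope values themselves must be transmitted, which is why the available bit budget matters.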
Referring to reference [1], low-bit-rate stereo coding is based on the extraction and quantization of a parametric representation of the stereo image. The parameters are then transmitted as side information together with a mono downmix signal encoded by a core coder. At the decoder, the stereo signal can be reconstructed based on the mono downmix signal and the side information, i.e. the stereo parameters containing the spatial (left and right) information of the stereo signal.
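The principle of transmitting a mono downmix together with stereo parameters can be illustrated by the following toy sketch. It is not the scheme of reference [1]: it assumes a single full-band level-difference parameter per frame, unquantized transmission, and no phase or coherence parameters. The function names `encode_parametric_stereo` and `decode_parametric_stereo` are hypothetical.

```python
import math


def encode_parametric_stereo(left, right, floor=1e-12):
    """Return (mono downmix, inter-channel level difference in dB)
    for one frame. The level difference is the side information."""
    mono = [(l + r) / 2.0 for l, r in zip(left, right)]
    e_left = sum(x * x for x in left) + floor
    e_right = sum(x * x for x in right) + floor
    ild_db = 10.0 * math.log10(e_left / e_right)
    return mono, ild_db


def decode_parametric_stereo(mono, ild_db):
    """Rebuild left and right channels from the mono downmix and the
    transmitted level difference."""
    ratio = 10.0 ** (ild_db / 20.0)       # amplitude ratio left/right
    g_left = 2.0 * ratio / (1.0 + ratio)  # gains chosen so that the mean of
    g_right = 2.0 / (1.0 + ratio)         # both outputs equals the downmix
    return [g_left * m for m in mono], [g_right * m for m in mono]


# Toy frame: the right channel is the left channel at half amplitude.
source = [1.0, -0.5, 0.25, 0.0]
left = source
right = [0.5 * x for x in source]

mono, ild_db = encode_parametric_stereo(left, right)
left_hat, right_hat = decode_parametric_stereo(mono, ild_db)
print(round(ild_db, 2), left_hat, right_hat)
```

With a single level parameter, the reconstruction is only accurate when both channels are in-phase scaled copies of one another, as in this example. Both reconstructed channels inherit the temporal shape of the mono downmix, which illustrates why any artefact in the downmix appears in both output channels.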
For a stereo codec, if the mono downmix signal is classified as transient, there may be pre-echo artefacts in the reconstructed stereo signal. Postprocessing may be done to improve the quality of this type of signal, in which either both channels or only one channel is transient. For a parametric stereo codec, however, there are conventionally not enough bits to encode the time envelope of both channels.
According to references [2] and [3], the input mono signal is classified into transient and normal categories in the encoder. Then, at the decoder side, a time scaling synthesis algorithm is applied, based on the transmitted classification information, to improve the quality. All of these algorithms are applied to the mono downmix signal.
The limitation of the bandwidth available for transmitting signals is not only encountered in the transmission of stereo speech or audio signals but is a general problem of multi-channel audio signal transmission, stereo audio coding being a specific case of multi-channel audio coding.