The present invention relates to an apparatus and a method for generating bandwidth extension (BWE) output data, an audio encoder and an audio decoder.
Natural audio coding and speech coding are two major classes of codecs for audio signals. Natural audio coding is commonly used for music or arbitrary signals at medium bit rates and generally offers wide audio bandwidths. Speech coders are essentially limited to speech reproduction and may be used at very low bit rates. Wide band speech offers a major subjective quality improvement over narrow band speech. Further, due to the tremendous growth of the multimedia field, the transmission and storage of music and other non-speech signals at high quality, for example for radio/TV or over telephone systems, is a desirable feature.
To drastically reduce the bit rate, source coding can be performed using split-band perceptual audio codecs. These natural audio codecs exploit perceptual irrelevance and statistical redundancy in the signal. If exploiting these two properties alone is not sufficient to meet the given bit rate constraints, the sample rate is reduced. It is also common to decrease the number of quantization levels, allowing occasional audible quantization distortion, and to degrade the stereo field through joint stereo coding or parametric coding of two or more channels. Excessive use of such methods results in annoying perceptual degradation. In order to improve the coding performance, bandwidth extension methods such as spectral band replication (SBR) are used as an efficient way to generate high frequency signals in an HFR (high frequency reconstruction) based codec.
In recording and transmitting acoustic signals, a noise floor such as background noise is always present. In order to generate an authentic acoustic signal on the decoder side, the noise floor should either be transmitted or be generated. In the latter case, the noise floor in the original audio signal should be determined. In spectral band replication, this is performed by SBR tools or SBR related modules, which generate parameters that characterize (among other things) the noise floor and that are transmitted to the decoder to reconstruct the noise floor.
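As a purely illustrative sketch of how an encoder-side measure might characterize the noise content of a high band, the following Python example computes the spectral flatness of a band of a single frame. Spectral flatness is used here only as a stand-in assumption; it is not stated to be the measure used by the SBR tools. The function names `power_spectrum` and `band_flatness`, the bin indices and the `eps` regularizer are hypothetical.

```python
import cmath
import math


def power_spectrum(frame):
    """Power of each DFT bin from 0 up to and including the Nyquist bin (naive DFT)."""
    n = len(frame)
    powers = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        powers.append(abs(acc) ** 2)
    return powers


def band_flatness(frame, lo_bin, hi_bin, eps=1e-12):
    """Spectral flatness (geometric mean / arithmetic mean) of bins lo_bin..hi_bin-1.

    Values near 1 suggest a noise-like band, values near 0 a tonal band.
    Assumption: this ratio is only a stand-in for a noise floor parameter.
    """
    powers = power_spectrum(frame)[lo_bin:hi_bin]
    log_mean = sum(math.log(p + eps) for p in powers) / len(powers)
    geometric_mean = math.exp(log_mean)
    arithmetic_mean = sum(powers) / len(powers) + eps
    return geometric_mean / arithmetic_mean
```

In such a sketch, a band dominated by a single sinusoid would yield a flatness close to zero, whereas a band containing mostly background noise would yield a markedly higher value, which could then be quantized and transmitted as a side parameter.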
In WO 00/45379, an adaptive noise floor tool is described, which provides sufficient noise content in the synthesized high band frequency components. However, disturbing artifacts are generated in the high band frequency components if short-time energy fluctuations, so-called transients, occur in the base band. These artifacts are perceptually not acceptable, and known technology does not provide an acceptable solution (especially if the bandwidth is limited).
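To make the notion of a short-time energy fluctuation concrete, the following Python sketch flags frames whose energy rises sharply relative to the preceding frame. It is a minimal illustration under stated assumptions, not the detection method of WO 00/45379 or of any particular codec; the names `frame_energies` and `find_transients`, the frame length of 64 samples, the energy ratio of 4.0 and the small `floor` constant are all hypothetical.

```python
def frame_energies(samples, frame_len):
    """Sum of squared samples for each complete, non-overlapping frame."""
    return [
        sum(x * x for x in samples[start:start + frame_len])
        for start in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def find_transients(samples, frame_len=64, ratio=4.0, floor=1e-9):
    """Return indices of frames whose energy exceeds `ratio` times the previous frame's.

    Assumption: a simple frame-to-frame energy ratio stands in for a real
    transient detector; `floor` avoids flagging frames that follow pure silence
    merely because of rounding noise.
    """
    energies = frame_energies(samples, frame_len)
    flagged = []
    for idx in range(1, len(energies)):
        if energies[idx] > ratio * (energies[idx - 1] + floor):
            flagged.append(idx)
    return flagged
```

For a quiet signal interrupted by a single loud burst, only the frame containing the onset of the burst would be flagged; the decay back to the quiet level is not treated as a transient in this sketch.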