HFR technologies, such as Spectral Band Replication (SBR), significantly improve the coding efficiency of traditional perceptual audio codecs. In combination with MPEG-4 Advanced Audio Coding (AAC), SBR forms a very efficient audio codec, which is already in use within the XM Satellite Radio system and Digital Radio Mondiale, and is also standardized within 3GPP, DVD Forum and others. The combination of AAC and SBR is called aacPlus. It is part of the MPEG-4 standard, where it is referred to as the High Efficiency AAC Profile (HE-AAC). In general, HFR technology can be combined with any perceptual audio codec in a backward and forward compatible way, thus offering the possibility to upgrade already established broadcasting systems such as the MPEG Layer-2 used in the Eureka DAB system. HFR transposition methods can also be combined with speech codecs to allow wide band speech at ultra low bit rates.
The basic idea behind HFR is the observation that there is usually a strong correlation between the characteristics of the high frequency range of a signal and the characteristics of the low frequency range of the same signal. Thus, a good approximation of the original high frequency range of a signal can be achieved by transposing the signal from the low frequency range to the high frequency range.
This concept of transposition was established in WO 98/57436, which is incorporated by reference, as a method to recreate a high frequency band from a lower frequency band of an audio signal. A substantial saving in bit-rate can be obtained by using this concept in audio coding and/or speech coding. In the following, reference will be made to audio coding, but it should be noted that the described methods and systems are equally applicable to speech coding and to unified speech and audio coding (USAC).
In an HFR-based audio coding system, a low bandwidth signal is presented to a core waveform coder for encoding, and higher frequencies are regenerated at the decoder side using transposition of the low bandwidth signal and additional side information, which is typically encoded at very low bit-rates and which describes the target spectral shape. For low bit-rates, where the bandwidth of the core coded signal is narrow, it becomes increasingly important to reproduce or synthesize a high band, i.e. the high frequency range of the audio signal, with perceptually pleasant characteristics.
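The decoder-side regeneration described above can be illustrated with a minimal sketch. It assumes the simplest possible transposition, a plain copy of the low band coefficients into each high band patch, and it assumes that the side information consists of one target energy per patch. The function name `regenerate_high_band` and the parameter `target_energies` are hypothetical and are not taken from any standard or from the cited reference.

```python
import math


def regenerate_high_band(low_band, target_energies):
    """Illustrative copy-up regeneration (hypothetical helper).

    low_band: spectral coefficients decoded by the core coder.
    target_energies: assumed side information, one energy per high band patch.
    Returns the low band followed by the regenerated high band patches.
    """
    source_energy = sum(c * c for c in low_band)
    high_band = []
    for target in target_energies:
        # Scale the copied patch so that its energy matches the target
        # spectral shape described by the side information.
        gain = math.sqrt(target / source_energy) if source_energy > 0 else 0.0
        high_band.extend(gain * c for c in low_band)
    return list(low_band) + high_band
```

In this sketch only the target energies need to be transmitted for the high band, which illustrates why the side information can be encoded at a very low bit-rate compared with waveform coding of the same frequency range.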
The prior art includes several methods for high frequency reconstruction using, e.g., harmonic transposition or time-stretching. One method is based on phase vocoders, which operate on the principle of performing a frequency analysis with a sufficiently high frequency resolution. A signal modification is performed in the frequency domain prior to re-synthesising the signal. The signal modification may be a time-stretch or a transposition operation.
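The analysis, frequency domain modification and re-synthesis steps can be sketched for a single frame as follows. This is a simplified illustration, not a complete phase vocoder: it assumes one unwindowed frame, an integer transposition factor and no overlap-add between frames. The names `dft` and `transpose_frame` are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def transpose_frame(frame, factor):
    """Harmonic transposition of one frame by an integer factor (sketch).

    Each analysis bin k is moved to synthesis bin factor * k, its magnitude
    is kept and its phase is multiplied by the transposition factor.
    """
    n = len(frame)
    half = n // 2
    spectrum = dft(frame)
    out = [0j] * n
    for k in range(1, half):
        target = factor * k
        if target >= half:
            break  # transposed bin would exceed the Nyquist frequency
        magnitude = abs(spectrum[k])
        phase = cmath.phase(spectrum[k])
        out[target] = cmath.rect(magnitude, factor * phase)
        out[n - target] = out[target].conjugate()  # keep the output real
    # Inverse DFT; the imaginary part is zero up to rounding errors.
    return [
        sum(out[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]
```

For a sinusoid that falls exactly on analysis bin 4 of a 64 sample frame, a transposition factor of 2 yields a sinusoid on bin 8 with twice the original phase. Signals that do not fall on bin centres require the windowing and frame overlap that this sketch omits.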
One of the underlying problems with these methods is the conflict between two opposing constraints: the high frequency resolution needed to obtain a high quality transposition for stationary sounds, and the time response of the system for transient or percussive sounds. In other words, while the use of a high frequency resolution is beneficial for the transposition of stationary signals, such high frequency resolution typically requires large window sizes, which are detrimental when dealing with transient portions of a signal. One approach to deal with this problem may be to adaptively change the windows of the transposer, e.g. by using window-switching, as a function of input signal characteristics. Typically, long windows will be used for stationary portions of a signal, in order to achieve high frequency resolution, while short windows will be used for transient portions of the signal, in order to implement a good transient response, i.e. a good temporal resolution, of the transposer. However, this approach has the drawback that signal analysis measures, such as transient detection or the like, have to be incorporated into the transposition system. Such signal analysis measures often involve a decision step, e.g. a decision on the presence of a transient, which triggers a switching of signal processing. Furthermore, such measures typically affect the reliability of the system, and they may introduce signal artifacts when switching the signal processing, e.g. when switching between window sizes.
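The opposing constraints can be made concrete with a short calculation. The sketch below assumes a transform whose size equals the window length, so that the spacing between frequency bins is the sample rate divided by the window length. The function name `window_resolution` and the example values are illustrative assumptions.

```python
def window_resolution(sample_rate_hz, window_length):
    """Return (frequency bin spacing in Hz, window duration in ms).

    Assumes the transform size equals the window length in samples.
    """
    bin_spacing_hz = sample_rate_hz / window_length
    duration_ms = 1000.0 * window_length / sample_rate_hz
    return bin_spacing_hz, duration_ms


# A long window resolves closely spaced partials but smears transients in time.
long_window = window_resolution(48000, 4096)

# A short window localises transients but has coarse frequency resolution.
short_window = window_resolution(48000, 256)
```

At an assumed sample rate of 48 kHz, a 4096 sample window gives a bin spacing of about 11.7 Hz but spans about 85.3 ms, whereas a 256 sample window spans only about 5.3 ms at a bin spacing of 187.5 Hz. The product of the two quantities is constant, so one cannot be improved without degrading the other.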
The present invention solves the aforementioned problems regarding the transient performance of harmonic transposition without the need for window switching. Furthermore, improved harmonic transposition is achieved at a low additional complexity.