In PCT WO 98/57436 the concept of transposition was established as a method to recreate a high frequency band from a lower frequency band of an audio signal. A substantial saving in bitrate can be obtained by using this concept in audio coding. In an HFR based audio coding system, a low bandwidth signal is processed by a core waveform coder and the higher frequencies are regenerated using transposition and additional side information of very low bitrate describing the target spectral shape at the decoder side. For low bitrates, where the bandwidth of the core coded signal is narrow, it becomes increasingly important to recreate a high band with perceptually pleasant characteristics. The harmonic transposition defined in PCT WO 98/57436 performs very well for complex musical material in a situation with low crossover frequency. The principle of a harmonic transposition is that a sinusoid with frequency ω is mapped to a sinusoid with frequency T ω where T>1 is an integer defining the order of transposition. In contrast to this, a single sideband modulation (SSB) based HFR method maps a sinusoid with frequency ω to a sinusoid with frequency ω+Δω where Δω is a fixed frequency shift. Given a core signal with low bandwidth, a dissonant ringing artifact can result from SSB transposition.
In order to reach the best possible audio quality, state of the art high quality harmonic HFR methods employ complex modulated filter banks, e.g. a Short Time Fourier Transform (STFT), with high frequency resolution and a high degree of oversampling to reach the needed audio quality. The fine resolution is needed to avoid unwanted intermodulation distortion arising from nonlinear processing of sums of sinusoids. With sufficiently high frequency resolution, i.e. narrow subbands, the high quality methods aim at having a maximum of one sinusoid in each subband. A high degree of oversampling in time is needed to avoid alias type of distortion, and a certain degree of oversampling in frequency is needed to avoid pre-echoes for transient signals. The obvious drawback is that the computational complexity can become high.
Subband block based harmonic transposition is another HFR method used to suppress intermodulation products, in which case a filter bank with coarser frequency resolution and a lower degree of oversampling is employed, e.g. a multichannel QMF bank. In this method, a time block of complex subband samples is processed by a common phase modifier while the superposition of several modified samples forms an output subband sample. This has the net effect of suppressing intermodulation products which would otherwise occur when the input subband signal consists of several sinusoids. Transposition based on block based subband processing has much lower computational complexity than the high quality transposers and reaches almost the same quality for many signals. However, the complexity is still much higher than for the trivial SSB based HFR methods, since a plurality of analysis filter banks, each processing signals of different transposition orders T, are needed in a typical HFR application in order to synthesize the needed bandwidth. Additionally, a common approach is to adapt the sampling rate of the input signals to fit analysis filter banks of a constant size, albeit the filter banks process signals of different transposition orders. Also common is to apply bandpass filters to the input signals in order to obtain output signals, processed from different transposition orders, with non-overlapping power spectral densities.
Storage or transmission of audio signals is often subject to strict bitrate constraints. In the past, coders were forced to drastically reduce the transmitted audio bandwidth when only a very low bitrate was available. Modern audio codecs are nowadays able to code wideband signals by using bandwidth extension (BWE) methods [1-12]. These algorithms rely on a parametric representation of the high-frequency content (HF) which is generated from the low-frequency part (LF) of the decoded signal by means of transposition into the HF spectral region (“patching”) and application of a parameter driven post processing. The LF part is coded with any audio or speech coder. For example, the bandwidth extension methods described in [1-4] rely on single sideband modulation (SSB), often also termed the “copy-up” method, for generating the multiple HF patches.
Lately, a new algorithm, which employs a bank of phase vocoders [15-17] for the generation of the different patches, has been presented [13] (see FIG. 20). This method has been developed to avoid the auditory roughness which is often observed in signals subjected to SSB bandwidth extension. However, since the BWE algorithm is performed on the decoder side of a codec chain, computational complexity is a serious issue. State-of-the-art methods, especially the phase vocoder based HBE, comes at the prize of a largely increased computational complexity compared to SSB based methods.
As outlined above, existing bandwidth extension schemes apply only one patching method on a given signal block at a time, be it SSB based patching [1-4] or HBE vocoder based patching [15-17]. Additionally, modern audio coders [19-20] offer the possibility of switching the patching method globally on a time block basis between alternative patching schemes.
SSB copy-up patching introduces unwanted roughness into the audio signal, but is computationally simple and preserves the time envelope of transients. Moreover, the computational complexity is significantly increased over the computational very simple SSB copy-up method.