Perceptually adapted encoding of audio signals, which reduces the data rate for efficient storage and transmission, has gained acceptance in many fields. Known encoding algorithms include, in particular, MPEG-1/2 Layer 3 (“MP3”), MPEG-2/4 Advanced Audio Coding (AAC) and MPEG-D Unified Speech and Audio Coding (USAC). The underlying coding techniques reduce the audio quality, in particular at the lowest bit rates. The impairment is often caused mainly by an encoder-side limitation of the audio signal bandwidth to be transmitted.
In such a situation, it is known in the state of the art to band-limit the audio signal on the encoder side and to encode only a lower band of the audio signal by means of a high-quality audio encoder. The upper band, however, is characterized only very coarsely by a set of parameters that convey, e.g., the spectral envelope of the upper band. On the decoder side, the upper band is then synthesized by patching the decoded lower band signal into the otherwise empty upper band and performing subsequent parameter-controlled adjustments.
Standard methods for bandwidth extension of band-limited audio signals copy low-frequency (LF) signal portions into the high-frequency (HF) range in order to approximate information missing due to the band limitation. In principle, such a copying function is technically equivalent to a spectral shift computed in the time domain by means of single sideband (SSB) modulation, but computationally much less complex. Such methods, like Spectral Band Replication (SBR), are described in M. Dietz, L. Liljeryd, K. Kjörling and O. Kunz, “Spectral Band Replication, a novel approach in audio coding,” in 112th AES Convention, Munich, May 2002; S. Meltzer, R. Böhm and F. Henn, “SBR enhanced audio codecs for digital broadcasting such as “Digital Radio Mondiale” (DRM),” 112th AES Convention, Munich, May 2002; T. Ziegler, A. Ehret, P. Ekstrand and M. Lutzky, “Enhancing mp3 with SBR: Features and Capabilities of the new mp3PRO Algorithm,” in 112th AES Convention, Munich, May 2002; International Standard ISO/IEC 14496-3:2001/FPDAM 1, “Bandwidth Extension,” ISO/IEC, 2002, or “Speech bandwidth extension method and apparatus”, Vasu Iyengar et al. U.S. Pat. No. 5,455,888.
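The equivalence between a copy in the frequency domain and a modulation in the time domain can be illustrated with a minimal sketch. It assumes a short, complex-valued analytic signal (positive-frequency content only) and a circular shift on DFT bins. The helper names `dft`, `shift_by_modulation` and `shift_by_copy` are hypothetical and are not taken from any of the cited methods. A real SSB modulator would additionally need a Hilbert transform to obtain the analytic signal, which is omitted here.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform, sufficient for a small demo."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def shift_by_modulation(x, shift_bins):
    """Shift the spectrum upward by multiplying with a complex carrier in time."""
    n = len(x)
    return [x[t] * cmath.exp(2j * math.pi * shift_bins * t / n) for t in range(n)]


def shift_by_copy(spectrum, shift_bins):
    """Shift the spectrum upward by moving DFT bins (circular copy)."""
    n = len(spectrum)
    return [spectrum[(k - shift_bins) % n] for k in range(n)]


# Assumed toy LF signal: two complex exponentials at bins 2 and 3 (analytic).
N = 16
lf = [
    cmath.exp(2j * math.pi * 2 * t / N) + 0.5 * cmath.exp(2j * math.pi * 3 * t / N)
    for t in range(N)
]

via_time = dft(shift_by_modulation(lf, 5))  # modulate in time, then analyze
via_freq = shift_by_copy(dft(lf), 5)        # analyze, then copy bins upward
max_diff = max(abs(a - b) for a, b in zip(via_time, via_freq))
print(max_diff < 1e-9)
```

In this toy case both routes place the two components at bins 7 and 8. The bin copy needs no per-sample multiplication once a spectral or filterbank representation is available, which is consistent with the remark above that the copying function is the cheaper operation.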
In these methods no harmonic transposition is performed; instead, successive bandpass signals of the lower band are introduced into successive filterbank channels of the upper band. This yields a coarse approximation of the upper band of the audio signal. In a further step, this coarse approximation is brought closer to the original by post-processing that uses control information derived from the original signal. Here, e.g., scale factors serve to adapt the spectral envelope, inverse filtering and the addition of a noise floor serve to adapt the tonality, and sinusoidal signal portions can be added as a supplement, as is also described in the MPEG-4 standard.
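The two steps described here, a copy-up of lower-band values followed by an envelope adjustment with scale factors, can be sketched as follows. This is a simplified illustration under stated assumptions, not the SBR algorithm of the MPEG-4 standard: the spectrum is a plain list of real-valued bins rather than filterbank subband signals, the transmitted parameters are assumed to be target energies per band, and inverse filtering, noise-floor addition and sinusoid insertion are left out. The names `copy_up_patch`, `adjust_envelope`, `band_edges` and `target_energies` are hypothetical.

```python
import math


def copy_up_patch(low_band, num_high_bins):
    """Fill the upper band by repeating the low-band values in order."""
    return [low_band[i % len(low_band)] for i in range(num_high_bins)]


def adjust_envelope(patch, band_edges, target_energies):
    """Scale each band of the patch so its energy matches the given target."""
    out = list(patch)
    for (start, stop), target in zip(band_edges, target_energies):
        energy = sum(abs(v) ** 2 for v in out[start:stop])
        # One scale factor per band; an empty band stays empty.
        gain = math.sqrt(target / energy) if energy > 0.0 else 0.0
        for i in range(start, stop):
            out[i] *= gain
    return out


# Assumed toy data: four decoded low-band bins and two upper bands.
low_band = [1.0, 2.0, 3.0, 4.0]
band_edges = [(0, 2), (2, 4)]    # (start, stop) bin indices within the patch
target_energies = [0.5, 0.25]    # stand-in for transmitted envelope data

raw_patch = copy_up_patch(low_band, 4)
high_band = adjust_envelope(raw_patch, band_edges, target_energies)
band_energies = [sum(v * v for v in high_band[a:b]) for a, b in band_edges]
print(band_energies)
```

The fine structure inside each band still comes from the lower band; only the coarse energy per band is steered by the side information, which is why the result remains an approximation of the original upper band.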
It is known from harmonic bandwidth extension techniques described in Nagel, F.; Disch, S. A Harmonic Bandwidth Extension Method for Audio Codecs, IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), 2009; Nagel, F.; Disch, S.; Rettelbach, N. A Phase Vocoder Driven Bandwidth Extension Method with Novel Transient Handling for Audio Codecs, 126th AES Convention, 2009; Zhong, H.; Villemoes, L.; Ekstrand, P. et al. QMF Based Harmonic Spectral Band Replication, 131st Audio Engineering Society Convention, 2011; Villemoes, L.; Ekstrand, P.; Hedelin, P. Methods for enhanced harmonic transposition, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, (WASPAA), 2011, that unwanted auditory roughness might be introduced into the signal when the upper band is synthesized. Among the many causes of this roughness are spectral misalignment of the patch and dissonance effects in the transition regions between the lower band and the first patch or between consecutive patches. Harmonic bandwidth extension techniques are designed to improve on these two aspects, albeit at the cost of higher computational complexity.
Filterbank calculations and patching in the filterbank domain, especially in harmonic bandwidth extension, can indeed require a high computational effort. WO 98/57436 describes an advanced patching technique that can, to a limited extent, avoid dissonance effects by introducing so-called guard bands between different spectral patches and by performing a modified copy-up patching to lessen spectral misalignment while keeping computational complexity moderate.
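The idea of guard bands can be pictured with a very small sketch. It assumes, purely for illustration, that a guard band is a fixed number of bins left empty in front of each patch. The actual placement and width used in WO 98/57436 are not reproduced here, and the function name `patch_with_guard_bands` and its parameters are hypothetical.

```python
def patch_with_guard_bands(low_band, num_patches, guard_bins):
    """Stack copies of the low band, leaving guard_bins empty bins before each copy."""
    high_band = []
    for _ in range(num_patches):
        high_band.extend([0.0] * guard_bins)  # guard band: left empty
        high_band.extend(low_band)            # one spectral patch
    return high_band


# Assumed toy data: three low-band bins, two patches, one guard bin each.
layout = patch_with_guard_bands([1.0, 2.0, 3.0], num_patches=2, guard_bins=1)
print(layout)  # [0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0]
```

In this layout the empty bins keep components at the upper edge of one patch apart from components at the lower edge of the next, at the price of small gaps in the synthesized spectrum.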
Apart from this, further methods exist, such as the so-called “blind bandwidth extension” described in E. Larsen, R. M. Aarts, and M. Danessis, “Efficient high-frequency bandwidth extension of music and speech”, AES 112th Convention, Munich, Germany, May 2002, in which no information on the original HF range is used. There is also the so-called “artificial bandwidth extension” method, described in K. Käyhkö, A Robust Wideband Enhancement for Narrowband Speech Signal; Research Report, Helsinki University of Technology, Laboratory of Acoustics and Audio Signal Processing, 2001.
In J. Mäkinen et al.: AMR-WB+: a new audio coding standard for 3rd generation mobile audio services Broadcasts, IEEE, ICASSP '05, a method for bandwidth extension is described in which the copy operation of the bandwidth extension, i.e. the up-copying of successive bandpass signals according to SBR technology, is replaced by mirroring, for example by upsampling.
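How upsampling can produce a mirrored spectrum is shown in the following minimal sketch. It assumes upsampling by a factor of two through zero insertion with no anti-imaging filter, which is one textbook way to obtain a spectral mirror image. It is not claimed to be the procedure used in AMR-WB+, and the helper names `dft_magnitudes` and `upsample_by_zero_insertion` are hypothetical.

```python
import cmath
import math


def dft_magnitudes(x):
    """Magnitudes of a naive O(N^2) DFT, sufficient for a small demo."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


def upsample_by_zero_insertion(x):
    """Double the sampling rate by inserting a zero after every sample."""
    y = []
    for sample in x:
        y.extend([sample, 0.0])
    return y


# Assumed toy LF signal: real-valued, components at bins 1 and 3 of an 8-point DFT.
N = 8
lf = [
    math.cos(2 * math.pi * t / N) + 0.5 * math.cos(2 * math.pi * 3 * t / N)
    for t in range(N)
]

mags = dft_magnitudes(upsample_by_zero_insertion(lf))
old_nyquist = N // 2  # bin of the original Nyquist frequency in the 2N-point DFT
mirror_error = max(
    abs(mags[old_nyquist + m] - mags[old_nyquist - m])
    for m in range(1, old_nyquist + 1)
)
print(mirror_error < 1e-9)
```

In the upsampled signal the component at bin 1 reappears at bin 7 and the component at bin 3 reappears at bin 5, so the range above the original Nyquist frequency is a mirror image of the LF range instead of a shifted copy.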
Further technologies for bandwidth extension are described in the following documents. R. M. Aarts, E. Larsen, and O. Ouweltjes, “A unified approach to low- and high frequency bandwidth extension”, AES 115th Convention, New York, USA, October 2003; E. Larsen and R. M. Aarts, “Audio Bandwidth Extension—Application to psychoacoustics, Signal Processing and Loudspeaker Design”, John Wiley & Sons, Ltd., 2004; E. Larsen, R. M. Aarts, and M. Danessis, “Efficient high-frequency bandwidth extension of music and speech”, AES 112th Convention, Munich, May 2002; J. Makhoul, “Spectral Analysis of Speech by Linear Prediction”, IEEE Transactions on Audio and Electroacoustics, AU-21(3), June 1973; U.S. patent application Ser. No. 08/951,029; U.S. Pat. No. 6,895,375.
Known methods of harmonic bandwidth extension exhibit high complexity. Complexity-reduced methods of bandwidth extension, on the other hand, suffer from quality losses. In particular at a low bit rate and in combination with a low bandwidth of the LF range, artifacts such as roughness and a timbre perceived as unpleasant may occur. The primary reason is that the approximated HF portion is based on one or more direct copy or mirror operations of the LF portion of the spectrum.