Hearing-adapted encoding of audio signals, which reduces the amount of data for efficient storage and transmission of these signals, has gained acceptance in many fields. Known encoding algorithms include, for instance, MPEG 1/2 Layer 3 (“MP3”) and MPEG 4 AAC. The coding algorithms used for this purpose, in particular when achieving the lowest bit rates, lead to a reduction of the audio quality which is often mainly caused by an encoder-side limitation of the audio signal bandwidth to be transmitted. In such cases, a low-pass filtered signal is coded using a so-called core coder, and the region with higher frequencies is parameterized so that it can be approximately reconstructed from the low-pass filtered signal.
It is known from WO 98 57436 to subject the audio signal to a band limitation in such a situation on the encoder side and to encode only a lower band of the audio signal by means of a high quality audio encoder. The upper band, however, is only very coarsely characterized, i.e. by a set of parameters which allow the reproduction of the original spectral envelope of the upper band. On the decoder side, the upper band is then synthesized. For this purpose, a harmonic transposition is proposed, wherein the lower band of the decoded audio signal is supplied to a filterbank. Filterbank channels of the lower band are connected to filterbank channels of the upper band, or are “patched”, and each patched bandpass signal is subjected to an envelope adjustment. The synthesis filterbank belonging to a special analysis filterbank here receives bandpass signals of the audio signal in the lower band and envelope-adjusted bandpass signals of the lower band which were harmonically patched into the upper band. The output signal of the synthesis filterbank is an audio signal extended with regard to its audio bandwidth which was transmitted from the encoder side to the decoder side with a very low data rate. However, the filterbank calculations and the patching in the filterbank domain may entail a high computational effort.
Complexity-reduced methods for a bandwidth extension of band-limited audio signals instead use a copying function of low-frequency signal portions (LF) into the high-frequency range (HF), in order to approximate information missing due to the band limitation. Such methods are described in M. Dietz, L. Liljeryd, K. Kjörling and O. Kunz, “Spectral Band Replication, a novel approach in audio coding,” in 112th AES Convention, Munich, May 2002; S. Meltzer, R. Böhm and F. Henn, “SBR enhanced audio codecs for digital broadcasting such as “Digital Radio Mondiale” (DRM),” 112th AES Convention, Munich, May 2002; T. Ziegler, A. Ehret, P. Ekstrand and M. Lutzky, “Enhancing mp3 with SBR: Features and Capabilities of the new mp3PRO Algorithm,” in 112th AES Convention, Munich, May 2002; International Standard ISO/IEC 14496-3:2001/FPDAM 1, “Bandwidth Extension,” ISO/IEC, 2002, or “Speech bandwidth extension method and apparatus”, Vasu Iyengar et al. U.S. Pat. No. 5,455,888.
In these methods no harmonic transposition is performed, but adjacent bandpass filterbank channels of the lower band are artificially introduced into adjacent filterbank channels of the upper band. This leads to a coarse approximation of the upper band of the audio signal. This coarse approximation of the signal is then in a further step refined by defining additional control parameters deduced from the original signal. As an example, the MPEG-4 Standard uses scale factors for adjusting the spectral envelope, a combination of inverse filtering and addition of a noise floor for adapting the tonality, and insertions of sinusoidal signal portions for supplementation of tonal components.
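The copy-up and envelope-adjustment steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the MPEG-4 Standard: the function name `copy_up_patch`, the grouping of channels per scale factor, and all numeric values are hypothetical, and the tonality adaptation (inverse filtering, noise floor, sinusoid insertion) is omitted.

```python
import math


def copy_up_patch(low_band, target_rms, group_size):
    """Copy adjacent low-band channel magnitudes into adjacent high-band
    channels, then rescale each group of channels so that its RMS value
    matches the transmitted envelope value (a hypothetical scale factor)."""
    n_high = len(target_rms) * group_size
    # Adjacent low-band channels map to adjacent high-band channels;
    # wrap around if the low band has fewer channels than the high band.
    patched = [low_band[i % len(low_band)] for i in range(n_high)]

    adjusted = []
    for g, target in enumerate(target_rms):
        group = patched[g * group_size:(g + 1) * group_size]
        rms = math.sqrt(sum(x * x for x in group) / group_size)
        gain = target / rms if rms > 0 else 0.0
        adjusted.extend(x * gain for x in group)
    return adjusted


# Hypothetical example: four low-band channel magnitudes and two
# envelope values, each covering two high-band channels.
low_band = [4.0, 2.0, 1.0, 1.0]
target_rms = [0.5, 0.25]
high_band = copy_up_patch(low_band, target_rms, group_size=2)
print(high_band)
```

The gain is applied per group, so the fine structure copied from the low band is preserved within each group and only the coarse spectral envelope is corrected.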
Apart from this, further methods exist, such as the so-called “blind bandwidth extension” described in E. Larsen, R. M. Aarts, and M. Danessis, “Efficient high-frequency bandwidth extension of music and speech”, In AES 112th Convention, Munich, Germany, May 2002, wherein no information on the original HF range is used. There is also the so-called “artificial bandwidth extension”, which is described in K. Käyhkö, A Robust Wideband Enhancement for Narrowband Speech Signal; Research Report, Helsinki University of Technology, Laboratory of Acoustics and Audio signal Processing, 2001.
In J. Makinen et al.: AMR-WB+: a new audio coding standard for 3rd generation mobile audio services Broadcasts, IEEE, ICASSP '05, a method for bandwidth extension is described, wherein the copying operation of low-frequency components into the high-band is performed by a mirroring operation obtained, for example, by upsampling the low-pass filtered signal.
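How upsampling can produce a mirrored copy of the low band may be illustrated with a small sketch. It assumes upsampling by a factor of two through zero insertion without a subsequent anti-imaging filter; this is one way such a mirror image arises and is not claimed to be the procedure of the cited publication. The frame length, the tone position, and all function names are hypothetical.

```python
import cmath
import math


def dft_magnitudes(x):
    """Magnitudes of the plain O(N^2) discrete Fourier transform of x."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


def upsample_zero_insertion(x):
    """Double the sampling rate by inserting a zero after every sample."""
    y = []
    for sample in x:
        y.extend((sample, 0.0))
    return y


def tonal_bins(x, threshold=1e-6):
    """DFT bins from 0 up to the Nyquist bin that carry noticeable energy."""
    mags = dft_magnitudes(x)
    return [k for k in range(len(x) // 2 + 1) if mags[k] > threshold]


N, K0 = 32, 3  # hypothetical frame length and tone position (in bins)
low_band = [math.cos(2 * math.pi * K0 * t / N) for t in range(N)]
extended = upsample_zero_insertion(low_band)

original_bins = tonal_bins(low_band)   # tone at bin K0
extended_bins = tonal_bins(extended)   # tone at bin K0 plus its mirror image
print(original_bins, extended_bins)
```

In the upsampled signal the tone appears at bin K0 and again at bin N - K0, i.e. mirrored about the former Nyquist frequency (bin N/2 of the longer transform), so the frequency order of the low-band components is reversed in the high band.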
As an alternative, a single side band modulation can be employed, which is basically equivalent to a copying operation in the filterbank domain. Methods which enable a harmonic bandwidth extension usually employ a pitch determination step (pitch tracking) or a non-linear distortion step (see, for example, “U. Kornagel, Spectral widening of the excitation signal for telephone-band speech enhancement, in: Proceedings of the IWAENC, Darmstadt, Germany, September 2001, pp. 215-218”), or make use of phase vocoders as shown, for example, by the US provisional patent application “F. Nagel, S. Disch: “Apparatus and method of harmonic bandwidth extension in audio signals”” with the application number US 61/025,129.
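The equivalence between single side band modulation and a copying operation can be made concrete with a short sketch. It assumes the common textbook realization via the analytic signal (negative frequencies removed, then multiplication by a complex carrier); this realization, the function names, and the chosen tone positions are assumptions for illustration and are not taken from the cited documents.

```python
import cmath
import math


def dft(x):
    """Plain O(N^2) discrete Fourier transform."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Plain O(N^2) inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def analytic_signal(x):
    """Remove negative frequencies and double the positive ones."""
    n = len(x)
    spectrum = dft(x)
    for k in range(n):
        if 0 < k < n / 2:
            spectrum[k] *= 2
        elif k > n / 2:
            spectrum[k] = 0
    return idft(spectrum)


def ssb_shift(x, shift_bins):
    """Shift every spectral component of x upward by shift_bins DFT bins."""
    n = len(x)
    carrier = [cmath.exp(2j * math.pi * shift_bins * t / n) for t in range(n)]
    return [(a * c).real for a, c in zip(analytic_signal(x), carrier)]


def tonal_bins(x, threshold=1e-6):
    """DFT bins from 0 up to the Nyquist bin that carry noticeable energy."""
    mags = [abs(v) for v in dft(x)]
    return [k for k in range(len(x) // 2 + 1) if mags[k] > threshold]


N = 32  # hypothetical frame length
signal = [
    math.cos(2 * math.pi * 3 * t / N) + 0.5 * math.cos(2 * math.pi * 5 * t / N)
    for t in range(N)
]
shifted = ssb_shift(signal, shift_bins=8)
print(tonal_bins(signal), tonal_bins(shifted))
```

Both tones move by the same number of bins, exactly as if the corresponding spectral channels had been copied upward. The spacing between the tones is preserved, but their frequency ratio is not, which is what distinguishes such a shift from a harmonic transposition.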
WO 02/41302 A1, for example, shows a method for enhancing the performance of coding systems that use high-frequency reconstruction methods. It shows how to improve the overall performance of such systems by adapting, over time, the crossover frequency between the low band coded by a core coder and the high band coded by a high-frequency reconstruction system. For this method, the core coder needs to be able to work with different crossover frequencies at the encoder side as well as at the decoder side. Therefore, the complexity of the core coder is increased.
Further technologies for bandwidth extension are described, for example, in “R. M. Aarts, E. Larsen, and O. Ouweltjes, A unified approach to low- and high-frequency bandwidth extension. In AES 115th Convention, New York, USA, October 2003”, “E. Larsen and R. M. Aarts: Audio Bandwidth Extension: Application to psychoacoustics, Signal Processing and Loudspeaker Design. John Wiley & Sons, Ltd, 2004”, “E. Larsen, R. M. Aarts, and M. Danessis: Efficient high-frequency bandwidth extension of music and speech. In AES 112th Convention, Munich, Germany, May 2002”, “J. Makhoul: Spectral Analysis of Speech by Linear Prediction. IEEE Transactions on Audio and Electroacoustics, AU-21(3), June 1973”, “U.S. patent application Ser. No. 08/951,029, Ohmori et al.: Audio band width extending system and method” and “U.S. Pat. No. 6,895,375, Malah, D & Cox, R. VS.: System for bandwidth extension of Narrow-band speech”.
Harmonic bandwidth extension methods often exhibit a high complexity, while complexity-reduced bandwidth extension methods show quality losses. In the particular case where a low bit rate is combined with a small bandwidth of the low band, artifacts such as roughness and a timbre perceived as unpleasant may occur. A reason for this is that the approximated HF portion is based on a copying operation which does not maintain the harmonic relations between the tonal signal portions. This applies both to the harmonic relation between LF and HF and to the harmonic relation between succeeding patches within the HF portion itself. For example, within SBR, the juxtaposition of the coded components and the replicated components, occurring at the boundary between the low and the high bands, may cause rough sound impressions. The reason is illustrated in FIGS. 18A and 18B, where tonal portions copied from the LF range into the HF range are spectrally densely adjacent to tonal portions of the LF range.
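The loss of harmonic relations caused by a copying operation can be checked with simple arithmetic. The following sketch uses purely hypothetical numbers (a 440 Hz fundamental, a patch source starting at 400 Hz, and a crossover at 1800 Hz) and a simplified single patch; it is not a model of SBR itself.

```python
def copy_up_partials(f0, source_start, crossover):
    """Return the low-band harmonics of f0 and their copies after shifting
    the range [source_start, crossover) up to start at the crossover."""
    shift = crossover - source_start
    low = [k * f0 for k in range(1, crossover // f0 + 1) if k * f0 < crossover]
    copied = [f + shift for f in low if f >= source_start]
    return low, copied


def distance_to_harmonic_grid(freq, f0):
    """Distance in Hz from freq to the nearest integer multiple of f0."""
    remainder = freq % f0
    return min(remainder, f0 - remainder)


# Hypothetical values in Hz, chosen only for illustration.
F0, SOURCE_START, CROSSOVER = 440, 400, 1800

low, copied = copy_up_partials(F0, SOURCE_START, CROSSOVER)
offsets = [distance_to_harmonic_grid(f, F0) for f in copied]
boundary_gap = copied[0] - low[-1]

print(low)           # [440, 880, 1320, 1760]
print(copied)        # [1840, 2280, 2720, 3160]
print(offsets)       # [80, 80, 80, 80]
print(boundary_gap)  # 80
```

With these values every copied partial lies 80 Hz away from the harmonic grid of the original tone, and the first copied partial sits only 80 Hz above the highest low-band partial. Such closely spaced tonal components at the band boundary correspond to the situation described above as a source of rough sound impressions.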
FIG. 18A shows the original spectrogram 1800a of a signal consisting of three tones. Correspondingly, FIG. 18B shows a diagram 1800b of the bandwidth extended signal corresponding to the original signal of FIG. 18A. The abscissa indicates time and the ordinate indicates frequency. In particular, for the last tone, potential problems 1810 can be observed (smeared lines 1810).
If harmonic relations are considered by known methods, this is done on the basis of an F0 estimation. In these cases, the success of the methods depends primarily on the reliability of this estimation.
In general, known bandwidth extension methods provide audio signals either at a low bit rate but with poor audio quality, or with good audio quality but at high bit rates.