The present invention relates to audio signal processing and, in particular, to a bandwidth extension encoder, a method for encoding an audio signal, a bandwidth extension decoder, a method for decoding an encoded audio signal, a phase vocoder and an audio signal.
Moreover, embodiments of the present invention relate to an application of a phase vocoder for pure time stretching, independent of a bandwidth extension.
Storage or transmission of audio signals is often subject to strict bit rate constraints. These constraints are usually accounted for by the use of encoders/decoders (“codecs”) that efficiently compress the audio signal in terms of the information rate needed to store or transmit the signal. In the past, coders were forced to drastically reduce the audio bandwidth when only a very low bit rate was available. Modern audio codecs are able to code wide-band signals by using bandwidth extension (BWE) methods, as described in M. Dietz, L. Liljeryd, K. Kjörling and O. Kunz, “Spectral Band Replication, a novel approach in audio coding,” in 112th AES Convention, Munich, May 2002; S. Meltzer, R. Böhm and F. Henn, “SBR enhanced audio codecs for digital broadcasting such as ‘Digital Radio Mondiale’ (DRM),” in 112th AES Convention, Munich, May 2002; T. Ziegler, A. Ehret, P. Ekstrand and M. Lutzky, “Enhancing mp3 with SBR: Features and Capabilities of the new mp3PRO Algorithm,” in 112th AES Convention, Munich, May 2002; International Standard ISO/IEC 14496-3:2001/FPDAM 1, “Bandwidth Extension,” ISO/IEC, 2002; “Speech bandwidth extension method and apparatus”, Vasu Iyengar et al., U.S. Pat. No. 5,455,888; E. Larsen, R. M. Aarts, and M. Danessis, “Efficient high-frequency bandwidth extension of music and speech,” in AES 112th Convention, Munich, Germany, May 2002; R. M. Aarts, E. Larsen, and O. Ouweltjes, “A unified approach to low- and high frequency bandwidth extension,” in AES 115th Convention, New York, USA, October 2003; K. Käyhkö, “A Robust Wideband Enhancement for Narrowband Speech Signal,” Research Report, Helsinki University of Technology, Laboratory of Acoustics and Audio Signal Processing, 2001; E. Larsen and R. M. Aarts, “Audio Bandwidth Extension: Application of Psychoacoustics, Signal Processing and Loudspeaker Design,” John Wiley & Sons, Ltd, 2004; J. Makhoul, “Spectral Analysis of Speech by Linear Prediction,” IEEE Transactions on Audio and Electroacoustics, AU-21(3), June 1973; U.S. patent application Ser. No. 08/951,029, Ohmori et al., “Audio band width extending system and method”; U.S. Pat. No. 6,895,375, Malah, D. & Cox, R. V., “System for bandwidth extension of Narrow-band speech”; and Frederik Nagel, Sascha Disch, “A harmonic bandwidth extension method for audio codecs,” ICASSP International Conference on Acoustics, Speech and Signal Processing, IEEE CNF, Taipei, Taiwan, April 2009.
These algorithms rely on a parametric representation of the high-frequency content (HF). This representation is generated from the low-frequency part (LF) of the decoded signal by means of transposition into the HF spectral region (“patching”) and application of parameter-driven post-processing.
In the art, methods of bandwidth extension such as spectral band replication (SBR) or harmonic bandwidth extension (HBE) are known. In the following, these two BWE methods are briefly described.
On the one hand, spectral band replication (SBR), as described in M. Dietz, L. Liljeryd, K. Kjörling and O. Kunz, “Spectral Band Replication, a novel approach in audio coding,” in 112th AES Convention, Munich, May 2002, uses a quadrature mirror filterbank (QMF) for generating the HF information. Applying a so-called “patching” algorithm, lower QMF band signals are copied into higher QMF bands, leading to a replication of the information of the LF part in the HF part. Subsequently, the generated HF part is adapted to closely match the original HF part with the help of parameters that adjust the spectral envelope and the tonality.
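The copy-up patching and envelope adjustment described above can be illustrated with a minimal sketch. It is not the SBR algorithm of the cited standard: it assumes one real-valued magnitude per band instead of complex QMF subband samples over time, it omits the tonality adjustment, and the function names `copy_up_patch` and `adjust_envelope` are hypothetical.

```python
def copy_up_patch(lf_bands, num_hf_bands):
    """Fill the HF bands by cyclically copying LF band values upward.

    Assumption: each band is represented by a single magnitude value.
    """
    if not lf_bands:
        raise ValueError("need at least one LF band")
    return [lf_bands[k % len(lf_bands)] for k in range(num_hf_bands)]


def adjust_envelope(hf_bands, target_energies):
    """Scale each patched band so that its energy matches a target energy.

    The targets stand in for the transmitted spectral envelope parameters.
    """
    adjusted = []
    for value, target in zip(hf_bands, target_energies):
        energy = value * value
        # A silent source band cannot be scaled to a non-zero target.
        gain = (target / energy) ** 0.5 if energy > 0.0 else 0.0
        adjusted.append(value * gain)
    return adjusted


if __name__ == "__main__":
    lf = [1.0, 2.0]                       # decoded LF band magnitudes
    hf = copy_up_patch(lf, 3)             # replicated into three HF bands
    print(adjust_envelope(hf, [1.0, 1.0, 0.25]))
```

The sketch only shows the order of the two steps: the HF part is first replicated from the LF part and then shaped toward the original HF part by side-information parameters.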
On the other hand, harmonic bandwidth extension (HBE) is an alternative bandwidth extension scheme based on phase vocoders. HBE enables a harmonic continuation of the spectrum as opposed to SBR, which relies on a non-harmonic spectral shift. It may be utilized to replace or amend the SBR patching algorithm.
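The difference between a non-harmonic spectral shift and a harmonic continuation can be seen in a small numerical sketch. It assumes a tonal signal whose partials are integer multiples of a fundamental, models the shift as adding a fixed frequency offset, and models the harmonic continuation as multiplying each partial by an integer factor. The helper names and the example values (300 Hz fundamental, 1000 Hz offset, factor 2) are illustrative assumptions, not taken from the cited references.

```python
def shift_partials(partials_hz, offset_hz):
    """Non-harmonic patch: move every partial up by the same offset."""
    return [f + offset_hz for f in partials_hz]


def transpose_partials(partials_hz, factor):
    """Harmonic patch: multiply every partial frequency by a factor."""
    return [f * factor for f in partials_hz]


def is_harmonic(partials_hz, f0_hz, tol_hz=1e-6):
    """True if every partial lies on an integer multiple of f0_hz."""
    return all(
        abs(f - round(f / f0_hz) * f0_hz) <= tol_hz for f in partials_hz
    )


if __name__ == "__main__":
    f0 = 300.0
    lf_partials = [300.0, 600.0, 900.0]
    print(is_harmonic(shift_partials(lf_partials, 1000.0), f0))
    print(is_harmonic(transpose_partials(lf_partials, 2), f0))
```

Under these assumptions the shifted partials (1300, 1600, 1900 Hz) fall off the 300 Hz harmonic grid, whereas the transposed partials (600, 1200, 1800 Hz) stay on it.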
U.S. Provisional Patent Application No. 61/079,841 discloses a BWE method that may choose between alternative patching algorithms operating either in the frequency domain or in the time domain. In the time-frequency transform by the filterbank, a certain predetermined analysis window is applied. Moreover, classic phase vocoder implementations according to the state of the art use one predefined window shape, such as a raised-cosine window or a Bartlett window.
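For reference, the two window shapes named above can be written out as follows. This is a minimal sketch that assumes the symmetric textbook definitions (a Hann-type raised cosine and a triangular Bartlett window with zero end points); the cited application may use differently scaled or periodic variants.

```python
import math


def raised_cosine_window(n):
    """Symmetric raised-cosine (Hann-type) window of length n."""
    if n < 2:
        raise ValueError("window length must be at least 2")
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def bartlett_window(n):
    """Symmetric triangular (Bartlett) window of length n with zero end points."""
    if n < 2:
        raise ValueError("window length must be at least 2")
    return [1.0 - abs(2.0 * i / (n - 1) - 1.0) for i in range(n)]


if __name__ == "__main__":
    for name, window in (("raised cosine", raised_cosine_window(9)),
                         ("Bartlett", bartlett_window(9))):
        print(name, [round(w, 3) for w in window])
```

Both windows rise from zero to one at the centre and fall back to zero, but the raised-cosine window tapers smoothly while the Bartlett window tapers linearly, which gives them different main-lobe and side-lobe behaviour in the frequency domain.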
However, choosing one predetermined analysis window for vocoder applications forces the application designer to make a trade-off in the overall perceptual audio quality achieved for different classes of audio signals. Thus, although the mean audio quality can be optimized by the initial choice of a certain window, the audio quality for each individual class of signals remains sub-optimal.
Moreover, it was found that certain signals benefit from specialized analysis windows in a phase vocoder, which may in particular be used for time stretching the audio signal without modifying its pitch.
Therefore, a concept for selecting the optimal analysis window, for example within a BWE scheme, is needed. However, measures against the above-mentioned degradation of the perceptual audio quality should advantageously not significantly increase the computational complexity of the employed codecs.