The present invention relates to audio signal processing and, in particular, to an apparatus and a method for generating a bandwidth extended signal from a bandwidth limited audio signal.
Storage or transmission of audio signals is often subject to strict bitrate constraints. In the past, coders were forced to drastically reduce the transmitted audio bandwidth when only a very low bitrate was available. Modern audio codecs are nowadays able to code wideband signals by using bandwidth extension (BWE) methods as described in M. Dietz, L. Liljeryd, K. Kjörling and O. Kunz, “Spectral Band Replication, a novel approach in audio coding,” in 112th AES Convention, Munich, May 2002; S. Meltzer, R. Böhm and F. Henn, “SBR enhanced audio codecs for digital broadcasting such as “Digital Radio Mondiale” (DRM),” in 112th AES Convention, Munich, May 2002; T. Ziegler, A. Ehret, P. Ekstrand and M. Lutzky, “Enhancing mp3 with SBR: Features and Capabilities of the new mp3PRO Algorithm,” in 112th AES Convention, Munich, May 2002; International Standard ISO/IEC 14496-3:2001/FPDAM 1, “Bandwidth Extension,” ISO/IEC, 2002. Speech bandwidth extension method and apparatus, Vasu lyengar et al; E. Larsen, R. M. Aarts, and M. Danessis. Efficient high-frequency bandwidth extension of music and speech. In AES 112th Convention, Munich, Germany, May 2002; R. M. Aarts, E. Larsen, and O. Ouweltjes. A unified approach to low- and high frequency bandwidth extension. In AES 115th Convention, New York, USA, October 2003; K. Käyhkö. A Robust Wideband Enhancement for Narrowband Speech Signal. Research Report, Helsinki University of Technology, Laboratory of Acoustics and Audio Signal Processing, 2001; E. Larsen and R. M. Aarts. Audio Bandwidth Extension—Application to psychoacoustics, Signal Processing and Loudspeaker Design. John Wiley & Sons, Ltd, 2004; E. Larsen, R. M. Aarts, and M. Danessis. Efficient high-frequency bandwidth extension of music and speech. In AES 112th Convention, Munich, Germany, May 2002; J. Makhoul. Spectral Analysis of Speech by Linear Prediction. IEEE Transactions on Audio and Electroacoustics, AU-21(3), June 1973; U.S. patent application Ser. No. 08/951,029, Ohmori, et al., Audio band width extending system and method; and U.S. Pat. No. 6,895,375, Malah, D & Cox, R. V.: System for bandwidth extension of Narrow-band speech. These algorithms rely on a parametric representation of the high-frequency content (HF) which is generated from the low-frequency part (LF) of the decoded signal by means of transposition into the HF spectral region (“patching”) and application of a parameter driven post processing. The LF part is coded with any audio or speech coder. For example, the bandwidth extension methods described in M. Dietz, L. Liljeryd, K. Kjörling and O. Kunz, “Spectral Band Replication, a novel approach in audio coding,” in 112th AES Convention, Munich, May 2002; S. Meltzer, R. Böhm and F. Henn, “SBR enhanced audio codecs for digital broadcasting such as “Digital Radio Mondiale” (DRM),” in 112th AES Convention, Munich, May 2002; T. Ziegler, A. Ehret, P. Ekstrand and M. Lutzky, “Enhancing mp3 with SBR: Features and Capabilities of the new mp3PRO Algorithm,” in 112th AES Convention, Munich, May 2002; and International Standard ISO/IEC 14496-3:2001/FPDAM 1, “Bandwidth Extension,” ISO/IEC, 2002. Speech bandwidth extension method and apparatus, Vasu lyengar et al., rely on single sideband modulation (SSB), often also termed the “copy-up” method, for generating the multiple HF patches.
Lately, a new algorithm, which employs a bank of phase vocoders as described in M. Puckette. Phase-locked Vocoder. IEEE ASSP Conference on Applications of Signal Processing to Audio and Acoustics, Mohonk 1995.”, Röbel, A.: Transient detection and preservation in the phase vocoder; citeseer.ist.psu.edu/679246.html; Laroche L., Dotson M.: “Improved phase vocoder timescale modification of audio”, IEEE Trans. Speech and Audio Processing, vol. 7, no. 3, pp. 323-332; U.S. Pat. No. 6,549,884, Laroche, J. & Dotson, M.: Phase-vocoder pitch-shifting, for the generation of the different patches, has been presented as described in Frederik Nagel, Sascha Disch, “A harmonic bandwidth extension method for audio codecs,” ICASSP International Conference on Acoustics, Speech and Signal Processing, IEEE CNF, Taipei, Taiwan, April 2009. This method has been developed to avoid the auditory roughness which is often observed in signals subjected to SSB bandwidth extension. Albeit being beneficial for many tonal signals, this method called “harmonic bandwidth extension” (HBE) is prone to quality degradations of transients contained in the audio signal as described in Frederik Nagel, Sascha Disch, Nikolaus Rettelbach, “A phase vocoder driven bandwidth extension method with novel transient handling for audio codecs,” 126th AES Convention, Munich, Germany, May 2009, since vertical coherence over sub-bands is not guaranteed to be preserved in the standard phase vocoder algorithm and, moreover, the re-calculation of the phases has to be performed on time blocks of a transform or, alternatively of a filterbank. Therefore, a need arises for a special treatment for signal parts containing transients. Additionally, the overlap add based phase vocoders applied in the HBE algorithm cause additional delay which is too high to be acceptable for use in applications designed for communication purposes.
As outlined above, existing bandwidth extension schemes may apply one patching method on a given signal block at a time, be it SSB based patching as described in M. Dietz, L. Liljeryd, K. Kjörling and O. Kunz, “Spectral Band Replication, a novel approach in audio coding,” in 112th AES Convention, Munich, May 2002; S. Meltzer, R. Böhm and F. Henn, “SBR enhanced audio codecs for digital broadcasting such as “Digital Radio Mondiale” (DRM),” in 112th AES Convention, Munich, May 2002; T. Ziegler, A. Ehret, P. Ekstrand and M. Lutzky, “Enhancing mp3 with SBR: Features and Capabilities of the new mp3PRO Algorithm,” in 112th AES Convention, Munich, May 2002; and International Standard ISO/IEC 14496-3:2001/FPDAM 1, “Bandwidth Extension,” ISO/IEC, 2002. Speech bandwidth extension method and apparatus, Vasu lyengar et al., or HBE vocoder based patching explained in Frederik Nagel, Sascha Disch, “A harmonic bandwidth extension method for audio codecs,” in ICASSP International Conference on Acoustics, Speech and Signal Processing, IEEE CNF, Taipei, Taiwan, April 2009. based on phase vocoder techniques as described in M. Puckette. Phase-locked Vocoder. IEEE ASSP Conference on Applications of Signal Processing to Audio and Acoustics, Mohonk 1995.”, Röbel, A.: Transient detection and preservation in the phase vocoder; citeseer.ist.psu.edu/679246.html; Laroche L., Dotson M.: “Improved phase vocoder timescale modification of audio”, IEEE Trans. Speech and Audio Processing, vol. 7, no. 3, pp. 323-332; U.S. Pat. No. 6,549,884, Laroche, J. & Dotson, M.: Phase-vocoder pitch-shifting.
Alternatively, a combination of HBE and SSB based patching can be used as described in U.S. Provisional 61/312,127. Additionally, modern audio coders as described in Neuendorf, Max; Gournay, Philippe; Multrus, Markus; Lecomte, Jérémie; Bessette, Bruno; Geiger, Ralf; Bayer, Stefan; Fuchs, Guillaume; Hilpert, Johannes; Rettelbach, Nikolaus; Salami, Redwan; Schuller, Gerald; Lefebvre, Roch; Grill, Bernhard: Unified Speech and Audio Coding Scheme for High Quality at Lowbitrates, ICASSP 2009, Apr. 19-24, 2009, Taipei, Taiwan; Bayer, Stefan; Bessette, Bruno; Fuchs, Guillaume; Geiger, Ralf; Gournay, Philippe; Grill, Bernhard; Hilpert, Johannes; Lecomte, Jérémie; Lefebvre, Roch; Multrus, Markus; Nagel, Frederik; Neuendorf, Max; Rettelbach, Nikolaus; Robilliard, Julien; Salami, Redwan; Schuller, Gerald: A Novel Scheme for Low Bitrate Unified Speech and Audio Coding, 126th AES Convention, May 7, 2009, Munich, offer the possibility of switching the patching method globally on a time block basis between alternative patching schemes.
Conventional SSB copy-up patching has a disadvantage that it introduces unwanted roughness into the audio signal. However, it is computationally simple and preserves the time envelope of transients.
In audio codecs employing HBE patching, a disadvantage is that the transient reproduction quality is often suboptimal. Moreover, the computational complexity is significantly increased over the computational very simple SSB copy-up method.
Additionally, HBE patching introduces additional algorithmic delay which exceeds the acceptable range for application in communication scenarios.
A further disadvantage of the state-of-the-art processing is that the combination of HBE and SSB based patching within one time block does not eliminate the additional delay caused by HBE.
It is an object of the present invention to provide a concept for generating a bandwidth extended signal from a bandwidth limited audio signal allowing an improved perceptual quality avoiding such disadvantages.