The present invention relates to audio signal processing, and in particular, to an apparatus and a method for generating a synthesis audio signal, an apparatus and a method for encoding an audio signal and an encoded audio signal.
Storage or transmission of audio signals is often subject to strict bit rate constraints. These constraints are usually overcome by an intermediate coding of the signal. In the past, coders were forced to drastically reduce the transmitted audio bandwidth when only a very low bit rate was available. Modern audio codecs are able to code wide-band signals by using bandwidth extension (BWE) methods, as described in M. Dietz, L. Liljeryd, K. Kjörling and O. Kunz, “Spectral Band Replication, a novel approach in audio coding,” in 112th AES Convention, Munich, May 2002; S. Meltzer, R. Böhm and F. Henn, “SBR enhanced audio codecs for digital broadcasting such as “Digital Radio Mondiale” (DRM),” in 112th AES Convention, Munich, May 2002; T. Ziegler, A. Ehret, P. Ekstrand and M. Lutzky, “Enhancing mp3 with SBR: Features and Capabilities of the new mp3PRO Algorithm,” in 112th AES Convention, Munich, May 2002; International Standard ISO/IEC 14496-3:2001/FPDAM 1, “Bandwidth Extension,” ISO/IEC, 2002; Vasu Iyengar et al., “Speech bandwidth extension method and apparatus,” U.S. Pat. No. 5,455,888; E. Larsen, R. M. Aarts, and M. Danessis, “Efficient high-frequency bandwidth extension of music and speech,” in AES 112th Convention, Munich, Germany, May 2002; R. M. Aarts, E. Larsen, and O. Ouweltjes, “A unified approach to low- and high frequency bandwidth extension,” in AES 115th Convention, New York, USA, October 2003; K. Käyhkö, “A Robust Wideband Enhancement for Narrowband Speech Signal,” Research Report, Helsinki University of Technology, Laboratory of Acoustics and Audio Signal Processing, 2001; E. Larsen and R. M. Aarts, “Audio Bandwidth Extension: Application to Psychoacoustics, Signal Processing and Loudspeaker Design,” John Wiley & Sons, Ltd, 2004; J. Makhoul, “Spectral Analysis of Speech by Linear Prediction,” IEEE Transactions on Audio and Electroacoustics, AU-21(3), June 1973; U.S. patent application Ser. No. 08/951,029, Ohmori et al., “Audio band width extending system and method”; U.S. Pat. No. 6,895,375, D. Malah and R. V. Cox, “System for bandwidth extension of Narrow-band speech”; and Frederik Nagel and Sascha Disch, “A harmonic bandwidth extension method for audio codecs,” ICASSP International Conference on Acoustics, Speech and Signal Processing, IEEE CNF, Taipei, Taiwan, April 2009.
These algorithms rely on a parametric representation of the high-frequency (HF) content. This representation is generated from the low-frequency (LF) part of the decoded signal by transposing it into the HF spectral region (“patching”) and applying parameter-driven post-processing.
In the art, methods of bandwidth extension such as spectral band replication (SBR) are used as an efficient method to generate high frequency signals in an HFR (high frequency reconstruction) based codec.
Spectral band replication (SBR), as described in M. Dietz, L. Liljeryd, K. Kjörling and O. Kunz, “Spectral Band Replication, a novel approach in audio coding,” in 112th AES Convention, Munich, May 2002, uses a quadrature mirror filterbank (QMF) for generating the HF information. With the so-called “patching”, lower QMF band signals are copied into higher QMF bands, which replicates the information of the LF part in the HF part. The generated HF part is afterwards adapted to the original HF part with the help of parameters that adjust the spectral envelope and the tonality.
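The copy-up patching and envelope adjustment described above can be illustrated with a minimal sketch. It is not the standardized SBR procedure: the function names `copy_up_patch` and `adjust_envelope`, the cyclic choice of source bands, and the use of one target energy per HF band are assumptions made for illustration only, and the tonality adjustment is omitted.

```python
import math

def copy_up_patch(lf_bands, num_hf_bands):
    """Fill HF subbands by copying LF subband signals (hypothetical helper).

    lf_bands is a list of subband signals, one list of samples per band.
    Source bands are reused cyclically, which is an illustrative assumption.
    """
    num_lf = len(lf_bands)
    return [list(lf_bands[i % num_lf]) for i in range(num_hf_bands)]

def adjust_envelope(hf_bands, target_energies, eps=1e-12):
    """Scale each patched band so its mean energy matches a target value.

    target_energies stands in for the transmitted envelope parameters;
    eps avoids division by zero for silent bands.
    """
    adjusted = []
    for band, target in zip(hf_bands, target_energies):
        energy = sum(abs(x) ** 2 for x in band) / len(band)
        gain = math.sqrt(target / (energy + eps))
        adjusted.append([gain * x for x in band])
    return adjusted

# Two LF subbands with four samples each (made-up values).
lf = [[1.0, -1.0, 1.0, -1.0], [0.5, 0.5, -0.5, -0.5]]
hf = copy_up_patch(lf, num_hf_bands=3)
hf_adj = adjust_envelope(hf, target_energies=[0.25, 0.01, 0.04])
```

In this sketch the patched bands inherit the fine structure of the LF bands, while the per-band gains alone determine the coarse spectral shape of the reconstructed HF part.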
In SBR, as standardized in HE-AAC, all operations, including patching by simple copying, are carried out inside the QMF domain. However, other patching methods can be carried out in other domains, such as the FFT domain or the time domain. One might imagine enabling SBR to choose, as an alternative, a patching algorithm that operates in either the FFT domain or the time domain and therefore needs an additional transformation to feed the QMF analysis step.
In plain SBR, only one patching algorithm is available, and it takes into account neither the needs of specific hardware or software nor the signal characteristics. Hence, SBR is not able to adapt the patching algorithm. One might imagine simply choosing between two distinct patching algorithms. However, since the two patching methods work in different domains, the transition areas are prone to blocking artifacts, which makes fine-grain switching between both methods practically impossible.
WO 98/57436 discloses transposition methods used in spectral band replication, which are combined with spectral envelope adjustment.
WO 02/052545 teaches that signals can be classified as either pulse-train-like or non-pulse-train-like and, based on this classification, proposes an adaptive switch transposer. The switch transposer performs two patching algorithms in parallel, and a mixing unit combines both patched signals depending on the classification (pulse-train or non-pulse-train). The actual switching between or mixing of the transposers is performed in an envelope-adjusting filterbank in response to envelope and control data. Furthermore, for pulse-train-like signals, the base signal is transformed into a filterbank domain, a frequency translating operation is performed, and an envelope adjustment of the result of the frequency translation is performed. This is a combined patching/further processing procedure. For non-pulse-train-like signals, a frequency domain transposer (FD transposer) is provided, and the result of the frequency domain transposer is then transformed into the filterbank domain, in which the envelope adjustment is performed. Thus, this procedure, which in one alternative uses a combined patching/further processing approach and in the other alternative uses a frequency domain transposer positioned outside of the filterbank in which the envelope adjustment takes place, is problematic with respect to flexibility and implementation possibilities.