The present invention relates to audio processing, and specifically to a device, a method, and a computer program for combined blind and guided bandwidth extension.
Storage or transmission of audio signals is often subject to strict bitrate constraints. In the past, coders were forced to drastically reduce the transmitted audio bandwidth when only a very low bitrate was available. Modern audio codecs are able to code wideband signals by using bandwidth extension (BWE) methods. These algorithms rely on a parametric representation of the high-frequency content (HF), which is generated from the waveform-coded low-frequency part (LF) of the decoded signal by means of transposition into the HF spectral region (“patching”) and application of a parameter-driven post processing.
The post processing includes the adaptation of energy levels to match the energy distribution of the original signal (also known as envelope shaping), but also the adaptation of the perceived tonality in the transposed HF bands with the help of band-selective inverse filtering (decreasing tonality), addition of a synthetic noise floor (decreasing tonality), or addition of individual sinusoids (increasing tonality).
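The patching and envelope shaping steps described above can be illustrated with a minimal sketch. It is not taken from any codec: the function name `patch_and_shape`, the flat list of spectral coefficients, and the band layout are all assumptions chosen for illustration, and the tonality tools (inverse filtering, noise floor, added sinusoids) are deliberately left out.

```python
import math


def patch_and_shape(lf_coeffs, band_edges, target_energies):
    """Illustrative only: copy LF coefficients upward, then match band energies.

    lf_coeffs       -- decoded low-frequency spectral coefficients (assumed layout)
    band_edges      -- HF band boundaries as indices relative to the start of HF
    target_energies -- one transmitted target energy per HF band (assumed format)
    """
    n_hf = band_edges[-1]
    # Patching: fill the HF region by repeating the LF coefficients.
    hf = [lf_coeffs[i % len(lf_coeffs)] for i in range(n_hf)]
    # Envelope shaping: one gain per band so the band energy meets the target.
    for b, target in enumerate(target_energies):
        lo, hi = band_edges[b], band_edges[b + 1]
        energy = sum(c * c for c in hf[lo:hi])
        gain = math.sqrt(target / energy) if energy > 0.0 else 0.0
        for i in range(lo, hi):
            hf[i] *= gain
    # The LF part is passed through unchanged; the shaped HF part is appended.
    return list(lf_coeffs) + hf


lf = [1.0, -2.0, 0.5, 3.0]
wideband = patch_and_shape(lf, band_edges=[0, 2, 4], target_energies=[1.0, 4.0])
```

In this sketch the fine structure of the HF part comes entirely from the LF part, while only the coarse per-band energies come from side information, which is why such a representation is cheap to transmit.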
The BWE exploits the correlation between LF and HF and aims at generating HF information that is as similar to the original HF content as possible. Such a BWE extends the frequency range up to a certain highest frequency Fmax. The choice of this highest frequency depends on a trade-off between quality and bitrate.
U.S. Pat. No. 6,680,972 B1 discloses a source coding enhancement technique using spectral band replication. Bandwidth reduction prior to or in the encoder is followed by spectral band replication at the decoder. This is accomplished by the use of transposition methods in combination with spectral envelope adjustments. A reduced bitrate at a given perceptual quality or an improved perceptual quality at a given bitrate is obtained.
A related technology is included in the MPEG-4 standard (ISO/IEC 14496-3: 2005(E)). Particularly, section 4.6.18 of this standard comprises the spectral band replication (SBR) tool. This tool extends the audio bandwidth of the decoded bandwidth-limited audio signal. This process is based on replication of the sequences of harmonics, previously truncated in order to reduce the data rate, from the available bandwidth-limited signal and control data obtained from the encoder. The ratio between tonal and noise-like components is maintained by adaptive inverse filtering as well as by an addition of noise and sinusoidals. The control data obtained from the encoder comprise spectral envelope adjustment data for adjusting the spectral envelope of the patched signal and, additionally, inverse filtering data for setting the ratio between tonal and noise-like components, information on noise to be added to the patched signal, and information on missing harmonics to be added to the patched signal within an SBR operation for generating a wideband signal.
This standardized procedure only performs a guided bandwidth extension, since the maximum frequency up to which a wideband signal is generated is also reflected by the parametric data attached to the lowband high resolution signal. Hence, improving the quality of the audio signal by generating a higher bandwidth signal needs additional parametric data, which in turn increases the bitrate of the transmitted data. On the other hand, when the bitrate is to be reduced for transmission channel capacity reasons, one might cut the parametric data for the highest band or for some of the highest bands of the replicated signal at the encoder. This automatically results in a reduction of the audio quality, since an SBR decoder will only generate a high frequency portion up to the frequency, i.e. up to the band, for which parametric data is included in the incoming data or bitstream. Hence, reducing the bitrate results in a reduction of the audio quality, and an enhancement of the audio quality results in an increase of the bitrate.
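The coupling between side-information bitrate and reconstructed bandwidth in a purely guided scheme can be made concrete with a small sketch. The band edges, the per-band bit cost, and the function names `guided_bandwidth` and `side_info_bits` are hypothetical values and helpers chosen for illustration; they are not taken from the SBR specification.

```python
def guided_bandwidth(core_cutoff_hz, band_upper_edges_hz, transmitted_params):
    """Highest frequency a purely guided decoder reconstructs (illustrative)."""
    fmax = core_cutoff_hz
    for edge, params in zip(band_upper_edges_hz, transmitted_params):
        if params is None:
            # No parametric data for this band: it and all higher bands
            # are not generated by the guided decoder.
            break
        fmax = edge
    return fmax


def side_info_bits(transmitted_params, bits_per_band):
    """Side-information cost, assuming a fixed (hypothetical) cost per band."""
    return sum(bits_per_band for p in transmitted_params if p is not None)


edges = [8000, 10000, 12000, 16000]   # hypothetical HF band upper edges in Hz
full = [0.9, 0.7, 0.5, 0.3]           # parameters sent for every band
cut = [0.9, 0.7, None, None]          # parameters for the two highest bands cut

full_fmax = guided_bandwidth(6000, edges, full)
cut_fmax = guided_bandwidth(6000, edges, cut)
saved_bits = side_info_bits(full, 20) - side_info_bits(cut, 20)
```

Under these assumptions, dropping the parameters for the two highest bands halves the side information but also lowers the reconstructed bandwidth from the top band edge to the edge of the last band that still has parameters.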