The present invention relates to bandwidth extension (BWE) of audio signals. BWE schemes are increasingly used in speech and audio coding/decoding to improve the perceived quality at a given bitrate. The main idea behind BWE is that part of an audio signal is not transmitted, but reconstructed (estimated) at the decoder from the received signal components.
Thus, in a BWE scheme a part of the signal spectrum is reconstructed in the decoder. The reconstruction is performed using certain features of the signal spectrum that has actually been transmitted using traditional coding methods. Typically the signal high band (HB) is reconstructed from certain low band (LB) audio signal features.
Dependencies between LB features and HB signal characteristics are often modeled by Gaussian mixture models (GMM) or hidden Markov models (HMM), e.g., [1-2]. The most often predicted HB characteristics are related to spectral and/or temporal envelopes.
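As an illustration of such a statistical mapping, the following minimal sketch computes a minimum mean-square error estimate of a single HB parameter from a single LB feature under a joint GMM. It is not taken from [1-2]; the function name `predict_hb_param`, the scalar feature and parameter, and the tuple layout of each mixture component are assumptions made only for this example.

```python
import math

def predict_hb_param(x, components):
    """Estimate E[y | x] under a joint two-dimensional GMM over (x, y).

    x          : scalar LB feature (hypothetical, e.g. a spectral tilt value)
    components : list of (weight, mu_x, mu_y, var_x, cov_xy) tuples, one per
                 mixture component (hypothetical layout for this sketch)
    """
    log_post = []
    cond_means = []
    for w, mu_x, mu_y, var_x, cov_xy in components:
        d = x - mu_x
        # Log of weight * N(x; mu_x, var_x): unnormalized component posterior.
        log_post.append(
            math.log(w) - 0.5 * (d * d / var_x + math.log(2.0 * math.pi * var_x))
        )
        # Conditional mean of y given x within this component.
        cond_means.append(mu_y + cov_xy / var_x * d)
    # Normalize the posteriors in a numerically stable way.
    m = max(log_post)
    post = [math.exp(lp - m) for lp in log_post]
    total = sum(post)
    # Posterior-weighted sum of the per-component conditional means.
    return sum(p / total * c for p, c in zip(post, cond_means))
```

With a single component the estimate reduces to ordinary linear regression of the HB parameter on the LB feature; additional components make the mapping piecewise and increase the number of trained parameters.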
There are two major types of BWE approaches:
In a first approach, HB signal characteristics are predicted entirely from certain LB features. These BWE solutions introduce artifacts in the reconstructed HB, which in some cases lead to decreased quality in comparison to the band-limited signal. Sophisticated mappings (e.g., based on GMM or HMM) easily lead to degradation with unknown data. The general experience is that the more complex the mapping (i.e., the larger the number of trained parameters), the more likely it is that artifacts will occur with data types not present in the training set. It is not trivial to find a mapping whose complexity gives an optimal balance between overall prediction accuracy and a low number of outliers (data that deviate markedly from the data in the training set, i.e., components that cannot be modeled very well).
A second approach (an example is described in [3]) is to reconstruct the HB signal from a combination of LB features and a small amount of transmitted HB information. BWE schemes with transmitted HB information tend to improve performance (at the cost of an increased bit budget), but do not offer a general scheme for combining transmitted and predicted parameters. Typically one set of HB parameters is transmitted and another set of HB parameters is predicted, which means that the transmitted information cannot compensate for failures in the predicted parameters.
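The limitation of the second approach can be illustrated with a toy example. The sketch below assumes, purely for illustration, that the overall HB level (a gain) is the transmitted parameter and that the HB envelope shape is the predicted parameter; the function name `reconstruct_hb` and all numerical values are hypothetical and are not taken from [3].

```python
import math

def reconstruct_hb(predicted_shape, transmitted_gain):
    """Scale a predicted envelope shape so that its norm equals the transmitted gain."""
    norm = math.sqrt(sum(v * v for v in predicted_shape))
    return [transmitted_gain * v / norm for v in predicted_shape]

# Hypothetical true HB sub-band magnitudes (falling spectral tilt).
true_env = [4.0, 2.0, 1.0, 0.5]
# Hypothetical mispredicted shape (the tilt is reversed).
bad_shape = [0.5, 1.0, 2.0, 4.0]

# Transmitted parameter: overall HB level measured at the encoder.
gain = math.sqrt(sum(v * v for v in true_env))

recon = reconstruct_hb(bad_shape, gain)

# The transmitted gain restores the total HB energy ...
energy_error = abs(sum(v * v for v in recon) - sum(v * v for v in true_env))
# ... but it has no influence on the error in the predicted shape.
shape_error = max(abs(r - t) for r, t in zip(recon, true_env))
```

In this example the total energy is matched, while the per-band error stays large, because the transmitted and predicted parameter sets are disjoint.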