Telephonic speech over mobile telephones has usually utilized only a portion of the audible sound spectrum, for example, narrow-band speech within the 300 to 3400 Hz audio spectrum. Compared to normal speech, such narrow-band speech has a muffled quality and reduced intelligibility. Therefore, various methods of extending the bandwidth of the output of speech coders, referred to as “bandwidth extension” or “BWE,” may be applied to artificially improve the perceived sound quality of the coder output.
Although BWE schemes may be parametric or non-parametric, most known BWE schemes are parametric. The parameters arise from the source-filter model of speech production, in which the speech signal is treated as an excitation source signal that has been acoustically filtered by the vocal tract. The vocal tract may be modeled by an all-pole filter, for example, using linear prediction (LP) techniques to compute the filter coefficients. The LP coefficients effectively parameterize the speech spectral envelope. Other parametric methods utilize line spectral frequencies (LSF), mel-frequency cepstral coefficients (MFCC), and log-spectral envelope samples (LES) to model the speech spectral envelope.
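By way of illustration only, the all-pole LP modeling described above may be sketched with the autocorrelation method and the Levinson-Durbin recursion. This is a minimal sketch and not a method taken from any particular coder; the function names, the sign convention A(z) = 1 + a[1]z^-1 + ... + a[p]z^-p, and the use of Python are assumptions made for this example.

```python
import math

def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one speech frame (assumed already windowed)."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LP normal equations from autocorrelation values r[0..order].

    Returns (a, err), where a[0] == 1.0 and the all-pole model is 1 / A(z)
    with A(z) = sum(a[k] * z**-k). Assumes r[0] > 0 (a non-silent frame).
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient for stage i
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)                # residual prediction error energy
    return a, err

def lp_envelope_db(a, err, freq_hz, sample_rate_hz):
    """Spectral envelope err / |A(e^{jw})|^2 in dB at a single frequency."""
    w = 2.0 * math.pi * freq_hz / sample_rate_hz
    re = sum(a[k] * math.cos(w * k) for k in range(len(a)))
    im = -sum(a[k] * math.sin(w * k) for k in range(len(a)))
    return 10.0 * math.log10(err / (re * re + im * im))
```

In this sketch, a frame would first be passed through autocorrelation, and the resulting values would be given to levinson_durbin to obtain the filter coefficients. Evaluating lp_envelope_db over a grid of frequencies then traces the spectral envelope that the coefficients parameterize.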
Many current speech/audio coders utilize the Modified Discrete Cosine Transform (MDCT) representation of the input signal. BWE methods that can be applied to MDCT-based speech/audio coders are therefore needed.
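For reference, the MDCT representation mentioned above may be sketched as follows. This is a direct, unoptimized evaluation of the standard MDCT definition with a sine window and 50% overlapped frames; the function names, the window choice, and the zero-padding scheme are assumptions made for this example and do not reflect any specific coder.

```python
import math

def sine_window(length):
    """Sine window; satisfies w[n]**2 + w[n + length // 2]**2 == 1."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]

def mdct(frame, window):
    """Map a windowed frame of 2N samples to N MDCT coefficients."""
    half = len(frame) // 2
    n0 = 0.5 + half / 2.0
    return [sum(frame[n] * window[n]
                * math.cos(math.pi / half * (n + n0) * (k + 0.5))
                for n in range(2 * half))
            for k in range(half)]

def imdct(coeffs, window):
    """Map N coefficients back to 2N windowed, time-aliased samples."""
    half = len(coeffs)
    n0 = 0.5 + half / 2.0
    return [2.0 / half * window[n]
            * sum(coeffs[k] * math.cos(math.pi / half * (n + n0) * (k + 0.5))
                  for k in range(half))
            for n in range(2 * half)]

def analysis_synthesis(signal, half):
    """MDCT then IMDCT with overlap-add; the aliasing cancels between frames."""
    window = sine_window(2 * half)
    tail = (-len(signal)) % half + half     # pad so every sample is in two frames
    padded = [0.0] * half + list(signal) + [0.0] * tail
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * half + 1, half):
        coeffs = mdct(padded[start:start + 2 * half], window)
        # A BWE scheme for an MDCT-based coder would operate on coeffs here.
        for n, y in enumerate(imdct(coeffs, window)):
            out[start + n] += y
    return out[half:half + len(signal)]
```

Because each frame of 2N samples yields only N coefficients, a single frame cannot be inverted on its own; the input is recovered only after overlap-adding adjacent frames. Practical coders compute the transform with a fast algorithm rather than the direct sums shown here.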