In recent years, demand for the encoding and decoding of audio content has increased. While the available bitrates and storage capacities for the transmission and storage of encoded audio content have grown substantially, there is still a demand for bitrate-efficient encoding, transmission, storage and decoding of audio content at reasonable quality, especially of speech signals in communication scenarios.
Contemporary speech coding systems are capable of encoding wideband (WB) digital audio content, that is, signals with frequencies of up to 7-8 kHz, at bitrates as low as 6 kbps. The most widely discussed examples are the ITU-T recommendations G.722.2 (cf., for example, reference [1]), the more recently developed G.718 (cf., for example, references [4] and [10]), and the MPEG unified speech and audio codec xHE-AAC (cf., for example, reference [8]). Both G.722.2, also known as AMR-WB, and G.718 employ bandwidth extension (BWE) techniques between 6.4 and 7 kHz to allow the underlying ACELP core-coder to "focus" on the perceptually more relevant lower frequencies (particularly those at which the human auditory system is phase-sensitive), and thereby achieve sufficient quality, especially at very low bitrates. In xHE-AAC, enhanced spectral band replication (eSBR) is used for bandwidth extension (BWE). The bandwidth extension process can generally be divided into two conceptual approaches:

- "Blind" or "artificial" BWE, in which high-frequency (HF) components are reconstructed from the decoded low-frequency (LF) core-coder signal alone, i.e. without necessitating side information transmitted from the encoder. This scheme is used by AMR-WB and G.718 at 16 kbps and below, as well as by some backward-compatible bandwidth extension post-processing systems operating on traditional narrowband telephone speech (cf., for example, references [5] and [9]).

- "Guided" BWE, which differs from blind bandwidth extension in that some of the parameters used for high-frequency (HF) content reconstruction are transmitted to the decoder as side information instead of being estimated from the decoded core signal. AMR-WB, G.718 and xHE-AAC, as well as some other codecs (cf., for example, references [2], [7] and [11]), use this approach, but not at very low bitrates.
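The distinction between the two approaches can be sketched in code. In blind BWE the decoder derives the missing HF band purely from the decoded LF spectrum, whereas in guided BWE a transmitted parameter (here, a gain) steers the reconstruction. The following is a minimal, hypothetical sketch; the copy-up rule, the default gain, and the function names are assumptions for illustration only, not the actual AMR-WB, G.718 or eSBR algorithms:

```python
def blind_bwe_frame(lf_spectrum, hf_bins, gain=0.5):
    """Blind BWE sketch: reconstruct HF bins from the decoded LF
    spectrum alone, with no side information from the encoder.
    The fixed attenuation `gain` is an illustrative assumption."""
    # Copy ("patch") the top hf_bins LF coefficients upward, attenuated.
    patch = [c * gain for c in lf_spectrum[-hf_bins:]]
    return list(lf_spectrum) + patch

def guided_bwe_frame(lf_spectrum, hf_bins, side_info_gain):
    """Guided BWE sketch: identical reconstruction rule, but the gain
    is received as side information rather than estimated/fixed."""
    return blind_bwe_frame(lf_spectrum, hf_bins, gain=side_info_gain)

# Usage: extend a 4-bin "core" magnitude spectrum by 2 synthetic HF bins.
core = [1.0, 0.8, 0.6, 0.4]
blind = blind_bwe_frame(core, hf_bins=2)            # gain fixed at decoder
guided = guided_bwe_frame(core, hf_bins=2,
                          side_info_gain=0.25)      # gain from bitstream
```

The guided variant spends extra bitrate on the transmitted gain but can match the true HF energy more closely, which is the tradeoff discussed below.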
However, it has been found that it is difficult to provide, at low bitrates, a bandwidth extension that reconstructs the audio content with sufficiently good quality.
Thus, there is a need for a bandwidth extension concept that provides an improved tradeoff between bitrate and audio quality.