Audio signals, such as speech or music, are encoded, for example, to enable efficient transmission or storage.
Audio encoders and decoders are used to represent audio-based signals, such as music and background noise. These coders typically do not utilise a speech model for the coding process; instead, they use processes that represent all types of audio signals, including speech.
Speech encoders and decoders (codecs) are usually optimised for speech signals, and can operate at either a fixed or variable bit rate.
An audio codec can also be configured to operate with varying bit rates. At lower bit rates, such an audio codec may work with speech signals at a coding rate equivalent to a pure speech codec. At higher bit rates, the audio codec may code any signal including music, background noise and speech, with higher quality and performance.
In some audio codecs the input signal is divided into a limited number of bands. Each of the band signals may be quantized. From the theory of psychoacoustics it is known that the highest frequencies in the spectrum are perceptually less important than the low frequencies. Some audio codecs reflect this in a bit allocation in which fewer bits are allocated to high frequency signals than to low frequency signals.
Furthermore, some codecs use the correlation between the low and high frequency bands or regions of an audio signal to improve coding efficiency.
As the higher frequency bands of the spectrum are typically quite similar to the lower frequency bands, some codecs may encode only the lower frequency bands and reproduce the upper frequency bands as a scaled copy of the lower frequency bands. Thus, by using only a small amount of additional control information, considerable savings can be achieved in the total bit rate of the codec.
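The bit allocation described above can be illustrated with a minimal sketch. The function name `allocate_bits` and the linearly decreasing weighting are assumptions made for illustration only; real codecs derive their allocation from a psychoacoustic model rather than a fixed rule.

```python
def allocate_bits(total_bits, num_bands):
    """Split total_bits across bands, giving lower bands larger shares."""
    # Hypothetical weighting: band 0 (lowest) gets weight num_bands,
    # the highest band gets weight 1.
    weights = [num_bands - k for k in range(num_bands)]
    weight_sum = sum(weights)
    bits = [total_bits * w // weight_sum for w in weights]
    # Bits lost to rounding down are handed to the lowest bands first.
    leftover = total_bits - sum(bits)
    for k in range(leftover):
        bits[k % num_bands] += 1
    return bits


allocation = allocate_bits(40, 4)
print(allocation)  # [16, 12, 8, 4]
```

With four bands and 40 bits, the lowest band receives four times as many bits as the highest one, and the shares always sum to the available budget.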
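As a rough sketch of the idea, the encoder could transmit a single gain per band as control information, and the decoder could rebuild the upper band by scaling its copy of the lower band. The names `estimate_gain` and `rebuild_high_band`, and the choice of an energy-matching gain, are assumptions for illustration rather than the method of any particular codec.

```python
import math


def estimate_gain(low_band, high_band):
    """Encoder side: one scale factor that matches the band energies."""
    low_energy = sum(x * x for x in low_band)
    high_energy = sum(x * x for x in high_band)
    if low_energy == 0.0:
        return 0.0
    return math.sqrt(high_energy / low_energy)


def rebuild_high_band(low_band, gain):
    """Decoder side: the upper band is a scaled copy of the lower band."""
    return [gain * x for x in low_band]


low = [1.0, -2.0, 3.0, -4.0]
high = [0.5, -1.0, 1.5, -2.0]
gain = estimate_gain(low, high)          # the only value sent for the upper band
approx_high = rebuild_high_band(low, gain)
```

In this toy case the upper band happens to be an exact scaled version of the lower band, so the reconstruction is perfect; for real signals the copy only approximates the original.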
One such approach to coding the high frequency region is known as higher frequency region (HFR) coding. One form of higher frequency region coding is spectral-band-replication (SBR), which has been developed by Coding Technologies. In SBR, a known audio coder, such as a Moving Picture Experts Group MPEG-4 Advanced Audio Coding (AAC) coder or an MPEG-1 Layer III (MP3) coder, codes the low frequency region. The higher frequency region is generated separately utilizing the coded low frequency region.
In SBR coding, the higher frequency region is obtained by transposing the lower frequency region to the higher frequencies. The transposition is based on a Quadrature Mirror Filter (QMF) bank with 32 bands and is performed such that the band samples from which each high frequency band sample is constructed are predefined. This mapping is independent of the characteristics of the input signal.
The higher frequency bands are then modified based on additional information. Filtering is applied to make particular features of the synthesized high frequency region more similar to those of the original. Additional components, such as sinusoids or noise, are added to the high frequency region to increase the similarity with the original high frequency region. Finally, the envelope is adjusted to follow the envelope of the original high frequency spectrum.
Higher frequency region coding, however, does not produce an identical copy of the original high frequency region. Specifically, the known higher frequency region coding mechanisms perform relatively poorly where the input signal is tonal, in other words where its spectrum is not similar to that of noise.
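The fixed, signal-independent nature of this transposition can be sketched as a lookup table built once and applied to every frame. The sketch assumes the QMF analysis has already produced 32 lists of subband samples; the crossover band and the simple wrap-around mapping are hypothetical and do not reproduce the actual SBR patching rules.

```python
def build_patch_map(num_bands=32, crossover=16):
    """Map each high band index to the low band index it is copied from."""
    # Hypothetical rule: walk through the low bands in order and wrap around.
    return {k: (k - crossover) % crossover for k in range(crossover, num_bands)}


def transpose(subbands, patch_map):
    """Fill the high bands by copying the predefined low bands."""
    out = [list(band) for band in subbands]
    for high_index, low_index in patch_map.items():
        out[high_index] = list(subbands[low_index])
    return out


patch_map = build_patch_map()            # built once, never inspects the signal
# Toy input: low band b holds the constant value b, high bands are empty.
subbands = [[float(b)] * 2 for b in range(16)] + [[0.0] * 2 for _ in range(16)]
transposed = transpose(subbands, patch_map)
```

Because `patch_map` is computed without looking at the samples, the same source band feeds each high band whatever the input is, which is the property the paragraph above describes.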
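The final envelope adjustment step can be sketched as a per-band gain that brings the energy of each synthesized band to a target value sent as additional information. The function name `adjust_envelope`, the use of total band energy as the envelope measure, and the small `eps` guard are assumptions made for this illustration.

```python
import math


def adjust_envelope(bands, target_energies, eps=1e-12):
    """Scale each synthesized band so its energy matches the target energy."""
    adjusted = []
    for band, target in zip(bands, target_energies):
        energy = sum(x * x for x in band)
        # eps avoids division by zero for silent bands (they stay silent).
        gain = math.sqrt(target / (energy + eps))
        adjusted.append([gain * x for x in band])
    return adjusted


synth = [[1.0, 1.0, 1.0, 1.0], [2.0, 0.0, -2.0, 0.0]]
targets = [16.0, 2.0]                    # energies of the original high bands
shaped = adjust_envelope(synth, targets)
shaped_energies = [sum(x * x for x in band) for band in shaped]
```

After the adjustment, each band's energy is approximately equal to its target, while the fine structure inside the band is still that of the transposed low band.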
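One common way to quantify how tonal or noise-like a spectrum is, offered here only as an illustrative assumption and not as part of the coding schemes described above, is the spectral flatness measure: the ratio of the geometric mean to the arithmetic mean of the power spectrum. Values near 1 indicate a noise-like spectrum and values near 0 indicate a tonal one.

```python
import math


def spectral_flatness(power_spectrum, eps=1e-12):
    """Geometric mean over arithmetic mean of the power values."""
    values = [p + eps for p in power_spectrum]   # eps keeps log() defined
    log_mean = sum(math.log(v) for v in values) / len(values)
    geometric_mean = math.exp(log_mean)
    arithmetic_mean = sum(values) / len(values)
    return geometric_mean / arithmetic_mean


noise_like = spectral_flatness([1.0, 1.0, 1.0, 1.0])   # flat spectrum
tonal = spectral_flatness([100.0, 0.01, 0.01, 0.01])   # one dominant peak
```

A flat spectrum yields a flatness close to 1, whereas the spectrum with a single dominant peak yields a value far below 1, which corresponds to the tonal case in which the known mechanisms perform poorly.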