In many contemporary communication systems, and especially in mobile communication systems, only limited transmission bandwidth is available for real-time audio transmissions such as speech or music. In order to transmit as many audio channels as possible over a transmission link with restricted bandwidth, such as a radio network, the audio signals to be transmitted are therefore frequently compressed using real-time or quasi-real-time audio encoding methods and decompressed after transmission. In this document the term audio is understood to include speech in particular.
With audio encoding methods of this type, the aim is generally to reduce the volume of data to be transmitted, and thereby the transmission rate, as much as possible without adversely affecting the subjective listening impression or, in the case of voice transmissions, comprehensibility.
An efficient compression of audio signals is also a significant factor in connection with storage or archiving of audio signals.
Encoding methods have proved especially efficient in which an audio signal synthesized by an audio synthesis filter is compared, frame by frame over time, with the audio signal to be transmitted, and the filter parameters are optimized so that the two match as closely as possible. Such a method of operation is frequently referred to as analysis-by-synthesis. The audio synthesis filter is excited by an excitation signal that is preferably likewise optimized. The filtering is frequently also referred to as formant synthesis. So-called LPC coefficients (LPC: Linear Predictive Coding) and/or parameters that specify a spectral and/or temporal envelope of the audio signal can be used as filter parameters, for example. The optimized filter parameters, as well as the parameters specifying the excitation signal, are then transmitted in time frames to the receiver, where a receive-side audio signal decoder uses them to form a synthetic audio signal that is as similar as possible to the original audio signal in terms of subjective audio impression.
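The analysis-by-synthesis principle described above can be illustrated by a minimal sketch. It is not the method of any particular recommendation: it assumes a plain autocorrelation-method LPC analysis solved by the Levinson-Durbin recursion, an all-pole synthesis filter, and an exhaustive search over a small list of candidate excitation signals. All function names (autocorrelation, levinson_durbin, lpc_residual, synthesize, best_excitation) are hypothetical and chosen for illustration only.

```python
def autocorrelation(frame, max_lag):
    """Return the autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations for the predictor coefficients.

    The predictor is s_hat(k) = sum_{i=1..order} a[i] * s(k - i).
    Returns (a, err); a[0] is unused and err is the prediction error energy.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:          # degenerate frame: stop at the current order
            break
        acc = r[m] - sum(a[i] * r[m - i] for i in range(1, m))
        k = acc / err           # reflection coefficient of stage m
        prev = a[:]
        a[m] = k
        for i in range(1, m):
            a[i] = prev[i] - k * prev[m - i]
        err *= 1.0 - k * k
    return a, err


def lpc_residual(frame, a):
    """Inverse filtering: e(k) = s(k) - sum_i a[i] * s(k - i)."""
    order = len(a) - 1
    return [frame[k] - sum(a[i] * frame[k - i]
                           for i in range(1, order + 1) if k - i >= 0)
            for k in range(len(frame))]


def synthesize(excitation, a):
    """All-pole synthesis filter: s(k) = e(k) + sum_i a[i] * s(k - i)."""
    order = len(a) - 1
    out = []
    for k, e in enumerate(excitation):
        out.append(e + sum(a[i] * out[k - i]
                           for i in range(1, order + 1) if k - i >= 0))
    return out


def best_excitation(target, a, candidates):
    """Analysis-by-synthesis search (assumed, simplified): return the index
    of the candidate whose synthesized output is closest to the target in
    the squared-error sense."""
    def squared_error(excitation):
        synth = synthesize(excitation, a)
        return sum((t - s) ** 2 for t, s in zip(target, synth))
    return min(range(len(candidates)),
               key=lambda i: squared_error(candidates[i]))
```

In this sketch the encoder would transmit the coefficients a and the index returned by best_excitation for each frame; the decoder would call synthesize with the same coefficients and the indexed excitation. A practical codec additionally quantizes the parameters and typically weights the error perceptually, which is omitted here.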
Such an audio encoding method is known from ITU-T recommendation G.729. By means of the audio encoding method described therein a real time audio signal with a bandwidth of 4 kHz can be reduced to a transmission rate of 8 kbit/s.
In addition, efforts are currently being made to synthesize an audio signal to be transmitted using a higher bandwidth in order to improve the audio impression. In the extension G.729EV of the G.729 recommendation, which is currently under discussion, an attempt is being made to expand the audio bandwidth from 4 kHz to 8 kHz.
The transmission bandwidth and the audio synthesis quality that can be achieved depend largely on the creation of a suitable excitation signal.
In the case of a bandwidth expansion for which an excitation signal unb(k) already exists in a low subband, e.g. in the frequency range from 50 Hz to 3.4 kHz, a bandwidth-expanding excitation signal can be formed in a high subband, e.g. in the frequency range from 3.4 kHz to 7 kHz, as a spectral copy of the narrowband excitation signal unb(k). (Here and below, the index k denotes the index of sampling values of the excitation signal or of other signals.) The copy can be formed by spectral translation or by spectral mirroring of the narrowband excitation signal unb(k). However, such spectral translation or mirroring distorts the spectrum of the excitation signal anharmonically and/or causes a significant phase error in the spectrum, which leads to an audible loss of quality of the audio signal.
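The anharmonic distortion caused by spectral mirroring can be made concrete with a small numerical sketch. The values below are assumptions chosen for illustration and are not taken from the text: a narrowband sampling rate of 8 kHz, a pitch of 300 Hz, a frame of 80 samples, and mirroring realized by inserting a zero after every sample, which doubles the sampling rate and leaves a mirrored copy of the spectrum above 4 kHz. The function names are hypothetical.

```python
import cmath
import math

FS_NB = 8000   # assumed narrowband sampling rate in Hz
F0 = 300       # assumed pitch of the illustrative excitation in Hz
N = 80         # assumed frame length; yields 100 Hz DFT bins at both rates


def harmonic_excitation(num_harmonics):
    """Narrowband excitation consisting of the first harmonics of F0."""
    return [sum(math.cos(2 * math.pi * h * F0 * k / FS_NB)
                for h in range(1, num_harmonics + 1))
            for k in range(N)]


def mirror_into_high_band(x):
    """Insert a zero after every sample. At the doubled sampling rate each
    component at frequency f reappears mirrored at FS_NB - f."""
    y = []
    for sample in x:
        y.extend((sample, 0.0))
    return y


def peak_frequencies(x, fs, threshold=0.25):
    """Frequencies in Hz (0 .. fs/2) whose DFT magnitude exceeds
    threshold times the largest magnitude."""
    n = len(x)
    mags = [abs(sum(x[k] * cmath.exp(-2j * math.pi * m * k / n)
                    for k in range(n)))
            for m in range(n // 2 + 1)]
    top = max(mags)
    return [round(m * fs / n) for m, v in enumerate(mags) if v > threshold * top]


wideband = mirror_into_high_band(harmonic_excitation(3))
peaks = peak_frequencies(wideband, 2 * FS_NB)
high_band = [f for f in peaks if f > FS_NB // 2]
off_grid = [f for f in high_band if f % F0 != 0]
print(peaks)      # [300, 600, 900, 7100, 7400, 7700]
print(off_grid)   # [7100, 7400, 7700]
```

Under these assumptions the harmonics at 300, 600 and 900 Hz reappear at 7700, 7400 and 7100 Hz. None of these is an integer multiple of 300 Hz, so the mirrored components no longer continue the harmonic series of the low subband.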