Many modern communication systems transmit speech and audio signals in frames, meaning that the sending side first arranges the signal in short segments or frames of, e.g., 20-40 ms, which are subsequently encoded and transmitted as a logical unit in, e.g., a transmission packet. The receiver decodes each of these units and reconstructs the corresponding signal frames, which in turn are finally output as a continuous sequence of reconstructed signal samples. Prior to encoding there is usually an analog to digital (A/D) conversion that converts the analog speech or audio signal from a microphone into a sequence of audio samples. Conversely, at the receiving end, there is typically a final digital to analog (D/A) conversion that converts the sequence of reconstructed digital signal samples into a time continuous analog signal for loudspeaker playback.
Almost any such transmission system for speech and audio signals may, however, suffer from transmission errors. As a result, one or several of the transmitted frames may not be available at the receiver for reconstruction. In that case, the decoder has to generate a substitution signal for each of the erased, i.e. unavailable, frames. This is done in the so-called frame loss or error concealment unit of the receiver-side signal decoder. The purpose of the frame loss concealment is to make the frame loss as inaudible as possible and hence to mitigate the impact of the frame loss on the reconstructed signal quality as much as possible.
One recent frame loss concealment method for audio is the so-called ‘Phase ECU’. This method provides particularly high quality of the restored audio signal after packet or frame loss when the signal is a music signal. There is also a controlling method, disclosed in a previous application, that controls the behavior of a frame loss concealment method of Phase ECU type in response to, for instance, (statistical) properties of frame losses.
Burstiness of the frame losses is used as one indicator in the controlling method, in response to which a frame loss concealment method like Phase ECU can be adapted. In general terms, burstiness of frame losses means that several frame losses occur in a row, making it hard for the frame loss concealment method to use valid recently decoded signal portions for its operation. More specifically, a typical state-of-the-art frame loss burstiness indicator is the number n of observed consecutive frame losses. This number can be maintained in a counter which is incremented by one upon each new frame loss and reset to zero upon the reception of a valid frame.
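The counter described above can be illustrated with a minimal sketch. The class name BurstLossCounter and the boolean frame_received flag are hypothetical names chosen for this illustration; only the increment-on-loss and reset-on-valid-frame behavior comes from the description.

```python
class BurstLossCounter:
    """Tracks n, the number of consecutive lost frames (hypothetical helper)."""

    def __init__(self):
        self.n = 0

    def update(self, frame_received: bool) -> int:
        """Update n for one frame interval and return its new value."""
        if frame_received:
            self.n = 0   # a valid frame ends the loss burst
        else:
            self.n += 1  # one more consecutive frame loss
        return self.n


# Example reception pattern: True = frame received, False = frame lost.
counter = BurstLossCounter()
pattern = [True, False, False, False, True, False]
history = [counter.update(ok) for ok in pattern]
```

In this example the counter reaches 3 during the burst of three losses and restarts from zero after the next valid frame.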
A specific adaptation method of a frame loss concealment method like Phase ECU in response to frame loss burstiness is frequency-selective adjustment of the phases or the spectrum magnitudes of a substitution frame spectrum Z(m), m being a frequency index of a frequency domain transform like the Discrete Fourier Transform (DFT). The magnitude adaptation is done with an attenuation factor α(m) that scales the frequency transform coefficient at index m towards 0 as the frame loss burst counter n increases. The phase adaptation is done through increasing additive randomization of the phase of the frequency transform coefficient at index m, i.e. by adding a random phase component θ(m) whose range grows with n.
Hence, if the original substitution frame spectrum of the Phase ECU follows an expression like $Z(m) = Y(m) \cdot e^{j\theta_k}$, then the adapted substitution frame spectrum follows an expression like $Z(m) = \alpha(m) \cdot Y(m) \cdot e^{j(\theta_k + \theta(m))}$.
Herein, the phase $\theta_k$, with $k = 1 \ldots K$, is a function of index m and the K spectral peaks identified by the Phase ECU method, and Y(m) is a frequency domain representation (spectrum) of a frame of the previously received audio signal.
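The adapted expression can be sketched as follows. This is an illustration under stated assumptions, not the Phase ECU implementation: the linear schedules in attenuation and dither_range, the parameter n_full, and the function names are all hypothetical, since the description only states that α(m) decreases to 0 and that the random phase component grows with the burst counter n. The sketch also takes the per-bin phase θk as a given input and ignores the conjugate symmetry that a real-valued signal's DFT would require.

```python
import cmath
import math
import random


def attenuation(n, m, n_full=10):
    """Hypothetical alpha(m): 1 for n <= 1, falling linearly to 0 at n = n_full.

    The same value is used for every frequency index m in this sketch.
    """
    return max(0.0, min(1.0, 1.0 - (n - 1) / (n_full - 1)))


def dither_range(n, n_full=10):
    """Hypothetical bound on |theta(m)|: grows linearly from 0 to pi with n."""
    return math.pi * max(0.0, min(1.0, (n - 1) / (n_full - 1)))


def adapted_substitution_spectrum(Y, theta_k, n, rng):
    """Return Z(m) = alpha(m) * Y(m) * exp(j * (theta_k + theta(m))) per bin.

    Y       : list of complex spectrum coefficients of the prototype frame
    theta_k : list of per-bin phase values (assumed already computed)
    n       : number of consecutive frame losses
    rng     : random.Random instance used for the phase randomization
    """
    Z = []
    for m, (y, th) in enumerate(zip(Y, theta_k)):
        theta_m = rng.uniform(-1.0, 1.0) * dither_range(n)  # random phase term
        Z.append(attenuation(n, m) * y * cmath.exp(1j * (th + theta_m)))
    return Z


# Small example with three arbitrary coefficients.
Y = [1 + 0j, 0.5j, -2 + 0j]
theta_k = [0.0, 0.1, 0.2]
Z_first_loss = adapted_substitution_spectrum(Y, theta_k, 1, random.Random(0))
Z_long_burst = adapted_substitution_spectrum(Y, theta_k, 10, random.Random(0))
```

With these assumed schedules, the first lost frame reproduces the unadapted expression, while a burst of n_full losses is attenuated to silence, which is the muting behavior discussed for long bursts.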
Despite the advantages of the above-described adaptation method of the Phase ECU in conditions of burst frame loss, there are still quality shortcomings in the case of very long loss bursts, e.g. when n is greater than or equal to 5. In that case the quality of the reconstructed audio signal may, e.g., suffer from tonal artifacts, despite the performed phase randomization. The increasing magnitude attenuation may reduce these audible shortcomings. However, for long frame loss bursts the attenuation of the signal may be perceived as muting or signal dropouts. This may in turn affect the overall quality of, e.g., music or the ambient noise of a speech signal, since such signals are sensitive to overly strong level variations.
Hence, there is still a need for improved frame loss concealment.