Embodiments according to the invention create error concealment units for providing an error concealment audio information for concealing a loss of one or more audio frames in an encoded audio information.
Embodiments according to the invention create audio decoders for providing a decoded audio information on the basis of an encoded audio information, the decoders comprising error concealment units.
Some embodiments according to the invention create methods for providing an error concealment audio information for concealing a loss of an audio frame in an encoded audio information.
Some embodiments according to the invention create computer programs for performing one of said methods.
Some embodiments are related to a usage of an adaptive damping factor for frequency domain audio codecs.
In recent years there has been an increasing demand for digital transmission and storage of audio contents. However, audio contents are often transmitted over unreliable channels, which brings along the risk that data units (for example, packets) comprising one or more audio frames (for example, in the form of an encoded representation, such as an encoded frequency domain representation or an encoded time domain representation) are lost. In some situations, it would be possible to request a repetition (resending) of lost audio frames (or of data units, like packets, comprising one or more lost audio frames). However, this would typically bring a substantial delay, and would therefore entail an extensive buffering of audio frames. In other cases, it is hardly possible to request a repetition of lost audio frames.
In order to obtain a good, or at least acceptable, audio quality when audio frames are lost, without providing extensive buffering (which would consume a large amount of memory and would also substantially degrade the real time capabilities of the audio coding), it is desirable to have concepts to deal with a loss of one or more audio frames. In particular, it is desirable to have concepts which bring along a good audio quality, or at least an acceptable audio quality, even in the case that audio frames are lost.
In the past, some error concealment concepts have been developed, which can be employed in different audio coding concepts. A conventional concealment technique in Advanced Audio Coding (AAC) is noise substitution. It operates in the frequency domain and is suited for noise-like and music items.
Fade out techniques have also been developed to reduce the intensity of the substituting frames (or spectral values). These techniques are often based on scaling the substituting frame by a predetermined coefficient (damping factor). Normally, the damping factor is represented as a value between 0 and 1: the lower the damping factor, the stronger the fade out.
In case of packet losses, speech and audio codecs usually fade towards zero or background noise to prevent annoying repetition artefacts. In G.719 [1], for example, the synthesized signal is decreasingly scaled with a factor of 0.5 and then used as the reconstructed transform coefficients for the current frame. For all AAC family decoders like [2], the concealed spectrum is faded out with a constant damping factor equal to √0.5 ≈ 0.7071 when no additional delay is allowed. This damping factor is applied to the complete spectrum regardless of the signal characteristics.
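The constant-factor fade out described above can be illustrated with a minimal sketch. It is not taken from any of the cited standards: the function name conceal_with_constant_damping and its parameters are hypothetical, the sketch assumes that the damping factor is applied once per consecutive lost frame (so that the attenuation accumulates), and it leaves out the sign scrambling or noise shaping that a real noise substitution would add.

```python
import math

# Damping factors quoted in the text above.
AAC_DAMPING = math.sqrt(0.5)   # approximately 0.7071
G719_DAMPING = 0.5


def conceal_with_constant_damping(last_good_spectrum, num_lost_frames, damping):
    """Return one substitute spectrum per lost frame (hypothetical helper).

    Assumption: each lost frame repeats the previous substitute spectrum,
    scaled by the same constant factor, over the complete spectrum and
    without looking at any signal characteristics.
    """
    if not 0.0 <= damping <= 1.0:
        raise ValueError("damping factor is expected to lie between 0 and 1")
    concealed = []
    current = list(last_good_spectrum)
    for _ in range(num_lost_frames):
        # The same factor is applied to every spectral value.
        current = [damping * value for value in current]
        concealed.append(current)
    return concealed


# Two consecutive lost frames after a properly decoded frame.
frames = conceal_with_constant_damping([1.0, -2.0, 0.5], 2, AAC_DAMPING)
```

Under the stated assumption, two consecutive lost frames with the factor √0.5 leave half of the original amplitude, which is the attenuation that the factor 0.5 reaches after a single frame. In both cases a frame that ends a word is repeated at a still clearly audible level.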
However, especially for speech or transient signals, such a fade out technique is not completely satisfactory. When the first lost frame comes right after a word end, the noise substitution will imply the repetition of the previous properly decoded audio frame, i.e. the frame in which the word ends: a useless part of speech (carrying no information) will be repeated, implying annoying post echoes. See, for example, FIG. 10 (with echo) in comparison with FIG. 11 (where no echo is present). FIGS. 10 and 11 represent frequency on the ordinate and time on the abscissa (in units of hundred ms, or hms).
This echo is a direct consequence of the repetition of the properly decoded audio frame.
It would be of advantage to overcome such a technical impairment. G.729.1 [3] and EVS [4] propose adaptive fade out techniques, which depend on the stability of the signal characteristics. The fade out factor depends on the parameters of the last good received superframe class and on the number of consecutive erased superframes. The factor is further dependent on the stability of the LP filter for UNVOICED superframes (a classification between VOICED and UNVOICED frames being carried out). As there are no signal characteristics available in AAC decoders like AAC-ELD [5], the codec damps the concealed signal blindly with a fixed factor, which can lead to the annoying repetition artefacts discussed above.
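The dependencies named above (class of the last good superframe, number of consecutive erased superframes, LP filter stability for UNVOICED superframes) can be sketched as follows. This is only an illustration of the structure of such an adaptive rule: the function adaptive_damping, its parameters, and every numeric value are invented for this sketch and are not taken from G.729.1 or EVS.

```python
def adaptive_damping(last_good_class, consecutive_lost, lp_stability=1.0):
    """Return a hypothetical per-frame fade out factor.

    last_good_class  -- "VOICED" or "UNVOICED", class of the last good superframe
    consecutive_lost -- number of consecutive erased superframes (1 or more)
    lp_stability     -- assumed LP filter stability measure in [0, 1],
                        only used for UNVOICED superframes

    All constants below are illustrative assumptions, not standard values.
    """
    base_by_class = {"VOICED": 0.9, "UNVOICED": 0.8}
    if last_good_class not in base_by_class:
        raise ValueError("unknown superframe class: %r" % (last_good_class,))
    factor = base_by_class[last_good_class]

    if last_good_class == "UNVOICED":
        # A less stable LP filter leads to a stronger fade out.
        factor *= 0.5 + 0.5 * lp_stability

    if consecutive_lost > 2:
        # Fade out faster once several superframes in a row are erased.
        factor *= 0.5

    return factor


first_voiced_loss = adaptive_damping("VOICED", 1)
long_voiced_burst = adaptive_damping("VOICED", 3)
unstable_unvoiced = adaptive_damping("UNVOICED", 1, lp_stability=0.0)
```

A decoder such as AAC-ELD has none of these inputs available, which is why it falls back to the single fixed factor.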
In some conditions it has been found that annoying artefacts can be generated by holes in the spectral representation.