The present invention relates to the field of encoding/decoding in telecommunications, and more particularly to the field of frame loss correction in decoding.
A “frame” is an audio segment composed of at least one sample (the invention applies to the loss of one or more samples in coding according to G.711, as well as to the loss of one or more packets of samples in coding according to standards G.723, G.729, etc.).
Losses of audio frames occur when a real-time communication using an encoder and a decoder is disrupted by the conditions of a telecommunications network (radiofrequency problems, congestion of the access network, etc.). In this case, the decoder uses frame loss correction mechanisms to attempt to replace the missing signal with a signal reconstructed from information available at the decoder (for example, the audio signal already decoded for one or more past frames). This technique makes it possible to maintain quality of service despite degraded network performance.
Frame loss correction techniques are often highly dependent on the type of coding used.
In the case of CELP coding, it is common to repeat certain parameters decoded in the previous frame (spectral envelope, pitch, gains from codebooks), with adjustments such as modifying the spectral envelope to converge toward an average envelope or using a random fixed codebook.
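Purely by way of illustration of this family of techniques (the parameter names, attenuation factor and smoothing factor below are assumptions for the sketch, not values taken from any particular standard), the repetition of previously decoded CELP parameters might be sketched as follows:

```python
import random

def conceal_celp_parameters(last_params, mean_lsf,
                            gain_attenuation=0.9, envelope_smoothing=0.1):
    """Sketch: reuse the previous frame's decoded CELP parameters for a
    lost frame. last_params is assumed to hold the last decoded spectral
    envelope ('lsf'), pitch lag ('pitch') and codebook gains ('gains');
    mean_lsf is a long-term average envelope. All names are illustrative."""
    # Adjust the spectral envelope so it converges toward the average envelope.
    lsf = [(1.0 - envelope_smoothing) * x + envelope_smoothing * m
           for x, m in zip(last_params['lsf'], mean_lsf)]
    return {
        'lsf': lsf,
        'pitch': last_params['pitch'],                 # repeat the pitch lag
        'gains': [g * gain_attenuation                 # attenuate codebook gains
                  for g in last_params['gains']],
        # Replace the fixed-codebook contribution with random noise.
        'fixed_cb': [random.uniform(-1.0, 1.0) for _ in range(40)],
    }
```

Repeated concealment with such a scheme progressively attenuates the gains and drifts the envelope toward its long-term mean, so the synthesized signal fades out rather than looping audibly.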
In the case of transform coding, the most widely used technique for correcting frame loss consists of repeating the last frame received if a single frame is lost, and setting the repeated frame to zero as soon as more than one frame is lost. This technique is found in many coding standards (G.719, G.722.1, G.722.1C). One can also cite the G.711 coding standard, for which an example of frame loss correction described in its Appendix I identifies a fundamental period (called the “pitch period”) in the already decoded signal and repeats it, overlapping and adding the already decoded signal and the repeated signal (“overlap-add”). Such overlap-add “erases” audio artifacts, but its implementation requires an additional delay in the decoder, corresponding to the duration of the overlap.
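The pitch-repetition and overlap-add scheme can be illustrated in simplified form as follows (a sketch only: the linear cross-fade, the parameter values and the assumption that the pitch lag has already been estimated are simplifications relative to the actual G.711 Appendix I procedure):

```python
def conceal_lost_frame(history, frame_len, pitch_lag, overlap_len):
    """Sketch: repeat the last pitch period of the decoded signal and
    overlap-add it with the tail of that signal. The decoder's extra
    delay is what keeps the tail available for modification."""
    # Take the last pitch period ending just before the overlap region,
    # so the repeated signal is time-aligned with the decoded tail.
    avail = history[:len(history) - overlap_len]
    period = avail[-pitch_lag:]
    need = overlap_len + frame_len
    synth = (period * (-(-need // pitch_lag)))[:need]   # tile, then trim

    # Linear cross-fade over the overlap region ("overlap-add").
    out = []
    for i in range(overlap_len):
        w = i / overlap_len
        out.append((1.0 - w) * history[len(history) - overlap_len + i]
                   + w * synth[i])
    # The blended samples replace the decoded tail; the remainder of the
    # synthesized signal fills the lost frame.
    out.extend(synth[overlap_len:])
    return out
```

On a perfectly periodic input the repeated period continues the waveform exactly, so the cross-fade is transparent; on real signals the fade masks the phase mismatch at the splice point.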
Moreover, in the case of coding standard G.722.1, a modulated lapped transform (MLT) with a 50% overlap-add and sinusoidal windows ensures a transition between the last frame received before the loss and the repeated frame that is gradual enough to erase artifacts related to simple repetition of the frame in the case of a single lost frame. Unlike the frame loss correction described in the G.711 standard (Appendix I), this embodiment requires no additional delay, because it makes use of the existing delay and the temporal aliasing of the MLT transform to implement an overlap-add with the reconstructed signal.
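The role of the 50% overlap with sinusoidal windows can be illustrated numerically (a simplified sketch, not the G.722.1 implementation itself; the half-window length N below is illustrative): the rising half of one window and the falling half of the previous one satisfy the perfect-reconstruction (Princen-Bradley) condition, so the overlap-add acts as a smooth power-complementary cross-fade at no extra delay.

```python
import math

N = 16  # frame advance (half of the 2N-sample MLT window); illustrative value

# Sinusoidal MLT window over 2N samples.
w = [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

# Princen-Bradley condition: w[n]^2 + w[N+n]^2 = 1 for each n in the
# 50% overlap, i.e. the fade-in of one frame and the fade-out of the
# previous frame sum to unity power.
for n in range(N):
    assert abs(w[n] ** 2 + w[N + n] ** 2 - 1.0) < 1e-12

def crossfade_overlap(prev_tail, repeated_head):
    """Overlap-add of the previous frame's windowed tail with the
    repeated frame's windowed head; analysis plus synthesis windowing
    yields the squared-window weights used here."""
    return [w[N + n] ** 2 * prev_tail[n] + w[n] ** 2 * repeated_head[n]
            for n in range(N)]
```

Because the squared weights sum to one at every sample, a constant signal passes through the overlap region unchanged, and any mismatch between the two frames is faded in gradually rather than appearing as a step.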
This technique is inexpensive, but its main fault is an inconsistency between the signal decoded before the frame loss and the repeated signal. This results in a phase discontinuity that can produce significant audio artifacts if the duration of the overlap between the two frames is short, as is the case when the windows used for the MLT transform are “short delay” windows as described in document FR 1350845 with reference to FIGS. 1A and 1B of that document. In such a case, even a solution combining a pitch search, as in the coder according to standard G.711 (Appendix I), with an overlap-add using the window of the MLT transform is not sufficient to eliminate audio artifacts.
Document FR 1350845 proposes a hybrid method that combines the advantages of both these approaches to maintain phase continuity in the transform domain. The present invention is defined within this framework. The solution proposed in FR 1350845 is described in detail below with reference to FIG. 1.
Although it is particularly promising, this solution requires improvement because, when the encoded signal contains a single fundamental period (“mono pitch”), for example in a voiced segment of a speech signal, the audio quality after frame loss correction may be degraded and inferior to that of frame loss correction based on a speech model of the CELP (“Code-Excited Linear Prediction”) type.