The signal takes the form of a succession of samples divided into successive frames, where a “frame” is understood to mean a signal segment composed of several samples (an implementation in which a frame comprises one single sample is also possible, as for example in codecs according to the ITU-T G.711 recommendation).
The invention is in the digital signal processing field, in particular but not exclusively, in the field of coding/decoding an audio signal. Frame losses occur when communication (either by real-time transmission, or by storage for subsequent transmission) using a coder and a decoder is disrupted by channel conditions (e.g. because of radio problems, access network congestion, etc.).
In this case, the decoder uses frame loss correction (or “concealment”) mechanisms in order to attempt to substitute a reconstructed signal for the missing signal by using information available within the decoder (for example, the already decoded signal or parameters received in preceding frames). With this technique, good service quality can be maintained despite degraded channel performance.
Frame loss correction techniques are most often highly dependent on the type of coding used.
In the case of speech coding based on CELP (“Code Excited Linear Prediction”) type technologies, frame loss correction makes use in particular of the CELP model. For example, in coding according to the ITU-T G.722.2 recommendation, the solution for replacing a lost frame (or “packet”) consists of extending the use of the long-term prediction gain while attenuating it, and also extending the use of each ISF (“Immittance Spectral Frequency”) parameter by making it tend towards its respective average. The pitch of the speech signal (parameter designated “LTP lag”) is also repeated. Additionally, random values for the parameters characterizing the “innovation” (the excitation in CELP coding) are supplied to the decoder.
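The parameter extrapolation described above can be sketched as follows. This is a minimal illustration only, not the procedure specified in G.722.2: the function name `conceal_celp_params`, the dictionary layout, and the constants `gain_atten` and `isf_alpha` are assumptions chosen for the example.

```python
import random

def conceal_celp_params(prev, isf_mean, gain_atten=0.9, isf_alpha=0.9,
                        n_innov=4, rng=None):
    """Extrapolate CELP-style parameters for one lost frame.

    prev     : parameters of the last correctly received frame
    isf_mean : long-term average of each ISF parameter
    All constants are hypothetical, not values taken from G.722.2.
    """
    rng = rng or random.Random(0)
    return {
        # Reuse the long-term prediction gain, attenuated.
        "ltp_gain": gain_atten * prev["ltp_gain"],
        # Repeat the pitch ("LTP lag") unchanged.
        "ltp_lag": prev["ltp_lag"],
        # Move each ISF parameter towards its own average.
        "isf": [isf_alpha * f + (1.0 - isf_alpha) * m
                for f, m in zip(prev["isf"], isf_mean)],
        # Supply random values for the innovation (excitation).
        "innovation": [rng.uniform(-1.0, 1.0) for _ in range(n_innov)],
    }

prev = {"ltp_gain": 0.8, "ltp_lag": 57, "isf": [100.0, 300.0]}
out = conceal_celp_params(prev, isf_mean=[200.0, 200.0])
```

Calling the function repeatedly on its own output lowers the gain further and brings the ISF parameters closer to their averages with each consecutive lost frame.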
It should be noted at this point that applying this type of method to transform coding, or to PCM or ADPCM type waveform coding, requires a CELP type parametric analysis of the past signal in the decoder, which introduces additional complexity.
In the ITU-T G.711 recommendation, which corresponds to a waveform coder, an informative example of frame loss correction processing (given in Appendix I of the text of this recommendation) consists of finding a pitch period in the already decoded speech signal and repeating the last pitch period, with an overlap-add between the already decoded signal and the repeated signal (reconstructed by concealment). This processing can “smooth” the audio artifacts but requires an additional delay in the decoder (a delay corresponding to the overlap duration).
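The principle of pitch repetition with a crossfade can be sketched as follows. This is a simplified illustration under stated assumptions, not the Appendix I algorithm itself: the function names, the autocorrelation-based pitch search, and the linear crossfade weights are all hypothetical choices. The crossfaded tail shows why the decoder must hold back its most recent samples, which is the source of the additional delay.

```python
import math

def find_pitch(history, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the highest normalized correlation."""
    n = len(history)
    assert n >= 2 * max_lag, "history too short for the requested lag range"
    best_lag, best_score = min_lag, -float("inf")
    for lag in range(min_lag, max_lag + 1):
        idx = range(n - max_lag, n)
        num = sum(history[i] * history[i - lag] for i in idx)
        den = math.sqrt(sum(history[i - lag] ** 2 for i in idx)) or 1.0
        if num / den > best_score:
            best_lag, best_score = lag, num / den
    return best_lag

def conceal_by_pitch_repetition(history, n_lost, min_lag, max_lag, n_overlap):
    """Return (lag, crossfaded tail of the history, synthesized lost samples)."""
    n = len(history)
    lag = find_pitch(history, min_lag, max_lag)
    tail = []
    for j in range(n_overlap):
        w = (j + 1) / (n_overlap + 1)                 # weight of the repeated signal
        real = history[n - n_overlap + j]             # already decoded sample
        repeated = history[n - n_overlap + j - lag]   # same position, one period earlier
        tail.append((1.0 - w) * real + w * repeated)
    # Repeat the last pitch period for the duration of the loss.
    synth = [history[n - lag + (k % lag)] for k in range(n_lost)]
    return lag, tail, synth

# A sinusoid with a period of 40 samples stands in for voiced speech.
history = [math.sin(2 * math.pi * i / 40) for i in range(200)]
lag, tail, synth = conceal_by_pitch_repetition(history, 80, 20, 60, 10)
```

For a perfectly periodic input such as this one, the repeated period continues the signal without a visible seam; real speech is only approximately periodic, which is why the crossfade is needed.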
The most widely used technique for correcting frame loss in the case of transform coding consists of repeating the spectrum decoded in the last frame received. For example, in the case of coding according to the ITU-T G.722.1 recommendation, the MLT (“modulated lapped transform”), equivalent to a modified discrete cosine transform (MDCT) with a 50% overlap and sine-shaped analysis/synthesis windows, serves to provide a sufficiently slow transition between the last lost frame and the repeated frame to smooth the artifacts related to the simple repetition of the spectrum; typically, the repeated spectrum is set to zero if more than one frame is lost.
Advantageously, this concealment method does not require additional delay because it makes use of the overlap-add between the reconstructed signal and the past signal in order to make a sort of “crossfade” (with temporal aliasing due to the MLT transform). It is a technique with a very low resource cost.
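Spectrum repetition in a transform decoder can be sketched as follows, using a textbook MDCT with a sine window and 50% overlap. This is an illustrative model only, not the G.722.1 decoder: the function names, the frame size, and the convention of marking a lost frame with `None` are assumptions made for the example.

```python
import math

def sine_window(N):
    """Sine window of length 2N (satisfies w[n]^2 + w[n+N]^2 = 1)."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def mdct(block, w):
    """Windowed MDCT: 2N samples in, N coefficients out."""
    N = len(block) // 2
    return [sum(w[n] * block[n] *
                math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N)) for k in range(N)]

def imdct(spec, w):
    """Windowed inverse MDCT: N coefficients in, 2N aliased samples out."""
    N = len(spec)
    return [w[n] * (2.0 / N) *
            sum(spec[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for k in range(N)) for n in range(2 * N)]

def decode(spectra, N):
    """Decode a list of spectra; None marks a lost frame."""
    w = sine_window(N)
    memory, last, n_lost, out = [0.0] * N, [0.0] * N, 0, []
    for spec in spectra:
        if spec is None:
            n_lost += 1
            # Repeat the last spectrum once, then mute on further losses.
            spec = last if n_lost == 1 else [0.0] * N
        else:
            n_lost = 0
        y = imdct(spec, w)
        # Overlap-add with the second half of the previous frame (no extra delay).
        out.extend(y[n] + memory[n] for n in range(N))
        memory, last = y[N:], spec
    return out

N = 8
x = [math.sin(0.3 * i) + 0.5 * math.cos(0.7 * i) for i in range(5 * N)]
w = sine_window(N)
spectra = [mdct(x[i * N:(i + 2) * N], w) for i in range(4)]
clean = decode(spectra, N)
lossy = decode([spectra[0], spectra[1], None, spectra[3]], N)
muted = decode([spectra[0], None, None, None], N)
```

Without loss, every output frame after the first reproduces the input, because the time-domain aliasing of adjacent frames cancels in the overlap-add. With a repeated spectrum, the aliasing terms no longer match those of the neighboring frames, so the overlap region acts as a crossfade rather than an exact reconstruction.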
However, it has a defect related to the temporal inconsistency between the signal immediately before the frame loss and the repeated signal. The result is a phase discontinuity (or inconsistency) which can produce significant audio artifacts if the overlap time between the signals associated with two frames is reduced (as is the case in particular when MDCT frames referred to as “short delay” are used). The short overlap situation is illustrated in FIG. 1B in the case of a short delay MLT transform, in comparison with the usual situation from FIG. 1A in which long sine windows are used according to the G.722.1 recommendation (thus providing a long overlap time ZRA with a very progressive modulation). It appears that modulation by a short delay window produces a phase offset which is audible because of the short overlap zone ZRB, as shown in FIG. 1B.
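The origin of the phase discontinuity can be illustrated with a deliberately simplified model that treats the concealment as a pure time-domain repetition of the previous frame and ignores the temporal aliasing of the MLT. Under that assumption, a sinusoidal component is continued correctly only if the frame contains a whole number of its periods. The function name and the numeric values below are hypothetical.

```python
import math

def phase_jump(freq_hz, frame_len, sample_rate):
    """Phase error (radians, wrapped to [-pi, pi]) of a repeated frame.

    The repeated signal is x[n - frame_len], so its phase lags the
    expected phase by 2*pi*freq_hz*frame_len/sample_rate.
    """
    cycles = freq_hz * frame_len / sample_rate   # periods contained in one frame
    frac = cycles - math.floor(cycles + 0.5)     # distance to the nearest whole number
    return -2.0 * math.pi * frac

# 20 ms frames at 16 kHz (320 samples), chosen only for illustration.
aligned = phase_jump(100.0, 320, 16000)   # 2 whole periods per frame
opposed = phase_jump(125.0, 320, 16000)   # 2.5 periods per frame
```

In this model a 100 Hz component is repeated in phase, whereas a 125 Hz component is repeated in phase opposition. Since each frequency component receives a different offset, a long overlap zone masks the mismatch gradually, while a short one leaves it audible.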
In this case, even if a solution were implemented that combined a pitch search (as in decoding according to recommendation G.711 Appendix I) with an overlap-add produced by the window of an MDCT transform, it would not be sufficient to eliminate the audio artifacts related in particular to the phase shift between the frequency components.