ADPCM (Adaptive Differential Pulse Code Modulation) is a technique for compressing multimedia signals. The best-known and most widely used ADPCM coders are two speech coders standardized by the ITU-T: the ITU-T G.726 coder for telephone-band (narrowband) signals sampled at 8 kHz (the coder used in DECT, “Digital Enhanced Cordless Telecommunications”) and the ITU-T G.722 coder for wideband signals sampled at 16 kHz (the HD voice coder for VoIP).
ADPCM coding is a form of predictive coding in which the current sample is predicted by an adaptive predictor of ARMA (“Auto Regressive Moving Average”) type on the basis of the past decoded values. Because the prediction uses only decoded values, the decoder can make the same prediction as the encoder. The adaptation of the predictor is likewise done on the basis of the decoded values (of the decoded signal and of the decoded prediction error), sample by sample, without transmitting any additional information.
The ADPCM encoder quantizes the difference between the prediction of the current sample and the actual value of the current sample (the prediction error) by using an adaptive scalar quantizer. The coded amplitude of the prediction error is composed of two parts: a constant part, stored in ROM and indexed by the scalar quantization indices, and an adaptive multiplicative factor (in the linear domain) called the scale factor. The scale factor is adapted sample by sample as a function of the transmitted quantization index, without transmitting any additional information. Therefore, only the scalar quantization indices obtained by quantizing the prediction error sample by sample are transmitted in the ADPCM bitstream. Each of these scalar quantization indices is made up of a sign bit sign(n) and an amplitude quantization index I(n).
To decode the bitstream, the decoder performs a sample-by-sample inverse quantization of the prediction error using the inverse adaptive quantizer. In the absence of transmission errors, the decoder also makes the same prediction of the current sample as the encoder, by using the same ARMA adaptive predictor adapted sample by sample. The decoded value of the current sample is obtained by adding together the prediction and the dequantized value of the prediction error. In the case of transmission errors, the predictor and the quantizer at the decoder diverge from those at the encoder; by virtue of the forgetting factors used in their adaptation, they generally re-converge within a few milliseconds.
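The encoder/decoder symmetry described above can be illustrated with a minimal Python sketch. This is a toy codec, not G.726 or G.722: the tables `LEVELS` and `ADAPT`, the constant `LEAK`, and all function names are assumptions chosen for illustration, and a single adaptive coefficient (AR only) stands in for the ARMA predictor. It only shows that both sides update the scale factor and the predictor from the transmitted pair (sign(n), I(n)) alone.

```python
# Toy ADPCM codec for illustration only; it is NOT G.726 or G.722.
# Every constant below is an illustrative assumption.
LEVELS = (0.25, 0.75, 1.5, 3.0)   # constant table ("ROM"), indexed by I(n)
ADAPT = (0.85, 0.95, 1.2, 1.6)    # scale-factor multiplier for each index
LEAK = 0.98                       # forgetting factor on the (log) scale factor
MIN_SCALE, MAX_SCALE = 1.0, 4096.0


class State:
    """Adaptive state kept identically by the encoder and the decoder."""

    def __init__(self):
        self.scale = 16.0   # scale factor (linear domain)
        self.coef = 0.0     # single predictor coefficient (AR-only simplification)
        self.prev = 0.0     # previous decoded sample


def _sgn(v):
    return (v > 0) - (v < 0)


def _step(st, sign, idx):
    """Dequantize, reconstruct, then adapt using decoded quantities only."""
    prediction = st.coef * st.prev
    dq = sign * LEVELS[idx] * st.scale          # dequantized prediction error
    decoded = prediction + dq
    # Scale adaptation driven by the index; LEAK makes old state fade out.
    st.scale = min(MAX_SCALE, max(MIN_SCALE, st.scale ** LEAK * ADAPT[idx]))
    # Leaky sign-sign update of the predictor coefficient.
    st.coef = min(0.9, max(-0.9, 0.99 * st.coef + 0.01 * sign * _sgn(st.prev)))
    st.prev = decoded
    return decoded


def encode(samples):
    """Return the (sign, index) pairs and the encoder's local reconstruction."""
    st, codes, local = State(), [], []
    for x in samples:
        err = x - st.coef * st.prev             # prediction error
        sign = 1 if err >= 0 else -1
        mag = abs(err) / st.scale
        idx = min(range(len(LEVELS)), key=lambda i: abs(LEVELS[i] - mag))
        codes.append((sign, idx))
        local.append(_step(st, sign, idx))
    return codes, local


def decode(codes):
    """Rebuild the signal from the (sign, index) pairs only."""
    st = State()
    return [_step(st, sign, idx) for sign, idx in codes]
```

Because `_step` is shared, the decoder output matches the encoder's local reconstruction exactly as long as the codes arrive unaltered. In this sketch, `LEAK` plays the role of a forgetting factor on the scale factor: a mismatch in the logarithm of `scale` shrinks by that factor at every sample while no clamping occurs.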
In a transmission chain, other signal processing operations may be performed in addition to coding and decoding. Examples include: the processing performed in conference bridges to mix or switch all or part of the incoming streams so as to generate outgoing streams; the processing performed in conference bridges, gateways or peripherals to conceal lost packets or frames (“PLC: Packet Loss Concealment”) or to temporally adjust the signal (“time scaling”); and the processing performed in communication servers or gateways to discriminate between types of content or to detect voice.
Much of this processing operates on the decoded signal and may therefore require re-encoding afterwards. This is the case, for example, for signals processed in conference bridges (for example a mixing bridge) or in certain jitter buffers (e.g. in DECT). The compressed frame arrives in a first coding format and is decompressed; the decoded signal is analyzed to extract the data required for the processing (e.g. intermediate parameters and signals); the decoded signal is then processed using the analyzed data; and the processed signal is finally re-compressed into a format accepted further along the communication chain.
There is therefore a cascade of a decoder and a coder, commonly called a tandem. This solution is very expensive in terms of complexity (essentially because of the re-encoding) and it degrades quality, since the second coding is applied to a decoded signal, which is an already degraded version of the original signal, so that the degradations accumulate. Moreover, a frame may pass through several tandems before arriving at its destination, which multiplies both the computational cost and the loss of quality. Finally, the delays introduced by each tandem operation accumulate and may be detrimental to the interactivity of the communication.
For certain equipment, the cost overhead in terms of memory and complexity to host an encoder and/or a decoder may be prohibitive. It may also happen that the decoder is not built into equipment in which the processing has to be performed, such as for example switching bridges or jitter buffers.
Certain remote decoders are located in low-capacity terminals that are not able to perform such processing. This is the case, for example, for DECT terminals linked to domestic gateways, more commonly called “Boxes”.
To remedy these drawbacks, processing procedures exist that do not require complete decoding of the signal.
Thus, for error concealment, very low-complexity procedures exist for concealing errors in the coded domain. Such techniques are described, for example, in annex A to ITU-T recommendation G.191 for error concealment adapted to the G.722 coder. They consist either in filling a coded frame with indices corresponding to a signal set to zero over the whole duration of the frame, or simply in repeating the bitstream of the previous frame. However, these techniques yield lower quality than the error concealment techniques described in appendices III and IV of ITU-T recommendation G.722. The techniques of those appendices, for their part, require analyzing the decoded signal to extract various information about the signal to be reconstructed, such as the presence or absence of voice activity, the degree of stationarity of the signal, the nature of the signal, its fundamental period, etc.
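The two coded-domain concealment strategies mentioned above (zero-signal fill and repetition of the previous frame) can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any ITU-T text: the function name, the `mode` values, and in particular `silence_byte` are hypothetical. The actual index pattern that decodes to a zero signal depends on the coder and its bit rate and is not given here.

```python
def conceal_frames(frames, mode="repeat", frame_len=80, silence_byte=0xFF):
    """Replace lost frames (None) without decoding any of the received ones.

    frames       -- list of coded frames as bytes, with None marking a loss
    mode         -- "repeat": copy the previous frame's bitstream
                    "mute":   fill with indices assumed to decode to zero
    frame_len    -- size in bytes of a filled frame (assumption)
    silence_byte -- hypothetical placeholder for the coder's zero-signal indices
    """
    out = []
    last = None
    for frame in frames:
        if frame is None:
            if mode == "repeat" and last is not None:
                frame = last                         # repeat previous bitstream
            else:
                frame = bytes([silence_byte]) * frame_len
        out.append(frame)
        last = frame                                 # consecutive losses keep repeating
    return out
```

For example, `conceal_frames([b"\x01\x02", None], frame_len=2)` returns the first frame twice. Neither strategy looks at the signal content, which is why such methods are cheap but cannot exploit properties such as voice activity or the fundamental period.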
For certain coders, for example coders employing analysis by synthesis (CELP, for “Code Excited Linear Prediction”), this information is estimated at the encoder and coded so as to be transmitted in the bitstream. This is the case, for example, for the pitch period.
Thus, for this type of coder, it is possible to utilize this coded information to perform a processing in the coded domain. Such is the case for example for the technique described in patent application WO 2009/010831 where a processing in the coded domain of temporal adjustment type is proposed for the CELP analysis-synthesis compression technique.
This type of processing in the coded domain cannot be applied to ADPCM coding techniques since, unlike in CELP technology, the parameters used in this processing are not calculated by the ADPCM encoder and are therefore not present in the ADPCM bitstream.
There therefore exists a need to perform processings of good quality in the coded domain for signals coded by the ADPCM coding technology without requiring decoding, even partial, of the signal.