For the purpose of data compression, samples of a digital audio signal are frequently coded using entropy coding. However, such samples typically exhibit substantial temporal correlation, leading to redundancy in the compressed stream if they are coded independently. If this redundancy is removed or reduced, the compressed data rate can be reduced further. For example, in a transform-based encoder, the correlation causes the signal power to be concentrated in a few of the transformed bands, and the remaining low-power bands can be adequately represented by a smaller number of bits.
In an Adaptive Differential Pulse Code Modulation (ADPCM) encoder, the current audio sample is predicted on the basis of previous samples, and the predicted value is subtracted from the actual current audio value to leave a ‘residual’. An approximation to the residual is communicated over a transmission channel to the ADPCM decoder, which similarly computes a predicted value. The decoder then adds the received approximation to the residual to its predicted value in order to reconstruct an approximation to the original audio sample. The sequence of residual samples is known as an ‘innovation signal’ because it contains that which was not predicted and is therefore new, and the ratio between the audio signal power and the innovation signal power is termed the prediction gain.
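The predict, subtract, transmit and add cycle described above can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the text: a fixed first-order predictor with coefficient `a`, a uniform quantiser with step size `step`, and an encoder that predicts from its own reconstructed samples so that its state matches the decoder's. The function names are hypothetical.

```python
import math

def adpcm_encode(samples, step=0.05, a=0.9):
    """Return integer codes approximating the residual of each sample."""
    codes = []
    prev = 0.0  # previous reconstructed sample (same state as the decoder)
    for s in samples:
        pred = a * prev                      # predict from the past
        residual = s - pred                  # what the prediction missed
        code = int(round(residual / step))   # quantise the residual
        codes.append(code)
        prev = pred + code * step            # reconstruct as the decoder will
    return codes

def adpcm_decode(codes, step=0.05, a=0.9):
    """Rebuild an approximation to the audio from the received codes."""
    out = []
    prev = 0.0
    for code in codes:
        pred = a * prev                      # same prediction as the encoder
        prev = pred + code * step            # add the approximate residual
        out.append(prev)
    return out

# Usage: a slowly varying sine wave stands in for an audio signal.
samples = [0.5 * math.sin(0.1 * n) for n in range(200)]
codes = adpcm_encode(samples)
decoded = adpcm_decode(codes)
worst_error = max(abs(s - d) for s, d in zip(samples, decoded))
```

Because the encoder in this sketch tracks the decoder's reconstruction, the reconstruction error of each sample is bounded by half the quantiser step and does not accumulate.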
It is generally desirable to maximise the prediction gain, i.e. to minimise the innovation signal power, in order to minimise the data rate required to transmit the innovation signal. Linear prediction is almost invariably used and, according to linear prediction theory, the prediction gain is maximised when temporal correlation has been removed. This criterion is equivalent to saying that, viewed in the frequency domain, the innovation signal has a white spectrum. The process of subtracting a linear prediction from the input signal can be expressed as a filtering operation performed by a filter whose z-transform response is W(z). In order to produce a white innovation spectrum the filter W(z) needs to have an amplitude response equal to the inverse of the amplitude spectrum of the input audio signal. Frequently, an adjustable or reconfigurable filter is used in order to follow, at least to some extent, variations in the input spectrum.
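As an illustration of prediction gain and whitening, the following sketch applies a prediction-error filter W(z) = 1 - a·z⁻¹ to a synthetic signal. The signal model is an assumption made purely for the example: a first-order autoregressive process x[n] = a·x[n-1] + e[n] driven by white Gaussian noise, for which this W(z) is the ideal whitening filter and the theoretical prediction gain is 1/(1 - a²). All names are hypothetical.

```python
import random

def power(x):
    """Mean square value of a sequence."""
    return sum(v * v for v in x) / len(x)

def lag1_correlation(x):
    """Normalised correlation between adjacent samples (0 for white noise)."""
    num = sum(x[n] * x[n - 1] for n in range(1, len(x)))
    return num / sum(v * v for v in x)

def make_ar1(n, a, seed=0):
    """Synthetic correlated signal: x[n] = a*x[n-1] + white Gaussian noise."""
    rng = random.Random(seed)
    x, prev = [], 0.0
    for _ in range(n):
        prev = a * prev + rng.gauss(0.0, 1.0)
        x.append(prev)
    return x

def prediction_error(x, a):
    """Filter x with W(z) = 1 - a*z^-1, i.e. subtract the prediction a*x[n-1]."""
    return [x[n] - a * (x[n - 1] if n > 0 else 0.0) for n in range(len(x))]

a = 0.9
signal = make_ar1(20000, a)
innovation = prediction_error(signal, a)

gain = power(signal) / power(innovation)   # prediction gain (power ratio)
rho_in = lag1_correlation(signal)          # strongly correlated input
rho_out = lag1_correlation(innovation)     # close to zero once whitened
```

In this idealised case the measured gain should lie near the theoretical value of about 5.3 for a = 0.9, and the adjacent-sample correlation of the innovation should be close to zero, which is the time-domain counterpart of a white spectrum.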
The prediction algorithms in an ADPCM encoder and in the corresponding decoder should be kept in step with each other so that the decoder can accurately invert the encoder's prediction processing at all times. There are three methods of achieving this. A fixed predictor can be used; or the encoder can choose suitable predictor settings from time to time and communicate these settings to the decoder; or the encoder and decoder can both use a common method of adapting predictor settings from the innovation signal as conveyed over the transmission channel. This last method is termed backwards adaptive prediction, and the invention concerns a method for backwards adaptive prediction.
Backwards adaptive prediction requires of the predictor adaptation method not only that it should result in a predictor having a useful prediction gain, but also that it should be robust against transmission errors. That is, when implemented in both an encoder and a decoder, the decoder's prediction settings should converge to those of the encoder, for example on start up or after a transmission error. Moreover, a transmission error should not perturb the decoder's setting more than necessary. It is usual to employ a filter of fixed architecture but with dynamically adjustable coefficients.
Prediction filters may be implemented digitally in a transversal (Finite Impulse Response or FIR) or a recursive (Infinite Impulse Response or IIR) structure, or a combination of both. Known algorithms for adapting or training a filter include the least-mean-squares (LMS) algorithm of Widrow and Hoff (see Bernard Widrow and Samuel D. Stearns, “Adaptive Signal Processing”, Prentice Hall, 1985, ISBN 0-13-004029-0) and its variants. In the context of an encoder, this algorithm explicitly attempts to minimise the power in the residual signal. In order to keep the adaptations within the encoder and decoder in step with each other, both must operate from the same signal. To this end, the encoder quantises the residual signal for transmission and then uses an inverse quantiser to reconstruct a signal from which both the encoder and the decoder can adapt their predictors.
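The arrangement described above, in which encoder and decoder adapt from the same inverse-quantised signal, might be sketched as follows. This is a plain (unnormalised, non-leaky) LMS update on a short FIR predictor, chosen only for illustration; the class name, the step size `mu`, the filter order and the uniform quantiser are all assumptions rather than details from the text.

```python
import math

class BackwardLmsPredictor:
    """FIR predictor whose taps adapt only from data the decoder also has."""

    def __init__(self, order=2, mu=0.01):
        self.w = [0.0] * order      # adaptive coefficients
        self.hist = [0.0] * order   # past reconstructed samples, newest first
        self.mu = mu

    def predict(self):
        return sum(w * h for w, h in zip(self.w, self.hist))

    def update(self, pred, q_residual):
        """LMS step driven by the dequantised residual; returns the reconstruction."""
        self.w = [w + self.mu * q_residual * h
                  for w, h in zip(self.w, self.hist)]
        recon = pred + q_residual
        self.hist = [recon] + self.hist[:-1]
        return recon

def encode(samples, step=0.01):
    p, codes = BackwardLmsPredictor(), []
    for s in samples:
        pred = p.predict()
        code = int(round((s - pred) / step))  # quantise the residual
        codes.append(code)
        p.update(pred, code * step)           # adapt from the inverse-quantised value
    return codes, p.w

def decode(codes, step=0.01):
    p, out = BackwardLmsPredictor(), []
    for code in codes:
        pred = p.predict()
        out.append(p.update(pred, code * step))
    return out, p.w

samples = [0.5 * math.sin(0.1 * n) for n in range(2000)]
codes, enc_w = encode(samples)
decoded, dec_w = decode(codes)
worst_error = max(abs(s - d) for s, d in zip(samples, decoded))
```

Since both sides start from the same state and perform identical arithmetic on the same codes, the decoder's coefficients follow the encoder's exactly in the absence of transmission errors, and no predictor settings need to be sent.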
Whether or not in the context of an ADPCM codec, it is conventional to incorporate ‘leakage’ into an LMS implementation, to improve the stability of the adaptation and prevent filter coefficients from wandering in the presence of arithmetic rounding errors. In an ADPCM codec, however, prediction gain will generally be significantly reduced if sufficient leakage is applied to provide adequate stability of the adaptation. It would therefore be desirable to provide high prediction gain while maintaining good stability and convergence properties in the adaptation.
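The trade-off described above can be made concrete with a small sketch of one common form of leaky LMS, in which each coefficient is scaled by (1 - mu·leak) before the usual LMS step. This parameterisation, the values of `mu` and `leak`, and the function names are assumptions for illustration. The first experiment isolates the coefficient update (it ignores the feedback through reconstructed samples that a real decoder would have) and shows a coefficient mismatch between two adapters decaying when both receive the same update terms. The second shows the cost: with leakage, a single-tap filter settles below the coefficient that would minimise the error.

```python
def leaky_lms_step(w, x, err, mu, leak):
    """One leaky LMS update: shrink each tap slightly, then take the LMS step."""
    return [(1.0 - mu * leak) * wi + mu * err * xi for wi, xi in zip(w, x)]

# Experiment 1: two adapters start with different coefficients but are
# fed identical inputs and errors, as after a start-up or a past error.
mu, leak = 0.01, 0.5
w_enc, w_dec = [0.8, -0.3], [0.0, 0.0]
for _ in range(2000):
    x, err = [0.5, -0.25], 0.1   # arbitrary but identical on both sides
    w_enc = leaky_lms_step(w_enc, x, err, mu, leak)
    w_dec = leaky_lms_step(w_dec, x, err, mu, leak)
mismatch = max(abs(a - b) for a, b in zip(w_enc, w_dec))

# Experiment 2: a single tap learning to map a constant input of 1.0
# onto a target of 1.0, so the error-minimising coefficient is 1.0.
def converge_single_tap(leak, mu=0.01, target=1.0, steps=5000):
    w = [0.0]
    for _ in range(steps):
        x = [1.0]
        err = target - w[0] * x[0]
        w = leaky_lms_step(w, x, err, mu, leak)
    return w[0]

w_no_leak = converge_single_tap(leak=0.0)    # settles at the optimum
w_with_leak = converge_single_tap(leak=0.5)  # settles at target / (1 + leak)
```

In the first experiment the mismatch shrinks by a factor of (1 - mu·leak) per step, which is the stabilising effect of leakage. In the second, the leaky coefficient is pulled towards zero and away from the optimum, which is the mechanism by which leakage reduces prediction gain.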