Adaptive predictive coding (APC) methods are widely used for high quality coding of speech signals at 16 kbit/s. An adaptive predictive coder digitizes an input signal by performing two basic functions: adaptive prediction and adaptive quantization. The adaptive prediction function removes the redundancies inherent in any information carrying signal such as speech. The residual nonredundant signal is then quantized by the adaptive quantization function. Various realizations of the above basic concept are possible, differing mainly in the method of residual quantization. In the most common approach, the residual nonredundant signal is quantized in the time domain, within a feedback loop. This arrangement will be referred to as the conventional APC or the APC with noise feedback (APC-NFB).
FIGS. 1 and 2 show block diagrams of the conventional encoder and decoder respectively. Since input signals such as speech have time varying characteristics, the predictor and quantizer circuits included in the adaptive predictive coder must adapt to match the time varying input signal. The conventional APC schemes are block adaptive in that the signal is processed in blocks, or frames, of samples and optimal predictor and quantizer parameters are computed for each block (frame). These parameters are also quantized and transmitted to the decoder at the receiving end of the transmission system.
In the conventional APC encoder, two stages of prediction are performed. A short term prediction circuit 4 in FIG. 1 removes redundancies by subtracting from each signal sample stored in frame buffer 1 its predicted value, which is based on a predetermined number of immediately preceding samples (See L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals, Prentice-Hall, Inc., Englewood Cliffs, N.J., 1978 and J. D. Markel and A. H. Gray Jr., Linear Prediction of Speech, Springer-Verlag, N.Y. 1976). The predictor parameters are calculated by the short term prediction analysis (linear prediction coding, LPC) circuit 2 and quantized by the short term (LPC) prediction parameter quantization circuit 3. Typically 8-16 previous samples are used for predicting the present sample. The difference between the actual and the predicted samples is called the prediction error p[i]. This error displays very small short term redundancies and its variance is significantly lower than that of the input signal. For speech signals, this form of prediction has the effect of removing the formant resonances introduced by the vocal cavity.
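The short term analysis and prediction step can be sketched as follows. This is a minimal illustration, not the circuit of FIG. 1: it assumes the autocorrelation method with the Levinson-Durbin recursion (the text does not say which analysis method circuit 2 uses), it omits the parameter quantization of circuit 3, and all function names are hypothetical.

```python
def autocorrelation(x, max_lag):
    """Return r[0..max_lag] for one frame x (no windowing; an assumption)."""
    n = len(x)
    return [sum(x[i] * x[i - k] for i in range(k, n)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a[1..order] from autocorrelations r.

    The predictor is x_hat[i] = sum over m of a[m] * x[i-m].
    Returns (a, err); a[0] is unused and kept at 0.0, err is the final
    prediction error energy.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate frame (e.g. all zeros): stop early
        acc = r[m] - sum(a[k] * r[m - k] for k in range(1, m))
        k_m = acc / err  # reflection coefficient of stage m
        new_a = a[:]
        new_a[m] = k_m
        for k in range(1, m):
            new_a[k] = a[k] - k_m * a[m - k]
        a = new_a
        err *= 1.0 - k_m * k_m
    return a, err


def short_term_residual(x, a):
    """p[i] = x[i] - sum over m of a[m] * x[i-m]; samples before the frame are taken as zero."""
    order = len(a) - 1
    out = []
    for i in range(len(x)):
        pred = sum(a[m] * x[i - m] for m in range(1, min(order, i) + 1))
        out.append(x[i] - pred)
    return out
```

For a strongly resonant frame, the energy of the returned p[i] should be well below that of x[i], which is the variance reduction the paragraph above describes.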
Even though the prediction error has no short term redundancies, it may exhibit redundancies over long delays. An example is the prediction error that results during a voiced sound. The periodicity that characterizes the voiced speech signal remains in the prediction error. A long term predictor 10 removes redundancies of this nature by subtracting from each prediction error sample, output from the short term prediction circuit 4, its predicted value based on prediction error samples delayed by exactly one "period". Typically, a period value ranges over 20-147 samples and three samples are used in the prediction. This error in prediction is called the long term prediction error. The long term prediction analysis (pitch prediction analysis) circuit 8 calculates the long term predictor parameter and the long term prediction (pitch predictor) parameter quantization circuit 9 quantizes the parameter.
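A sketch of the long term (pitch) prediction step follows. The lag search by normalized correlation is an assumption for illustration, since the text does not specify how circuit 8 finds the period; the three taps are passed in rather than computed, and the function names are hypothetical.

```python
def find_period(e, min_lag=20, max_lag=147):
    """Pick the lag in [min_lag, max_lag] with the highest normalized correlation.

    This search criterion is an illustrative assumption, not taken from the text.
    """
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        num = sum(e[i] * e[i - lag] for i in range(lag, len(e)))
        den = sum(e[i - lag] ** 2 for i in range(lag, len(e)))
        score = num * num / den if den > 0.0 and num > 0.0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def long_term_residual(e, period, taps):
    """r[i] = e[i] - sum of c[m] * e[i-m] for m = p-1, p, p+1.

    taps = (c[p-1], c[p], c[p+1]); samples before the frame are taken as zero.
    """
    out = []
    for i in range(len(e)):
        pred = 0.0
        for j, c in enumerate(taps):
            delay = period - 1 + j
            if i - delay >= 0:
                pred += c * e[i - delay]
        out.append(e[i] - pred)
    return out
```

With a perfectly periodic input and a unit center tap, every pulse after the first is cancelled, which is the redundancy removal described above.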
The long term prediction error is a highly uncorrelated signal and statistically resembles a white Gaussian noise sequence. These properties are well suited for efficient quantization.
The samples of the long term prediction error, also referred to as the residual signal r[i], are quantized by a 2 bit/sample uniform midrise quantizer 14. (See B. S. Atal, "Predictive Coding of Speech at Low Bit Rates", IEEE Trans. on Communications, Vol. Com-30, No. 4, April 1982).
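A uniform midrise quantizer of this kind can be sketched as below. The step size is a free parameter here; how it is adapted per frame is not described in this passage, so `step` and the function name are assumptions for illustration.

```python
import math


def midrise_quantize(x, step, bits=2):
    """Return (index, level) for a uniform midrise quantizer.

    Reconstruction levels sit at odd multiples of step/2 (there is no zero
    level); inputs beyond the outermost cells are clipped to them.
    """
    half = 1 << (bits - 1)               # number of levels on each side of zero
    idx = math.floor(x / step)           # cell index, negative for x < 0
    idx = max(-half, min(half - 1, idx))
    return idx, (idx + 0.5) * step
```

With `bits=2` and `step=1.0` the four output levels are -1.5, -0.5, 0.5 and 1.5.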
An important quantity to be considered during quantization is the quantization noise q[i], which is the difference between the quantizer input w[i] and the quantizer output r'[i]. In quantizing the residual samples r[i], it is necessary to ensure that the quantization noise frequency spectrum possesses the proper power distribution. The quantization noise acts as the excitation to a synthesis filter cascade in the decoder at the receiving end of the transmission system and generates the reconstruction noise (the difference between the input and reconstructed signals). It is desirable that the reconstruction noise either be white, i.e., have a flat power spectrum (as in ADPCM), or slightly resemble the signal spectrum to take advantage of a phenomenon known as auditory noise masking. This is accomplished in the conventional APC coder by adding a filtered version q'[i] of the quantization noise q[i] to the residual signal r[i] prior to quantization. (See N. S. Jayant and P. Noll, Digital Coding of Waveforms, Prentice-Hall, Inc., Englewood Cliffs, N.J., 1984). A Noise Spectral Shaping Filter 16 performs the required filtering. The transfer function of filter 16 is closely related to the transfer functions of the short term and long term predictors discussed above.
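The noise feedback loop can be sketched as follows, using the sign conventions stated above: w[i] = r[i] + q'[i] and q[i] = w[i] - r'[i]. The 2-bit midrise quantizer with a fixed `step`, the zero initial filter state, and the function name are assumptions for illustration.

```python
import math


def noise_feedback_quantize(r, f, step):
    """Quantize the residual r[i] inside a noise feedback loop.

    f[k] is the coefficient of z^-k in the feedback filter (f[0] is ignored),
    so q'[i] = sum over k >= 1 of f[k] * q[i-k].
    Returns (r_hat, q): quantizer outputs r'[i] and quantization noise q[i].
    """
    r_hat, q = [], []
    for i, sample in enumerate(r):
        # Filtered version of past quantization noise (zero before the frame).
        q_filt = sum(f[k] * q[i - k] for k in range(1, len(f)) if i - k >= 0)
        w = sample + q_filt                        # quantizer input
        idx = max(-2, min(1, math.floor(w / step)))
        level = (idx + 0.5) * step                 # 2-bit midrise output
        r_hat.append(level)
        q.append(w - level)                        # quantization noise
    return r_hat, q
```

By construction r'[i] - r[i] = q'[i] - q[i], so the error left in the quantized residual is the quantization noise passed through 1 - F[z].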
The short term predictor 4 transfer function can be expressed as

$$A[z] = \sum_{m=1}^{M} a[m]\, z^{-m}$$

where M is the short term prediction order and {a[m], 1 ≤ m ≤ M} are the Linear Prediction Coding (LPC) coefficients. The long term predictor 10 transfer function can be expressed as

$$C[z] = \sum_{m=p-1}^{p+1} c[m]\, z^{-m}$$

where p is the period and {c[m], p-1 ≤ m ≤ p+1} are the long term prediction parameters. Then, the desired spectral shaping is accomplished by using a feedback filter 16 with the transfer function F[z] given by

$$F[z] = (1 - C[z])\, A[z/\beta] + C[z]$$

where β is a constant to control residual spectral shaping to thereby control auditory noise masking. β usually assumes a value between 0.7 and 0.9.
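The coefficients of the feedback filter can be computed from the predictor parameters by polynomial arithmetic in z^-1. The sketch below assumes the predictor forms A[z] = sum of a[m] z^-m and C[z] = sum of c[m] z^-m, so that A[z/β] has coefficients a[m] β^m; the list layout and function names are hypothetical.

```python
def poly_mul(x, y):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out


def feedback_filter(a, c, period, beta):
    """Return f, where f[k] is the coefficient of z^-k in
    F[z] = (1 - C[z]) * A[z/beta] + C[z].

    a[m] is the LPC coefficient for delay m (a[0] unused, 0.0).
    c = (c[p-1], c[p], c[p+1]) with p = period >= 2.
    """
    # A[z/beta]: scale the coefficient at delay m by beta**m.
    a_beta = [a[m] * beta ** m for m in range(len(a))]
    a_beta[0] = 0.0
    # C[z] as a dense coefficient list indexed by delay.
    c_poly = [0.0] * (period + 2)
    for j, cm in enumerate(c):
        c_poly[period - 1 + j] = cm
    one_minus_c = [1.0] + [-v for v in c_poly[1:]]
    f = poly_mul(one_minus_c, a_beta)
    for k, v in enumerate(c_poly):
        f[k] += v
    return f
```

One consequence worth noting: 1 - F[z] factors as (1 - C[z])(1 - A[z/β]), so with β = 1 and no long term predictor the feedback filter reduces to A[z] itself.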
The decoder shown in FIG. 2 reconstructs the signal based on the received long term residual signal and the predictor parameters. The predictor parameters are decoded by pitch decoder 23 and LPC decoder 24 and essentially contain information about the redundancies that must be reintroduced into the prediction error signal to reconstruct the signal. First, the long term synthesizer 25, which is the inverse of the long term predictor 10, restores the long term redundancies. Then, the short term synthesizer 28, whose transfer function is the inverse of that of the short term predictor 4, reintroduces the short term correlations. The output of the short term synthesizer is the reconstructed signal.
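The two synthesis stages can be sketched as recursive filters that add each stage's prediction back onto its input. This is an illustrative sketch with zero initial filter state and already decoded parameters; the function names and argument layout are hypothetical.

```python
def long_term_synthesis(r_hat, period, taps):
    """e[i] = r'[i] + sum of c[m] * e[i-m] for m = p-1, p, p+1.

    taps = (c[p-1], c[p], c[p+1]); output samples before the frame are zero.
    """
    e = []
    for i, sample in enumerate(r_hat):
        pred = 0.0
        for j, c in enumerate(taps):
            delay = period - 1 + j
            if i - delay >= 0:
                pred += c * e[i - delay]
        e.append(sample + pred)
    return e


def short_term_synthesis(e, a):
    """s[i] = e[i] + sum of a[m] * s[i-m] for m = 1..M (a[0] unused)."""
    s = []
    for i, sample in enumerate(e):
        pred = sum(a[m] * s[i - m] for m in range(1, len(a)) if i - m >= 0)
        s.append(sample + pred)
    return s


def decode(r_hat, period, taps, a):
    """Long term synthesis followed by short term synthesis."""
    return short_term_synthesis(long_term_synthesis(r_hat, period, taps), a)
```

If the residual is passed through unquantized, `decode` reproduces the encoder input exactly (up to floating point rounding), which shows that each synthesizer undoes the corresponding prediction stage.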
The noise feedback quantization technique used in the conventional APC shown in FIGS. 1 and 2 has two main disadvantages. First, as a result of the noise feedback, the variance of the signal at the quantizer input is higher than that of the residual signal. Since a 2-bit/sample quantizer is being used, this differential can be substantial. This results in higher reconstruction noise variance. Secondly, the feedback loop may become unstable if the power gain through the feedback filter becomes large. For highly resonant signals such as sine waves and many voiced speech signal frames, the gain of the noise feedback can be quite high (>20 dB). If this power gain through the filter exceeds the signal to quantization noise ratio, the feedback loop may become unstable. Maintaining stable operation is possible by controlling the power gain of the filter, but this is accomplished at the expense of a loss in the overall performance of the system.
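The stability condition above can be illustrated with a rough check. The sketch assumes the quantization noise is white, in which case the power gain of the feedback filter is the sum of its squared coefficients; the function names and the comparison helper are hypothetical, and the signal to quantization noise ratio is supplied by the caller.

```python
import math


def feedback_power_gain(f):
    """Power gain of the feedback filter for a white noise input.

    f[k] is the coefficient of z^-k (f[0] is ignored).
    """
    return sum(v * v for v in f[1:])


def power_gain_db(f):
    """Power gain in dB; returns -inf for an all-zero filter."""
    g = feedback_power_gain(f)
    return 10.0 * math.log10(g) if g > 0.0 else float("-inf")


def may_be_unstable(f, snr_db):
    """True when the filter power gain exceeds the quantizer SNR (both in dB)."""
    return power_gain_db(f) > snr_db
```

Under the same white noise assumption, and taking the noise as uncorrelated with the residual, the variance at the quantizer input is roughly the residual variance plus the power gain times the quantization noise variance, which is the first disadvantage noted above.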