Coding of signals for transmission over digital channels in telephone or other communication systems generally requires sampling an input signal, quantizing the samples, and generating a digital code for each quantized sample. Speech signals are highly correlated and therefore include a part which can be predicted from past values. If a digital encoding transmitter and a decoding receiver both comprise apparatus for forming the predicted portion of the highly correlated speech signal, only the unpredicted part of the speech signal need be encoded and transmitted. Consequently, predictive coding of speech signals results in efficient utilization of digital channels without signal degradation.
Predictive speech signal coding as disclosed in U.S. Pat. Nos. 3,502,986 and 3,631,520 involves generation of predictive parameters from a succession of speech signal samples and the formation of a predicted value for each speech signal sample from the generated parameters and the preceding speech signal samples. The difference between each sample and its predicted value is quantized, digitally encoded, and sent to a receiver, wherein the difference signal is decoded and combined with the corresponding predicted value formed in the receiver. In this manner, only the signal part which cannot be predicted from the already coded signal is quantized and transmitted, whereby a saving in channel capacity is achieved. The saving is reflected in the reduced bit rate needed for transmitting only the unpredicted portion of the redundant speech signal, as opposed to the much higher bit rate needed for transmitting the directly coded speech signal.
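The predictive coding loop described above may be illustrated by the following minimal sketch. It is not the arrangement of the cited patents: it assumes a fixed one-tap predictor (coefficient 0.9) and a uniform quantizer with an arbitrary step size, whereas the cited arrangements generate the predictive parameters from the speech samples themselves. The names `dpcm_encode` and `dpcm_decode` and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def dpcm_encode(samples, coeff=0.9, step=0.05):
    """Return integer codes for the quantized prediction differences.

    Assumption: a fixed one-tap predictor (coeff) stands in for the
    adaptively generated predictive parameters described in the text.
    """
    prev = 0.0  # previous reconstructed sample, identical to the receiver's
    codes = []
    for sample in samples:
        prediction = coeff * prev
        # Quantize only the part of the sample the predictor missed.
        index = round((sample - prediction) / step)
        codes.append(index)
        # Track the receiver's reconstruction so both predictors agree.
        prev = coeff * prev + index * step
    return codes


def dpcm_decode(codes, coeff=0.9, step=0.05):
    """Rebuild samples by adding each decoded difference to the prediction."""
    prev = 0.0
    output = []
    for index in codes:
        prev = coeff * prev + index * step
        output.append(prev)
    return output


# Illustrative input: a slowly varying (hence highly correlated) sinusoid.
samples = [math.sin(2 * math.pi * 0.02 * n) for n in range(200)]
codes = dpcm_encode(samples)
decoded = dpcm_decode(codes)
max_error = max(abs(s - d) for s, d in zip(samples, decoded))
```

Because the transmitter forms its prediction from the same reconstructed values that are available at the receiver, the reconstruction error of each sample in this sketch equals the quantizing error of the difference signal and therefore does not exceed half the quantizer step.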
The quantizing of signal samples is generally accomplished by selectively generating a signal corresponding to that one of a set of specified amplitude levels which is nearest the amplitude of the signal sample. The error produced by quantization, however, distorts the transmitted signal. As disclosed in U.S. Pat. No. 2,927,962, the noise produced by quantization may be reduced by forming an error signal corresponding to the difference between quantized and unquantized signal samples and modifying the signal samples in a predetermined manner responsive to the error signal. While the total quantizing noise power is unaffected by the modifying arrangements, the noise power may be concentrated in a specified portion of the signal spectrum where its effects are minimized. A feedback filter arrangement utilizing this principle in television signal coding to place the quantizing noise in the upper frequency range of the signal band is disclosed in the article, "Synthesis of Optimal Filters for a Feedback Quantization System," by E. G. Kimme and F. F. Kuo, IEEE Transactions on Circuit Theory, September 1963, pp. 405-413.
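The nearest-level quantizing and error feedback principle described above may be sketched as follows. The sketch is a simplified illustration and not the filter of the cited patent or article: it assumes a small set of integer amplitude levels and a feedback path that merely subtracts the previous quantizing error from the next sample, which moves the error away from the low end of the spectrum. The names `nearest_level` and `quantize_with_error_feedback` are hypothetical.

```python
def nearest_level(value, levels):
    """Select the specified amplitude level nearest to the sample amplitude."""
    return min(levels, key=lambda level: abs(level - value))


def quantize_with_error_feedback(samples, levels):
    """Quantize samples, modifying each one by the previous quantizing error.

    Assumption: a single-tap feedback of the last error, chosen only to
    illustrate the principle; the cited arrangements use other filters.
    """
    prev_error = 0.0
    output = []
    for sample in samples:
        modified = sample - prev_error             # sample modified by the error signal
        quantized = nearest_level(modified, levels)
        prev_error = quantized - modified          # quantized minus unquantized value
        output.append(quantized)
    return output


# Illustrative input: a constant level lying between two quantizer levels.
levels = [-2, -1, 0, 1, 2]
samples = [0.3] * 100
plain = [nearest_level(s, levels) for s in samples]
shaped = quantize_with_error_feedback(samples, levels)

# Accumulated (low-frequency) error of each scheme over the whole run.
plain_drift = sum(q - s for q, s in zip(plain, samples))
shaped_drift = sum(q - s for q, s in zip(shaped, samples))
```

In this sketch the plain quantizer maps every sample to the same level, so its error accumulates steadily over the run, while the accumulated error of the feedback arrangement telescopes to the last quantizing error alone and remains within half a level spacing.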
The aforementioned quantizing error reduction arrangements, which generally concentrate the error power in fixed portions of the frequency spectrum to minimize the RMS error power, do not result in optimum noise reduction for speech signal encoding arrangements. The lack of optimum noise reduction results from the nature of the speech signal spectrum, which includes a plurality of time-varying formant frequency portions, corresponding to portions of the short-term spectral envelope where speech energy is concentrated, and interformant portions. In voiced regions of speech, the formant portions are directly related to resonances in the vocal tract. The speech signal power is therefore concentrated in said formant portions, while the interformant portions contain substantially less speech signal power. Consequently, concentrating the quantizing error power in a fixed portion of the frequency spectrum does not take into account the relationship between the quantizing noise spectrum and the changing speech spectrum, so that noticeable noise effects remain.