The present invention relates to a circuit for digitizing analog speech, transmitting it over such channels as telephone lines, and converting it back into analog speech at the receiving end.
The basic problem with the digitization and transmission of analog speech is the bit rate required. Human speech occupies a range of roughly zero to three kilohertz, and the Nyquist criterion calls for sampling at a frequency of at least twice the bandwidth, or six kilohertz. Because typical low pass filters do not cut off sharply, the sampling rate used in practice is approximately 8 kilohertz. Assuming that 10 bits would be sufficient to describe the amplitude of the speech wave for each sample, the required bit transmission rate would be 80 kilobits per second, a figure far in excess of the capacity of such channels as ordinary telephone lines.
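The arithmetic above can be checked with a minimal sketch. The function and variable names are illustrative assumptions and do not appear in the original text; the figures are the ones stated in the preceding paragraph.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample):
    """Bit rate of sample-by-sample transmission, in bits per second."""
    return sample_rate_hz * bits_per_sample


# Figures taken from the paragraph above.
speech_bandwidth_hz = 3000                  # zero to three kilohertz
nyquist_rate_hz = 2 * speech_bandwidth_hz   # theoretical minimum sampling rate
practical_rate_hz = 8000                    # margin for non-ideal low pass filters
bits_per_sample = 10

direct_bit_rate = pcm_bit_rate(practical_rate_hz, bits_per_sample)
print(nyquist_rate_hz, direct_bit_rate)
```

The result, 80 kilobits per second, is the figure that sample-by-sample transmission would require.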
A technique which has been developed to somewhat alleviate this problem is generally called linear predictive coding. Linear predictive coding (LPC) uses a parametric model of the human vocal system to encode speech. This model describes speech production as being controlled by three factors: the excitation source, the energy (or gain) of the signal, and the shape of the acoustic cavity from the epiglottis to the lips. Speech signals can be either voiced, such as the "a" in "ape", or unvoiced, such as the "s" in "sister". The excitation source for the voiced signals is modeled by a series of pulses separated by a fixed pitch period. The excitation source for the unvoiced signals is modeled as a noise generator. The shape of the acoustic cavity is represented by a plurality of resonant circuits tuned to give information regarding the natural frequencies of the analog speech.
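The three factors of the model described above can be illustrated with a minimal sketch of the synthesis side. This is not the circuit of the present invention. The function names, the all-pole filter form, and the numeric values (a pitch period of 60 samples, a single resonance at 500 hertz with pole radius 0.95, an 8 kilohertz sampling rate) are assumptions chosen for illustration only.

```python
import math
import random


def make_excitation(num_samples, voiced, pitch_period=60, seed=0):
    """Excitation source: a pulse train if voiced, noise if unvoiced."""
    if voiced:
        # One unit pulse every pitch_period samples.
        return [1.0 if n % pitch_period == 0 else 0.0 for n in range(num_samples)]
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(num_samples)]


def resonator_coeffs(center_hz, pole_radius, sample_rate_hz):
    """Feedback coefficients [a1, a2] of one two-pole resonant section."""
    theta = 2.0 * math.pi * center_hz / sample_rate_hz
    return [2.0 * pole_radius * math.cos(theta), -pole_radius * pole_radius]


def synthesize(excitation, gain, predictor_coeffs):
    """All-pole filter: s[n] = gain * e[n] + sum over k of a[k] * s[n - k]."""
    out = []
    for n, e in enumerate(excitation):
        s = gain * e
        for k, a in enumerate(predictor_coeffs, start=1):
            if n - k >= 0:
                s += a * out[n - k]
        out.append(s)
    return out


# Illustrative values only (assumptions, not from the source text).
frame = make_excitation(180, voiced=True, pitch_period=60)
coeffs = resonator_coeffs(500.0, 0.95, 8000.0)
speech = synthesize(frame, gain=0.5, predictor_coeffs=coeffs)
```

A plurality of resonances could be approximated by passing the output of one such section into the next, with the gain applied only once.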
The linear predictive coding technique takes advantage of the fact that many speech parameters will not change for a considerable number of samples during a typical speech pattern. Thus, linear predictive coding models typically use an analysis frame containing many samples to arrive at a composite profile for the speech frame before transmitting information on the channel. A commonly used analysis frame length is 180 samples. Because the parameters are transmitted once per frame rather than once per sample, the channel bit transmission rate can be on the order of a few kilobits per second, a rate which such channels as ordinary telephone lines are capable of carrying.
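The effect of frame-based transmission on the bit rate can be sketched as follows. The 180-sample frame comes from the paragraph above and the 8 kilohertz sampling rate from the earlier discussion. The figure of 54 bits per frame is an assumption introduced here for illustration and is not stated in the source text, and the function names are likewise hypothetical.

```python
def frame_duration_ms(samples_per_frame, sample_rate_hz):
    """Length of one analysis frame in milliseconds."""
    return 1000.0 * samples_per_frame / sample_rate_hz


def channel_bit_rate(bits_per_frame, samples_per_frame, sample_rate_hz):
    """Bits per second when one parameter set is sent per frame."""
    frames_per_second = sample_rate_hz / samples_per_frame
    return bits_per_frame * frames_per_second


samples_per_frame = 180   # from the text
sample_rate_hz = 8000     # from the earlier sampling discussion
bits_per_frame = 54       # assumed parameter budget, for illustration only

duration = frame_duration_ms(samples_per_frame, sample_rate_hz)
rate = channel_bit_rate(bits_per_frame, samples_per_frame, sample_rate_hz)
print(duration, rate)
```

Under these assumptions each frame spans 22.5 milliseconds and the channel carries 2400 bits per second, compared with 80 kilobits per second for sample-by-sample transmission.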
The linear predictive coding technique has been discussed in the following technical papers:
A. Buzo et al., "Speech Coding Based Upon Vector Quantization", IEEE Trans. on ASSP, October 1980; B. S. Atal and J. M. Remde, "A New Model of LPC Excitation . . . ", Proceedings 1982 ICASSP, pp. 614-617; Parker et al., "Low Bit Rate Speech Enhancement . . . ", Proceedings 1984 ICASSP, pp. 1.5.1-1.5.4.