Conventional analog telephone systems are being replaced by digital systems. In digital systems, the analog signals are sampled at a rate of about twice the bandwidth of the analog signals, or about eight kilohertz. In one type of system, each sample is then quantized to one of a discrete set of prechosen values and encoded as a digital word, which is then transmitted over the telephone lines. With 8-bit digital words, for example, the analog sample is quantized to 2^8, or 256, levels, each of which is designated by a different 8-bit word. In linear pulse code modulation systems, the 256 possible values of the digital word are linearly related to corresponding analog amplitudes.
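The linear quantization described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that samples are normalized to the range -1.0 to 1.0 and that the decoder reconstructs the midpoint of each quantization interval. The function names are hypothetical and are not taken from any particular system.

```python
def linear_pcm_encode(sample, bits=8):
    """Map a sample in [-1.0, 1.0] to one of 2**bits uniformly spaced codes."""
    levels = 1 << bits                        # 256 levels for 8-bit words
    clipped = max(-1.0, min(sample, 1.0))     # assumption: input is normalized
    code = int((clipped + 1.0) / 2.0 * levels)
    return min(code, levels - 1)              # keep +1.0 inside the top level


def linear_pcm_decode(code, bits=8):
    """Return the midpoint amplitude of the interval designated by the code."""
    levels = 1 << bits
    return (code + 0.5) / levels * 2.0 - 1.0


if __name__ == "__main__":
    for s in (-1.0, -0.3, 0.0, 0.3, 1.0):
        c = linear_pcm_encode(s)
        print(s, c, linear_pcm_decode(c))
```

Because every level has the same width, the worst-case reconstruction error is the same at all amplitudes, namely half of one step.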
Efforts have been made to reduce the number of bits required to encode the speech while still obtaining a clear decoded speech signal at the receiving end of the system. Because most speech is found at the lower analog signal amplitudes, encoding techniques have been developed which maintain high resolution at the lower amplitudes but provide lesser resolution at the higher amplitudes. Such an approach has reduced the number of bits required in each word. An example of such an encoding technique is the μ-law technique, in which the quantization levels are based on a logarithmic function.
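The logarithmic spacing of μ-law levels can be sketched with the continuous companding curve F(x) = sgn(x) ln(1 + μ|x|) / ln(1 + μ). The value μ = 255 and the normalization of samples to the range -1.0 to 1.0 are assumptions made here for illustration. Practical codecs use a segmented approximation of this curve rather than the exact formula.

```python
import math

MU = 255.0  # assumed value of the companding parameter


def mu_law_compress(x, mu=MU):
    """Compress a sample in [-1.0, 1.0] with the continuous mu-law curve."""
    x = max(-1.0, min(x, 1.0))
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=MU):
    """Invert mu_law_compress."""
    return math.copysign(((1.0 + mu) ** abs(y) - 1.0) / mu, y)


if __name__ == "__main__":
    # Small amplitudes occupy a disproportionately large share of the
    # compressed range, so a uniform quantizer applied after compression
    # resolves them more finely than large amplitudes.
    for x in (0.01, 0.1, 0.5, 1.0):
        y = mu_law_compress(x)
        print(x, y, mu_law_expand(y))
```

Quantizing the compressed value uniformly yields fine steps near zero and coarse steps near full scale, which is the trade-off described above.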
Yet another form of speech encoder, such as that of the linear predictive coding technique, is based on the recognition that speech signals are a combination of two basic signals. The pitch is determined by the vocal cord vibration, and that actuating signal is then modified by resonance chambers, including the mouth and nasal passages. For a particular group of samples, a digital filter which filters out the formant effects of the resonance chambers can be defined. The Fourier transform of the residual pitch signal can then be obtained and encoded. Because the baseband of the Fourier transform spectrum is approximately repeated in the higher frequencies, only the baseband need be encoded to still obtain reasonably clear speech. At the receiver, a definition of the formant filter and the Fourier transform baseband are decoded. The baseband is repeated to complete the Fourier transform of the pitch signal, and the inverse transform of that signal is obtained. By applying the inverse of the decoded filter to the inverse Fourier transform of the repeated baseband signal, the initial speech can be reconstructed.
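Two steps of the scheme described above can be sketched in simplified form: removing the formant effects with a prediction filter and restoring them with its inverse, and repeating a decoded baseband to fill the higher-frequency bins. The predictor coefficients, the function names, and the simple tiling of spectral bins are hypothetical choices made for illustration. The actual filter order, bin layout, and handling of spectral symmetry in such systems are not specified here.

```python
def analysis_filter(speech, coeffs):
    """Remove the predicted (formant) part: e[n] = s[n] - sum_k a_k * s[n-k]."""
    residual = []
    for n, s in enumerate(speech):
        predicted = sum(a * speech[n - k]
                        for k, a in enumerate(coeffs, start=1) if n - k >= 0)
        residual.append(s - predicted)
    return residual


def synthesis_filter(residual, coeffs):
    """Inverse of analysis_filter: s[n] = e[n] + sum_k a_k * s[n-k]."""
    speech = []
    for n, e in enumerate(residual):
        predicted = sum(a * speech[n - k]
                        for k, a in enumerate(coeffs, start=1) if n - k >= 0)
        speech.append(e + predicted)
    return speech


def repeat_baseband(baseband, n_bins):
    """Tile the decoded baseband coefficients until n_bins bins are filled."""
    if not baseband:
        raise ValueError("baseband must contain at least one coefficient")
    return [baseband[i % len(baseband)] for i in range(n_bins)]


if __name__ == "__main__":
    coeffs = [0.9, -0.2]                      # hypothetical predictor
    speech = [1.0, 0.5, -0.25, 0.75, 0.0]     # hypothetical window of samples
    residual = analysis_filter(speech, coeffs)
    print(synthesis_filter(residual, coeffs))  # recovers the window
    print(repeat_baseband([1 + 0j, 2 + 1j, 3 - 1j], 8))
```

In this sketch the synthesis filter recovers the input exactly only because the full residual is passed through unchanged. When only the baseband is transmitted and then repeated, the reconstruction is an approximation.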
A major problem with this approach lies in defining the formant filter, which must be redefined for each window of samples. A complex encoder and decoder are required to obtain transmission rates as low as 16,000 bits per second. Another problem with such systems is that they do not always provide a satisfactory reconstruction of certain formants, such as those resulting, for example, from nasal resonance.