A. Field of the Invention
Broadly speaking, this invention relates to telecommunications. More particularly, in a preferred embodiment, this invention relates to methods and apparatus for processing digitized human speech to reduce the bit rate required to transmit the information signal over a digital telecommunications link.
B. Discussion of the Prior Art
In recent years, considerable attention has been focused on digital transmission and switching systems. Such systems are in many ways superior to the analog systems heretofore employed and offer the further advantage that the digital signal may be encrypted prior to transmission, an important fact where security is of concern.
Since human speech is analog in nature, it is necessary to employ some form of analog-to-digital conversion prior to transmission over the digital system. Several approaches have been employed in the prior art, including the so-called "wideband" and "narrowband" conversion techniques. The "wideband" techniques include pulse code modulation (PCM), which results in an output bitstream at a rate of, say, 48 or 64 kb/s, and delta modulation, for example, Continuously Variable Slope Delta modulation (CVSD), which results in an output bitstream at a rate of 32 or 16 kb/s.
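The wideband rates quoted above follow from simple arithmetic: a PCM bitstream carries a fixed number of bits per sample, while a delta modulator emits one bit per sample. The following is a minimal sketch of that arithmetic only. The 8 kHz PCM sampling rate and the function names are assumptions made for illustration and are not taken from this specification.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int) -> int:
    """Bit rate of a PCM stream in bit/s: samples per second times bits per sample."""
    if sample_rate_hz <= 0 or bits_per_sample <= 0:
        raise ValueError("sample rate and bits per sample must be positive")
    return sample_rate_hz * bits_per_sample


def delta_mod_bit_rate(sample_rate_hz: int) -> int:
    """Bit rate of a delta modulator in bit/s: one bit is emitted per sample."""
    if sample_rate_hz <= 0:
        raise ValueError("sample rate must be positive")
    return sample_rate_hz


# Assumption: 8 kHz sampling, the conventional rate for telephone-band speech.
PCM_SAMPLE_RATE_HZ = 8_000

if __name__ == "__main__":
    for bits in (6, 8):
        rate = pcm_bit_rate(PCM_SAMPLE_RATE_HZ, bits)
        print(f"PCM, {bits} bits/sample: {rate // 1000} kb/s")
    for clock_hz in (16_000, 32_000):
        rate = delta_mod_bit_rate(clock_hz)
        print(f"Delta modulation, {clock_hz // 1000} kHz clock: {rate // 1000} kb/s")
```

Under these assumptions, 6-bit and 8-bit PCM give 48 and 64 kb/s, and a delta modulator clocked at 16 or 32 kHz gives 16 or 32 kb/s, matching the figures in the text.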
The "narrowband" techniques include the so-called "VOCODER" approach and any of several more sophisticated converters which utilize some sort of adaptive or predictive technique. The "narrowband" techniques result in an output bitstream ranging from as low as 1.2 or 2.4 kb/s up to 9.6 kb/s, depending upon the desired voice quality and the particular technique which is used.
The problem is that "narrowband" converters are expensive and the voice quality that they yield, especially speaker recognizability, is marginal at best. The "wideband" converters, especially those employing delta modulation, are far more successful, are inexpensive, and deliver good performance at bit rates of 32 kb/s and even 16 kb/s. Unfortunately, these high bit rates limit the number of voice circuits that can be multiplexed over a transmission link of a given bandwidth.
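The multiplexing limit can be illustrated with a short calculation: the number of voice circuits a link can carry is at most the link rate divided by the per-circuit bit rate. The sketch below assumes a hypothetical link with 1536 kb/s of usable capacity and ignores framing and signalling overhead. The link rate and the function name are illustrative assumptions, not values from this specification.

```python
def max_voice_circuits(link_bps: int, channel_bps: int) -> int:
    """Upper bound on the circuits a link can carry, ignoring framing overhead."""
    if link_bps < 0 or channel_bps <= 0:
        raise ValueError("link rate must be non-negative and channel rate positive")
    return link_bps // channel_bps


# Assumption: a hypothetical link with 1536 kb/s of usable capacity.
LINK_BPS = 1_536_000

# Per-circuit rates taken from the figures quoted in the text, in bit/s.
CHANNEL_RATES_BPS = {
    "PCM, 64 kb/s": 64_000,
    "Delta modulation, 32 kb/s": 32_000,
    "Delta modulation, 16 kb/s": 16_000,
    "Narrowband, 2.4 kb/s": 2_400,
}

if __name__ == "__main__":
    for name, rate in CHANNEL_RATES_BPS.items():
        print(f"{name}: up to {max_voice_circuits(LINK_BPS, rate)} circuits")
```

On this assumed link, 64 kb/s PCM allows at most 24 circuits and 16 kb/s delta modulation at most 96, whereas a 2.4 kb/s narrowband converter would allow 640, which is why a lower effective bit rate per talker is attractive.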
It is well known that the speech pattern of the average talker is replete with inter-syllable and inter-word pauses. It is, in other words, highly redundant from an information transfer viewpoint. This redundancy becomes even greater when one considers that, for at least some of the time, the talker is silent while listening to the talker at the other end of the circuit.