It is well known in the prior art to digitize the analog signal output of, e.g., a telephone microphone, representing a voice signal, in order to transmit the digital data over a telephone line. The digitized signal, which may be converted back to an analog representation of the digital data for the purpose of transmission over the telephone lines, as is also well known in the art, is less susceptible to noise on the telephone line, is capable of multiplexed channel operation in the telephone bandwidth, reduces crosstalk and enables relatively easy digital encryption for secure transmission.
The digitized voice signal, which typically is at, e.g., 64,000 bps, cannot be readily sent within the available approximately 3,000 Hz bandwidth of the telephone lines or be readily sent multichannel at that bit rate within the available telephone line bandwidth. To enable more convenient transmission and/or multiplexed transmission, compression of the 64,000 bps digitized voice signal is employed, as is known in the art. One method of compressing the 64,000 bps to, e.g., 2,400 bps is to use a linear predictive coding technique known in the art, a discussion of which is found, for example, in Markel, Gray, Jr. & Wakita, "Linear Prediction of Speech Theory and Practice," Speech Communication Research Laboratory, Inc. Monograph No. 10 (1978), the disclosure of which is hereby incorporated by reference. With compression to 2,400 bps, four simultaneous channels of 2,400 bps each can be multiplexed via a modem onto a 9,600 bps data stream transmitted over the bandwidth of the analog telephone lines.
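The bit-rate figures in the preceding paragraph can be checked with a few lines of arithmetic. The following is a minimal sketch only; the function and constant names are hypothetical and chosen for illustration, and the numbers are the example values given above.

```python
# Example figures taken from the paragraph above.
PCM_RATE_BPS = 64_000    # uncompressed digitized voice
LPC_RATE_BPS = 2_400     # compressed rate per voice channel
MODEM_RATE_BPS = 9_600   # aggregate modem data stream


def compression_ratio(raw_bps: int, compressed_bps: int) -> float:
    """Ratio of the uncompressed bit rate to the compressed bit rate."""
    return raw_bps / compressed_bps


def channels_supported(stream_bps: int, channel_bps: int) -> int:
    """Number of whole compressed channels that fit in the data stream."""
    return stream_bps // channel_bps


if __name__ == "__main__":
    ratio = compression_ratio(PCM_RATE_BPS, LPC_RATE_BPS)
    channels = channels_supported(MODEM_RATE_BPS, LPC_RATE_BPS)
    print(f"compression ratio: about {ratio:.1f} to 1")
    print(f"channels per modem stream: {channels}")
```

With the example values, the compression is roughly 26.7 to 1, and four 2,400 bps channels exactly fill the 9,600 bps stream, leaving no spare capacity in this simplified accounting.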
The linear predictive coding technique employs digital filtering of the 64,000 bps digitized voice in digital resonating filters. Only digits representative of fundamental frequencies within the analog voice signal are selected in the compression to 2,400 bps. Speech is made up of pitch (voiced and unvoiced) and amplitude components, with the pitch being derived from the action of the human vocal cords. Pitch ranges vary with the talker: 50-150 Hz for adult males, 90-450 Hz for adult females and 125-575 Hz for children. Thus, many of the fundamental frequencies in the voice of a telephone talker lie below, and are eliminated from, the approximately 300-3000 Hz telephone bandwidth. However, fundamental pitch frequencies can be determined from, e.g., second or third harmonics. Thus, e.g., 360 Hz is the second harmonic of 180 Hz and the third harmonic of 120 Hz. It is thus possible in the linear predictive compression technique to transmit a compressed form of digitized speech representing the fundamental pitch frequencies, the voiced speech component and the unvoiced speech component, and to synthesize these at the receiver end to simulate actual speech, as is all well known in the art. In synthesizing speech at the receiver end, the unvoiced components are represented by white noise, i.e., random binary bits, which, when synthesized with the voiced components in the proper proportion, simulate actual speech when the resulting synthesized digital signals at the receiver end, now expanded to 64,000 bps, are passed through a digital-to-analog converter, as is also known in the art. The linear predictive coding technique uses a linear prediction algorithm, e.g., LPC-10.
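The harmonic relationship described above can be illustrated with simple arithmetic. The sketch below is not a pitch detector and is not the method of any particular LPC implementation; it merely divides an in-band harmonic by small integers and checks the result against the pitch ranges listed above. All names are hypothetical.

```python
# Pitch ranges in Hz, as listed in the paragraph above.
PITCH_RANGES_HZ = {
    "adult male": (50, 150),
    "adult female": (90, 450),
    "child": (125, 575),
}


def candidate_fundamentals(harmonic_hz: float, max_harmonic: int = 3) -> dict:
    """Map each harmonic number 2..max_harmonic to the fundamental it implies."""
    return {n: harmonic_hz / n for n in range(2, max_harmonic + 1)}


def talkers_for(fundamental_hz: float) -> list:
    """List the talker categories whose pitch range contains the fundamental."""
    return [
        name
        for name, (low, high) in PITCH_RANGES_HZ.items()
        if low <= fundamental_hz <= high
    ]


if __name__ == "__main__":
    for n, f0 in candidate_fundamentals(360.0).items():
        print(f"harmonic {n}: fundamental {f0:.0f} Hz, plausible for {talkers_for(f0)}")
```

For the 360 Hz example, the candidates are 180 Hz (second harmonic) and 120 Hz (third harmonic), both of which lie below the approximately 300 Hz lower edge of the telephone band.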
Fundamentally, linear predictive coding generates a set of, e.g., 10 numbers (envelope prediction factors) per frame at the transmitter, based upon the actual data taken from the analog-to-digital conversion of the analog speech signal at 64,000 bps. These 10 numbers enable the receiver end to generate, by use of the linear predictive algorithm, a full set of, e.g., 180 points per frame, corresponding to, e.g., 64,000 bps digitized voice. The 10 numbers per frame, plus six bits representing pitch, six bits representing RMS gain and a sync bit, are transmitted every frame, which amounts to 2,400 bps. A frame in the example of the present invention is 22.5 milliseconds. The 10 numbers are generated in the transmitter from an analysis of the envelope of the digitized voice signal in the frequency domain, and enable the reconstruction of the envelope at the receiver end. Those skilled in the arts of speech synthesis and compression will recognize that the terms "gain" and "RMS gain" as used herein refer to a digitally coded speech synthesis parameter used in LPC and other speech synthesis techniques. In the speech synthesis and compression arts, these terms generally relate to the energy, power or signal level averaged over a short sample of speech. A more precise mathematical definition of the term may be found in numerous references related to speech analysis and synthesis using linear prediction techniques, e.g., "Digital Processing of Speech Signals," by Rabiner and Schafer, Prentice-Hall, Inc., 1978, pp. 396-455.
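The frame figures given above imply a fixed bit budget per frame, which the following sketch works out. The constants are the example values from the text; the derived quantities (bits per frame, sampling rate, bits per sample, and the bits remaining for the 10 numbers) are inferences from those values rather than figures stated in the text, and all names are hypothetical.

```python
# Example figures stated in the paragraph above.
BIT_RATE_BPS = 2_400       # compressed channel rate
FRAME_SECONDS = 0.0225     # 22.5 millisecond frame
SAMPLES_PER_FRAME = 180    # points regenerated per frame at the receiver
PCM_RATE_BPS = 64_000      # uncompressed digitized voice
PITCH_BITS = 6
GAIN_BITS = 6
SYNC_BITS = 1


def bits_per_frame() -> int:
    """Total bits transmitted in one frame at the compressed rate."""
    return round(BIT_RATE_BPS * FRAME_SECONDS)


def coefficient_bits() -> int:
    """Bits left for the 10 numbers after pitch, gain and sync (an inference)."""
    return bits_per_frame() - (PITCH_BITS + GAIN_BITS + SYNC_BITS)


def sample_rate_hz() -> int:
    """Sampling rate implied by 180 points per 22.5 ms frame."""
    return round(SAMPLES_PER_FRAME / FRAME_SECONDS)


def bits_per_sample() -> int:
    """Sample width implied by the 64,000 bps uncompressed rate."""
    return PCM_RATE_BPS // sample_rate_hz()


if __name__ == "__main__":
    print("bits per frame:", bits_per_frame())
    print("bits available for the 10 numbers:", coefficient_bits())
    print("implied sampling rate (Hz):", sample_rate_hz())
    print("implied bits per sample:", bits_per_sample())
```

Under these assumptions each frame carries 54 bits, of which 13 are pitch, gain and sync, leaving 41 bits to be divided among the 10 numbers. How those 41 bits are allocated among the individual numbers is not stated above and is not assumed here.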
There are presently in use both two-wire and four-wire telephone transmission links. In the two-wire system, analog speech or data signals are transmitted in both directions over the same two wires. In the four-wire system there are two signal wires, each with an associated ground wire, e.g., one for transmitting and one for receiving.
Because of processing delays inherent in the compression/expansion of digitized voice, echo suppression is of crucial importance. The prior art has not adequately solved this problem.
The present invention relates to echo suppression, including a software implementation that enables suppression of the same echo signal both at its originating end of the transmission link and at the receiver end, and also including use of, e.g., an operational amplifier hybrid circuit.
The problems enumerated in the foregoing have not been intended to be exhaustive, but rather are representative of problems which have tended to impair the effectiveness of limited bandwidth, e.g., telephone bandwidth, digital voice transmission apparatus used in the prior art, particularly those using multichannel transmission within the telephone bandwidth. Other noteworthy problems may also exist; however, those presented above should be sufficient to demonstrate that limited bandwidth, e.g., telephone bandwidth digital voice transmission apparatus appearing in the prior art have not been altogether satisfactory.
Similarly, the foregoing examples of the more important features of the present invention have been given rather broadly in order that the detailed description thereof which follows may be better understood and the contribution to the art better appreciated. There are, of course, additional features of this invention that will be described hereinafter and which will form the subject matter of the appended claims. These other features of the present invention will become apparent with reference to the following detailed description of a preferred embodiment of the invention in connection with the accompanying drawings, wherein like reference numerals have been applied to like elements, in which: