1. Field of the Invention
The present invention relates to digital voice transmission over telephone lines, in which a digitized voice signal of, e.g., 64,000 bits per second (bps) is compressed to, e.g., 2,400 bps for transmission within the telephone bandwidth. An automatic gain control circuit is employed to facilitate the operation of the digitized voice compression circuits, and echo suppression is used to improve system operation.
2. Background and Summary of the Invention
It is well known in the prior art to digitize the analog signal output of, e.g., a telephone microphone, representing a voice signal, in order to transmit the digital data over a telephone line. The digitized signal, which may be converted back to an analog representation of the digital data for the purpose of transmission over the telephone lines, as is also well known in the art, is less susceptible to noise on the telephone line, is capable of multiplexed channel operation in the telephone bandwidth, reduces crosstalk and enables relatively easy digital encryption for secure transmission.
The digitized voice signal, which typically is at, e.g., 64,000 bps, cannot readily be sent within the available approximately 3,000 Hz bandwidth of the telephone lines, either as a single channel or as multiple channels at that bit rate. To enable more convenient transmission and/or multiplexed transmission, compression of the 64,000 bps digitized voice signal is employed, as is known in the art. One method of compressing the 64,000 bps to, e.g., 2,400 bps is to use a linear predictive coding technique known in the art, a discussion of which is found, for example, in Markel, Gray, Jr. & Wakita, "Linear Prediction of Speech Theory and Practice," Speech Communication Research Laboratory, Inc. Monograph No. 10 (1978), the disclosure of which is hereby incorporated by reference. With compression to 2,400 bps, four simultaneous channels of 2,400 bps each can be multiplexed via a modem onto a 9,600 bps data stream transmitted over the bandwidth of the analog telephone lines.
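The channel arithmetic in the preceding paragraph can be checked with a short sketch. The bit rates are those given in the text; the function name is hypothetical, and framing or modem overhead is assumed to be zero.

```python
# Bit rates taken from the text; all names here are illustrative only.
PCM_RATE_BPS = 64_000    # uncompressed digitized voice
LPC_RATE_BPS = 2_400     # compressed voice channel
MODEM_RATE_BPS = 9_600   # modem data stream on the analog line


def channels_per_stream(stream_bps: int, channel_bps: int) -> int:
    """Whole channels that fit in the stream (assumes no framing overhead)."""
    return stream_bps // channel_bps


compression_ratio = PCM_RATE_BPS / LPC_RATE_BPS

print(round(compression_ratio, 1))                         # 26.7
print(channels_per_stream(MODEM_RATE_BPS, LPC_RATE_BPS))   # 4
print(channels_per_stream(MODEM_RATE_BPS, PCM_RATE_BPS))   # 0
```

The last line shows why compression is needed: not even one uncompressed 64,000 bps channel fits in the 9,600 bps stream.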
The linear predictive coding technique employs digital filtering of the 64,000 bps digitized voice in digital resonating filters. Only digits representative of fundamental frequencies within the analog voice signal are selected in the compression to 2,400 bps. Speech is made up of pitch (voiced and unvoiced) and amplitude components, with the pitch being derived from the action of the human vocal cords. Pitch ranges are 50-150 Hz for adult males, 90-450 Hz for adult females and 125-575 Hz for children. Thus, many of the fundamental frequencies in the voice of a telephone talker fall below the approximately 300-3000 Hz telephone bandwidth and are eliminated. However, fundamental pitch frequencies can be determined from, e.g., second or third harmonics. Thus, e.g., 360 Hz is the second harmonic of 180 Hz and the third harmonic of 120 Hz. It is thus possible in the linear predictive compression technique to transmit a compressed form of digitized speech representing the fundamental pitch frequencies, the voiced speech component and the unvoiced speech component, and to synthesize these at the receiver end to simulate actual speech, all as is well known in the art. In synthesizing speech at the receiver end, the unvoiced components are represented by white noise, i.e., random binary bits, which, when synthesized with the voiced components in the proper proportion, simulate actual speech when the resulting synthesized digital signals at the receiver end, now expanded to 64,000 bps, are passed through a digital-to-analog converter, as is also known in the art. The linear predictive coding technique uses a linear prediction algorithm, e.g., LPC-10.
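The relationship between an in-band harmonic and its possible fundamentals can be sketched as follows. This is a minimal illustration, not the pitch detector of LPC-10: the function name and parameters are hypothetical, and the 50-575 Hz search range is simply the union of the pitch ranges given above.

```python
def candidate_fundamentals(harmonic_hz, pitch_range_hz=(50.0, 575.0), max_harmonic=3):
    """Map harmonic number n to harmonic_hz / n for each n whose
    fundamental lies inside the assumed pitch range."""
    low, high = pitch_range_hz
    candidates = {}
    for n in range(1, max_harmonic + 1):
        f0 = harmonic_hz / n
        if low <= f0 <= high:
            candidates[n] = f0
    return candidates


# A 360 Hz component lies inside the 300-3000 Hz telephone band.
print(candidate_fundamentals(360.0))  # {1: 360.0, 2: 180.0, 3: 120.0}
```

A single harmonic is ambiguous, as the three candidates show. A practical detector would need further evidence, e.g., the spacing between adjacent harmonics, which equals the fundamental.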
Fundamentally, linear predictive coding generates a set of, e.g., 10 numbers (envelope prediction factors) per frame at the transmitter, based upon the actual data taken from the analog-to-digital conversion of the analog speech signal at 64,000 bps. These 10 numbers enable the receiver end to generate, by use of the linear predictive algorithm, a full set of, e.g., 180 points per frame, corresponding to, e.g., 64,000 bps digitized voice. The 10 numbers per frame, plus six bits representing pitch, six bits representing RMS gain and a sync bit, are transmitted every frame, which amounts to 2,400 bps. A frame in the example of the present invention is 22.5 milliseconds. The 10 numbers are generated in the transmitter from an analysis of the envelope of the digitized voice signal in the frequency domain, and enable the reconstruction of the envelope at the receiver end.
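The frame figures above are mutually consistent, as the following sketch shows. The frame length, point count, bit rates and pitch, gain and sync bit counts are from the text. The 8-bit sample width is an assumption, chosen because it is what makes 180 points per 22.5 ms frame equal 64,000 bps, and the bits left for the 10 numbers are derived here rather than stated in the text.

```python
# Integer arithmetic (microseconds) avoids floating-point rounding.
FRAME_US = 22_500          # 22.5 ms frame
SAMPLES_PER_FRAME = 180    # points per frame
BITS_PER_SAMPLE = 8        # ASSUMPTION: not stated in the text
CHANNEL_BPS = 2_400        # compressed bit rate
PITCH_BITS = 6
GAIN_BITS = 6
SYNC_BITS = 1

sample_rate_hz = SAMPLES_PER_FRAME * 1_000_000 // FRAME_US
pcm_rate_bps = sample_rate_hz * BITS_PER_SAMPLE
bits_per_frame = CHANNEL_BPS * FRAME_US // 1_000_000
coefficient_bits = bits_per_frame - (PITCH_BITS + GAIN_BITS + SYNC_BITS)
input_bits_per_frame = SAMPLES_PER_FRAME * BITS_PER_SAMPLE

print(sample_rate_hz)        # 8000
print(pcm_rate_bps)          # 64000
print(bits_per_frame)        # 54
print(coefficient_bits)      # 41 bits shared by the 10 numbers
print(input_bits_per_frame)  # 1440
```

Each frame thus replaces 1,440 input bits with 54 transmitted bits, of which 41 remain for the 10 envelope prediction factors under the stated bit counts.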
One problem with a technique like linear predictive coding is that the amplitude of the speech must be within certain limits for the prediction coding algorithm to properly analyze the digital representations of the analog speech signal and accurately generate the envelope prediction factors. Digitally controlled AGC circuits, including those in which the signals used for setting the gain of the AGC are maintained in a stored memory, are known in the art, as shown, e.g., in U.S. Pat. Nos. 4,213,097 to Chiu et al. (assigned to the assignee of the present application); 4,016,557 to Zitelli et al.; 3,464,022 to E. W. Locheed, Jr., et al.; 3,699,325 to Montgomery, Jr., et al.; 3,813,609 to Wilkes, et al.; and 3,562,504 to Harris. However, the actual gain value sent to the receiver at the end of the transmission link may not be accurate if gain has been adjusted at the transmitter end in order to keep the signal amplitude within the limits of the linear predictive coding analyzer. The prior art has not adequately solved these problems. U.S. Pat. No. 4,230,906 to Davis describes an AGC circuit for stabilizing speech waveforms for pitch. The function described is pitch detection and the stabilization is to provide a uniform input in pitch, i.e., uniform amplitude for the purpose of period detection, not for envelope detection.
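The gain-reporting problem described above can be illustrated with a toy example. This is only a sketch of the problem under simplifying assumptions: it does not represent the circuits of the cited patents, and the function names, the 0.25 target level and the 120 Hz test tone are hypothetical.

```python
import math


def rms(frame):
    """Root-mean-square level of one frame of samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def apply_agc(frame, target_rms):
    """Scale a frame so that its RMS equals target_rms.
    Returns the scaled frame and the gain that was applied."""
    level = rms(frame)
    gain = target_rms / level if level > 0 else 1.0
    return [x * gain for x in frame], gain


# One 180-point frame of a quiet 120 Hz tone sampled at 8,000 Hz.
quiet = [0.01 * math.sin(2 * math.pi * 120 * n / 8000) for n in range(180)]

scaled, gain = apply_agc(quiet, target_rms=0.25)
measured = rms(scaled)        # level seen by an analyzer placed after the AGC
recovered = measured / gain   # original level, available only if gain is known

print(measured > rms(quiet))                  # True
print(math.isclose(recovered, rms(quiet)))    # True
```

An analyzer placed after the AGC sees the target level rather than the talker's true level, so an RMS gain value computed there misstates the original amplitude unless the applied gain is also taken into account.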
There are presently in use both two-wire and four-wire telephone transmission links. In the two-wire system, analog speech or data signals are transmitted in both directions over two wires. In the four-wire system, there are two signal wires, each with an associated ground wire, e.g., one for transmitting and one for receiving.
Because of processing delays inherent in the compression/expansion of digitized voice, echo suppression is of crucial importance. The prior art has not adequately solved this problem.