In electronic signalling systems which communicate over the PSTN, echoes of transmitted signals (e.g., attenuated remnants of transmitted voice or tone signals) can appear along with received signals. This is due primarily to conversions between 4-wire and 2-wire circuits.
Connections to the PSTN are 2-wire circuits in which transmitted and received signals are simultaneously carried over a single pair of wires (e.g., the phone lines). The transmitted and received signals are superimposed upon one another (i.e., additively) such that a composite, full-duplex signal appears on the two wires, permitting simultaneous transmission and reception. In order to separate received signals from transmitted signals, a 4-wire to 2-wire conversion circuit is employed. This conversion circuit is commonly called a "hybrid," and operates by subtracting the transmitted signal from the composite (transmitted and received) signal so that only the received signal remains.
Hybrid circuits, however, are not perfect, and some amount of the transmitted signal usually leaks through into the received signal. For voice-only telephone equipment, this does not pose much of a problem. In fact, some near-end feedback (or echo) of one's own voice (often referred to as "sidetone") is considered highly desirable in telephone handsets, and is specifically designed into virtually all telephones. For other communications equipment (e.g., fax machines, modems, and voice-response systems), however, such reflections are not desirable, and it is essential to suppress as much of the transmitted signal as possible in the received signal.
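The subtraction performed by a hybrid, and the effect of an imperfect hybrid, can be illustrated with a small numerical sketch. The function name `hybrid_receive`, the `leakage` parameter, and the sample signals are hypothetical and chosen only for illustration; a real hybrid is an analog circuit whose leakage varies with frequency and line impedance.

```python
import math

def hybrid_receive(tx, rx, leakage=0.0):
    """Model the receive side of a hybrid (illustrative assumption).

    The 2-wire line carries tx + rx. The hybrid subtracts a copy of tx,
    but an imperfect hybrid removes only (1 - leakage) of it.
    """
    composite = [t + r for t, r in zip(tx, rx)]  # full-duplex line signal
    return [c - (1.0 - leakage) * t for c, t in zip(composite, tx)]

n = 8
tx = [math.sin(2 * math.pi * 0.10 * k) for k in range(n)]        # outgoing signal
rx = [0.1 * math.cos(2 * math.pi * 0.25 * k) for k in range(n)]  # weaker incoming signal

ideal = hybrid_receive(tx, rx)               # perfect hybrid: only rx remains
leaky = hybrid_receive(tx, rx, leakage=0.2)  # 20% of tx leaks into the receive path
echo = [l - r for l, r in zip(leaky, rx)]    # the near-end echo component
```

With `leakage=0.2`, the echo component is simply a scaled copy of the transmitted signal, which here is larger than the incoming signal it accompanies.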
FIG. 1 is a schematic of a simple telephone (systems) hybrid 100. The hybrid 100 is made up of two transformers 110 and 120. The transformer 110 has two identical primary windings 112 and 114 and a single secondary winding 116. The secondary winding 116 connects to the 2-wire PSTN. The transformer 120 has a primary winding 122 connected in series with the primary winding 112 of the transformer 110 in a 2-wire transmit circuit 130. The transformer 120 also has a secondary winding 124 connected in series with the primary winding 114 of the transformer 110 in a 2-wire receive circuit 132. Any transmit signal in the transmit circuit 130 passes through the primary winding 112 of the transformer 110 and through the primary winding 122 of the transformer 120. The transmit signal passing through the primary winding 112 causes a similar transmit signal to be imposed upon the 2-wire PSTN circuit. This transmit signal also appears in a composite received signal at the winding 114 in the 2-wire receive circuit. The secondary winding 124 of the transformer 120 is connected such that an induced signal therein (caused by the transmit signal passing through the primary winding 122) "bucks" (or cancels) the transmit signal in the 2-wire receive circuit 132, such that most of the transmitted signal from the 2-wire transmit circuit 130 is eliminated from the 2-wire receive circuit 132. The hybrid circuit of FIG. 1 is merely representative of hybrid circuits in general. Other hybrid circuits have been used and are well known to those of ordinary skill in the art.
Echo-cancellation systems are well known to those of ordinary skill in the art, and include a wide variety of techniques for cancelling single or multiple echoes of varying intensity and delay. One of the best known applications of such techniques is the use of echo-cancellation to eliminate far-end audible echoes in voice telephony. Another well-known application of echo-cancellation is the elimination of both near-end and far-end echoes in data modems. These techniques generally require highly-sophisticated adaptive digital echo-cancellation algorithms which can be extremely computation-intensive.
In tonal signalling systems, particularly DTMF (Dual-Tone Multi-Frequency, also known as "Touch-Tone") signalling systems, such as voice messaging and voice response systems, it is highly desirable that tonal signal detection be accomplished at the same time as other information (usually a voice message) is being transmitted so that the tonal signal (e.g., a Touch-Tone button press) can be used to interrupt the transmitted information. That is, the tonal signalling system is expected to operate in a full-duplex mode. This is quite unlike the typical PSTN, where DTMF signalling (dialling) occurs without interference from any other significant signal source in a half-duplex mode of operation.
Many techniques are known for detecting sinusoids in general and DTMF signals in particular. One such technique employs Goertzel's algorithm, an efficient recursive method of evaluating an individual term of a discrete Fourier transform, to detect the presence of a sinusoidal signal at a given frequency. Goertzel's algorithm can be applied repetitively, once per frequency, to detect each of the DTMF frequencies.
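A minimal sketch of Goertzel-based DTMF detection follows. The 8000 Hz sample rate, the 205-sample block length, and the function names are assumptions made for illustration; the sketch picks the strongest low-group and high-group frequencies and omits the threshold and twist checks that a practical detector would need.

```python
import math

# The eight standard DTMF frequencies in Hz: four low-group, four high-group.
DTMF_FREQS = (697, 770, 852, 941, 1209, 1336, 1477, 1633)

def goertzel_power(samples, freq, sample_rate):
    """Return the squared magnitude of the DFT bin nearest to `freq`."""
    n = len(samples)
    k = round(n * freq / sample_rate)              # nearest DFT bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s1 = s2 = 0.0
    for x in samples:                              # second-order recursion
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def detect_dtmf_pair(samples, sample_rate=8000):
    """Apply Goertzel once per DTMF frequency; return the strongest pair."""
    powers = {f: goertzel_power(samples, f, sample_rate) for f in DTMF_FREQS}
    low = max(DTMF_FREQS[:4], key=powers.get)
    high = max(DTMF_FREQS[4:], key=powers.get)
    return low, high

# Synthesize the "5" key (770 Hz + 1336 Hz) and detect it.
fs, n = 8000, 205
tone = [math.sin(2 * math.pi * 770 * t / fs) + math.sin(2 * math.pi * 1336 * t / fs)
        for t in range(n)]
pair = detect_dtmf_pair(tone, fs)
```

Each call to `goertzel_power` costs one multiply and two additions per sample, which is why the algorithm is attractive when only a handful of frequencies are of interest.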
In full-duplex tonal signalling systems, a principal source of difficulty in detecting DTMF signals is near-end echo (which has a relatively short delay time associated with it). The tonal signalling source (e.g., a DTMF telephone) is at the far-end of the PSTN and any tonal signals originating therefrom must pass through all of the attenuation sources in the network. Any far-end echoes of signals transmitted from the near-end of the network must also pass through the same attenuations. As a result, the effective "signal-to-noise" ratio of tonal signal to far-end echo is relatively good, and far-end echo is not a significant contributor to tonal detection errors. Relatively larger near-end echoes, however, are likely to adversely affect tonal signal detection and can only be dealt with effectively by an echo-cancellation scheme.
Generally speaking, echo cancellation schemes attempt to characterize the echoes of a transmitted signal by correlating a composite signal (which includes the transmitted signal and echoes thereof) with the transmitted signal to determine the nature and delay of the echoes. The echoes (or a subset of the echoes) are then eliminated from the composite signal by creating "duplicate" (virtual) echoes and by cancelling (e.g., subtracting) them from the composite signal. Such echo cancellation schemes attempt to eliminate both near-end and far-end echoes of the transmitted signal.
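The correlate-then-subtract idea can be sketched for the simplest case of a single echo. The function names, the single-echo model, and the test signals below are assumptions for illustration only; practical cancellers model the echo path with many coefficients and adapt them continuously.

```python
import random

def estimate_echo(tx, composite, max_delay):
    """Correlate tx with composite; return (delay, gain) of the strongest echo."""
    best_delay, best_corr = 0, 0.0
    for d in range(max_delay + 1):
        corr = sum(tx[i - d] * composite[i] for i in range(d, len(composite)))
        if abs(corr) > abs(best_corr):
            best_delay, best_corr = d, corr
    # Least-squares gain for the chosen delay: correlation / aligned tx energy.
    energy = sum(tx[i - best_delay] ** 2 for i in range(best_delay, len(composite)))
    return best_delay, best_corr / energy

def cancel_echo(tx, composite, delay, gain):
    """Subtract a synthesized ("duplicate") echo from the composite signal."""
    return [c - (gain * tx[i - delay] if i >= delay else 0.0)
            for i, c in enumerate(composite)]

rng = random.Random(0)
n = 2000
tx = [rng.uniform(-1, 1) for _ in range(n)]          # transmitted signal
far = [0.05 * rng.uniform(-1, 1) for _ in range(n)]  # stand-in for the wanted signal
true_delay, true_gain = 7, 0.3                       # hypothetical echo path
composite = [far[i] + (true_gain * tx[i - true_delay] if i >= true_delay else 0.0)
             for i in range(n)]

delay, gain = estimate_echo(tx, composite, max_delay=32)
residual = cancel_echo(tx, composite, delay, gain)
```

After cancellation, `residual` is dominated by the wanted signal rather than by the echo of the transmitted signal.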
In voice response systems which incorporate speech recognition, it is likely that "command" words (which are subject to action when recognized) will occur in the outgoing message from the voice response system. If large-amplitude near-end echoes of these command words are not cancelled (i.e., echo-cancelled), then the speech recognition apparatus will recognize and act upon them as though they were received signals (rather than echoes of transmitted signals), causing undesired (and typically erroneous) results.
One approach to near-end echo cancellation is described in "Fast Echo Cancellation in a Voice-Processing System," by Vijay R. Raman and Mark R. Cromack, IEEE Publication Number 0-7803-0532-9/92, September 1992, at pages IV-513 through IV-516. FIG. 2 is a block diagram of the adaptive echo cancellation system 200 described therein.
In FIG. 2, the adaptive echo-cancellation system 200 consists of an echo canceller 210, a speech decoder (transmitter) 290, and two or more receivers (two are shown) 270 and 280. The receiver 270 is a DTMF decoder and the receiver 280 is a speech recognizer. Receive line 212 and transmit line 214 are assumed to come from a system hybrid. The receive line 212 carries a receive signal which has remnants (echoes) of a transmitted signal sent out over the transmit line 214. (Compare with receive and transmit circuits 132 and 130, respectively, in hybrid 100, FIG. 1).
The echo-canceller 210 includes two separate filters, i.e., an adapter filter and a canceller filter. The adapter filter includes an adaptive control function 220, an adapt/window module 230 and a difference function (e.g., adder) 250. The canceller filter includes a cancel module 240 and a difference function (e.g., adder) 260. The adapter filter (220, 230, 250) provides essentially a system identification function, because it does not adapt in real-time on all samples of receive and transmit data. The adaptation operates only on buffered frames of time-aligned transmit and receive data. The completion of adaptation for a frame of data is spread out in time over a number of elapsed frames.
The adaptive control function 220 and adapt/window module 230 form an adaptive filter which determines the appropriate delay and coefficients to be used for cancellation by monitoring the transmit and receive lines 214 and 212, respectively, and producing filter coefficients which, when applied to the transmit signal on the transmit line 214, produce an adaptive filter output which closely matches the transmit signal echo in the receive signal on the receive line 212. This adaptive filter output is then subtracted from the received signal via the difference function 250. The difference is then monitored by the adapt/window module 230, which tunes the filter coefficients for a minimum difference signal. The adapter filter (220, 230, 250) has available more and higher-resolution coefficients than the canceller filter (240, 260). The adapt/window module 230 includes a windowing function which selects a subset of the available filter coefficients and delay constants based upon an energy concentration technique. Using this technique, a small set of coefficients and delay constants is selected to have the greatest effect on the highest energy components of the filtered signal. Filter coefficients and delay constants which affect only low-energy signal components are discarded. This effectively produces a filter which "windows" or selectively targets only the highest energy components (i.e., the largest amplitude reflections) in the filtered signal. The "windowed" filter coefficients and delay values are then passed to the cancel module 240, which uses them to produce an echo-cancellation signal. The echo-cancellation signal is subtracted from the received signal in the difference function 260 in an open-loop fashion.
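The adapt, window, and cancel stages described above can be sketched as follows. The reference does not specify the adaptation rule or the exact windowing criterion in the text reproduced here, so this sketch substitutes a normalized LMS update and a "contiguous run of taps with the most coefficient energy" window as stand-ins. All function names, the step size `mu`, and the echo path are hypothetical.

```python
import random

def nlms_identify(tx, rx, taps, mu=0.5, eps=1e-8):
    """Adapt FIR coefficients so the filtered tx matches the echo in rx (NLMS)."""
    w = [0.0] * taps
    for n in range(taps - 1, len(tx)):
        x = [tx[n - k] for k in range(taps)]      # newest transmit sample first
        y = sum(wk * xk for wk, xk in zip(w, x))  # adaptive filter output
        e = rx[n] - y                             # difference signal to minimize
        norm = sum(xk * xk for xk in x) + eps
        w = [wk + mu * e * xk / norm for wk, xk in zip(w, x)]
    return w

def window_coefficients(w, width):
    """Keep the contiguous run of `width` taps holding the most energy.

    Returns (delay, coefficients); taps outside the window are discarded.
    """
    start = max(range(len(w) - width + 1),
                key=lambda d: sum(c * c for c in w[d:d + width]))
    return start, w[start:start + width]

def cancel(tx, rx, delay, coeffs):
    """Open-loop cancellation: subtract the windowed echo estimate from rx."""
    out = []
    for n, r in enumerate(rx):
        est = sum(c * tx[n - delay - k]
                  for k, c in enumerate(coeffs) if n - delay - k >= 0)
        out.append(r - est)
    return out

rng = random.Random(1)
n = 3000
tx = [rng.uniform(-1, 1) for _ in range(n)]
# Hypothetical 16-tap echo path with its energy concentrated at taps 5 to 7.
echo_path = [0.0] * 5 + [0.4, -0.2, 0.1] + [0.0] * 8
rx = [sum(h * tx[i - k] for k, h in enumerate(echo_path) if i - k >= 0)
      for i in range(n)]

w = nlms_identify(tx, rx, taps=16)                # full-resolution adapter filter
delay, coeffs = window_coefficients(w, width=3)   # small set passed to canceller
residual = cancel(tx, rx, delay, coeffs)
```

The canceller here evaluates only three multiplications per sample instead of sixteen, which is the computational saving that windowing is intended to provide.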
In the echo-cancellation scheme shown and described above with respect to FIG. 2, adaptation is performed only off-line, in a non-real-time manner, on buffers that pass a minimum power requirement. Such a scheme has several disadvantages. First, because the adaptation can only be performed off-line, completely separate filters are required for adaptation and cancellation. If implemented in a DSP (Digital Signal Processor), this would mean that separate program memory and coefficient storage are required for each of the two filters. Second, since adaptation and cancellation do not occur in parallel, there must be a "line acquisition" phase during which the process of adaptation occurs. During this phase, there can be no communication, and consequently no DTMF or other tonal signal detection.