Echo-canceling systems and methods are often required in communication systems in order to prevent users from receiving a copy of their own speech signal. It will be understood that although these systems and methods are referred to herein as "echo-canceling", they do not typically completely eliminate echo. Rather, they reduce the echo below the level that would be present without the echo-canceling system or method.
An echo can be disturbing if it is received by the user at a high level. The precise level at which the echo becomes disturbing is inversely related to the delay in the echo path. If the delay is short, a relatively high level of echo is actually desirable; hence the "sidetone" produced in a normal telephone handset. However, as the delay increases beyond 30-50 milliseconds, even minute echo signals can become very annoying to the user and disruptive to conversation.
In the public switched telephone network (PSTN), a principal source of echo is the impedance mismatch appearing at the hybrids used to interface two-wire and four-wire equipment. In the simplified PSTN telephone connection shown in FIG. 1, hybrid 120 directs signal energy arriving from telephone 100 via two-wire segment 110 to the two-wire segment 180 of telephone 150 via PSTN 130, without allowing the signal to return to telephone 100. The impedance mismatch at hybrid 140 results in an echo signal 160 being returned to telephone 100. This is the source of so-called "talker echo." Echo 170 is similarly produced at telephone 150.
The subjective effect of the talker echo depends upon the length of the transmission channel between the two hybrids 120 and 140. The human ear readily tolerates echoes with short delays (for example, 20-30 milliseconds). Therefore, echoes returning within this time period are not annoying even if the level of the echo is, for example, only 6 dB below the speaker's voice. If it is assumed that the effect is the same for both ends of the connection, each will experience a similar echo. For satellite communications, where the delay can be as much as 540 milliseconds, even an echo 30-40 dB below the speaker's voice can be very annoying. Because of this, much effort has been directed towards cancellation of such echoes.
Referring now to FIG. 2, for reasons of safety, users of vehicular (e.g. automotive) mobile radio telephones 250 often make use of the so-called "hands-free" configuration in vehicle 200. When in "hands-free" mode, the user 210 speaks into an external microphone 220 (shown by arrow 270) which is mounted in the vehicle interior 305, and listens to a remote loudspeaker 230 which is also mounted in the vehicle interior 305, thereby keeping the user's hands free to operate the vehicle. The microphone 220 is usually attached to the vehicle's sun visor or otherwise located in close proximity to the user's mouth, and the loudspeaker 230 is preferably located behind the microphone 220. If the microphone 220 has good directivity, this arrangement reduces the direct acoustic coupling 260 between microphone 220 and loudspeaker 230.
When, as illustrated in FIG. 2, a connection exists between a "hands-free" mobile radio and the PSTN 225, via radio link 295, base station 245 and mobile telephone switching office (MTSO) 235, there is an acoustic echo in addition to the signal echoes inherent in the PSTN. Because of the enclosed and confined nature of a vehicle interior 305, the microphone 220 not only receives the desired voice signal 270 but also receives acoustic signals from the loudspeaker 230. These acoustic echoes reach the microphone 220 at varying signal levels and delays depending upon the path traveled, such as path 280. It is not unusual to find that echoes reach the microphone 220 with little more than 6 dB of attenuation or path loss.
Still referring to FIG. 2, the acoustic echo generated by the hands-free equipment is most bothersome to a caller on telephone 255 in the PSTN 225. In a digital cellular system there is a brief processing delay which results from the finite time it takes to demodulate the digitized speech and to then reconstitute it into an analog voice signal. In normal conversation (without echoes) this delay is nearly imperceptible. However, if there is an acoustic echo, the caller on telephone 255 hears an echo of his own voice at a significant level and with a lengthy (150-200 millisecond) delay, thus making normal conversation very difficult. This is not limited to a PSTN connection, but will also exist if the connection occurs between two vehicles 200a, 200b using cellular system 300 as shown in FIG. 3. In fact, if both vehicles make use of a hands-free system, both users will experience an echo of their own voice.
A conventional approach to canceling acoustic echoes is to use voice-switched attenuators as illustrated by the simplified schematic diagram of FIG. 4. Voice activity detectors (VAD) 420, 425 detect when speech signals are present at the VAD input. With regard to FIG. 4, input signals from the microphone 220 are coupled to one VAD 420, and signals intended for the loudspeaker 230 are coupled to another VAD 425. The output of each VAD 420, 425 is coupled to a decision block 460 whose function is to decide which, if any, of the two attenuators 410a, 410b should be applied. If a speech signal is present at the output of the microphone 220 and no speech signal is being directed to the loudspeaker 230, then the attenuator 410a, located between the radio transceiver 250 and the loudspeaker 230, is activated, thereby preventing any sound from coming out of the loudspeaker 230 while the microphone 220 is active. The result is essentially half-duplex communication.
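The decision logic of such a voice switch can be illustrated with a minimal Python sketch. It is not taken from FIG. 4 itself. Only the first case (microphone active, loudspeaker path silent) is described above; the handling of the remaining cases, the function name `voice_switch_gains`, and the 45 dB default (chosen from the 40-50 dB range commonly associated with voice switching) are assumptions made for illustration.

```python
from typing import Tuple


def voice_switch_gains(mic_active: bool,
                       speaker_active: bool,
                       attenuation_db: float = 45.0) -> Tuple[float, float]:
    """Return linear gains (loudspeaker_path_gain, microphone_path_gain).

    mic_active / speaker_active stand in for the outputs of VAD 420 and
    VAD 425. A gain of 1.0 means the corresponding attenuator is inactive.
    """
    attenuated = 10.0 ** (-attenuation_db / 20.0)  # dB to linear amplitude

    if mic_active and not speaker_active:
        # Case described in the text: attenuator 410a silences the loudspeaker.
        return attenuated, 1.0
    if speaker_active and not mic_active:
        # Assumption: the mirror case attenuates the microphone path (410b).
        return 1.0, attenuated
    if mic_active and speaker_active:
        # Assumption: on double talk, arbitrarily favor the far-end talker.
        return 1.0, attenuated
    # Neither side is talking: assume no attenuator is applied.
    return 1.0, 1.0


speaker_gain, mic_gain = voice_switch_gains(mic_active=True, speaker_active=False)
```

Because at most one direction passes unattenuated at any moment, the sketch makes the half-duplex character of the approach explicit.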
Experience with current "hands-free" cellular telephones shows that severe difficulties arise when using voice-switched attenuators in a mobile environment. Voice activity detectors require a finite time to decide whether or not speech is present, hence they may lead to the clipping of the first few speech syllables. In addition, the ambient noise in a mobile environment makes it very difficult for most known voice activity detectors to operate reliably. Voice-switched attenuator systems are described in a publication by Burnett et al. entitled Echo Cancellation in Mobile Radio Environments, IEE Colloquium on Digitized Speech Communication via Mobile Radio, (Digest No. 139) pp. 7/1-4, December 1988.
In digital radiotelephone systems, a conventional approach to eliminate signal echoes, such as those which occur within the PSTN, is to use some form of echo canceler. An echo-canceler can also be used in analog systems. A conventional echo-canceler is shown in FIG. 5. A variable finite, or infinite, impulse response (FIR/IIR) filter 550 is used to construct a model of the echo path. A lattice filter can also be used. The filter parameters may be determined using an adaptive approach, or may be fixed, depending upon the echo environment. The echo canceler generates an echo replica by passing part of the acoustic signal 525 intended for the loudspeaker 230 through the filter 550. The signal replica is passed to a summing junction 540 where it is subtracted from the microphone's composite signal. If the replica is perfect, then only the desired voice signal remains in the output 590 and the echo is thereby removed.
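The replica-and-subtract structure described above can be sketched in a few lines of Python. The text does not name a particular adaptation rule; the sketch assumes the normalized least-mean-squares (NLMS) update, which is one common choice for an adaptive FIR filter. The function names, the three-coefficient echo path, the tap count and the step size are all hypothetical values chosen for illustration, and the simulated path is linear and static, which is the idealized case.

```python
import math
import random


def nlms_echo_canceler(far_end, mic, num_taps=8, step=0.5, eps=1e-8):
    """Subtract an adaptive FIR echo replica from the microphone signal.

    far_end: samples intended for the loudspeaker (input to the filter).
    mic:     microphone samples (echo plus any near-end speech).
    Returns the residual samples left after the summing junction.
    """
    weights = [0.0] * num_taps   # FIR model of the echo path
    history = [0.0] * num_taps   # recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        replica = sum(w * h for w, h in zip(weights, history))
        e = d - replica          # summing junction: microphone minus replica
        power = sum(h * h for h in history) + eps
        # NLMS update (assumed rule): step scaled by the input power.
        weights = [w + step * e * h / power for w, h in zip(weights, history)]
        residual.append(e)
    return residual


def erle_db(mic, residual):
    """Echo return loss enhancement: echo power over residual power, in dB."""
    num = sum(d * d for d in mic)
    den = max(sum(e * e for e in residual), 1e-300)  # guard against log(0)
    return 10.0 * math.log10(num / den)


# Demonstration with a hypothetical, noise-free, linear echo path.
random.seed(0)
far_end = [random.uniform(-1.0, 1.0) for _ in range(4000)]
echo_path = [0.5, -0.3, 0.1]     # illustrative coefficients only
mic = [sum(c * far_end[n - k] for k, c in enumerate(echo_path) if n - k >= 0)
       for n in range(len(far_end))]
residual = nlms_echo_canceler(far_end, mic)
erle = erle_db(mic[-1000:], residual[-1000:])  # measured after convergence
```

In this idealized simulation the filter can match the echo path almost exactly, so the measured enhancement is far larger than would be expected when the path is time-varying or nonlinear.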
Echo cancelers can be very effective in applications where the echo channel does not vary. If the echo channel is dynamic, it is difficult for the filter 550 to track the channel. It becomes even more difficult to track the echo channel if it contains nonlinear components. Echo cancelers are effectively used in high speed modems and facsimile machines since the echo channel in the PSTN is essentially linear and not very dynamic.
In a vehicle, however, the acoustic echo channel is both dynamic and nonlinear as a result of the loudspeaker, codecs, power amplifiers and other nonlinearities encountered in the acoustic path. Traditional echo cancelers suffer when confronted with nonlinearities in the echo path. Limited resolution in the A/D and D/A converters 510 and 520, respectively, coupled with a coarse sampling rate, can also degrade performance. Moreover, if the acoustic path is long, the number of taps in filter 550 begins to increase to a point where heavy demands are placed on the processing capacity of most commercially available digital signal processors.
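The growth in tap count with path length can be made concrete with a rough calculation. The sketch below assumes an 8 kHz sampling rate (the usual narrowband telephony rate, not stated in the text), a hypothetical 50 millisecond echo span, and a cost of about two multiply-accumulate operations per tap per sample (one for filtering, one for an LMS-style coefficient update). The function names are illustrative.

```python
import math


def fir_taps_needed(echo_span_ms: float, sample_rate_hz: int = 8000) -> int:
    """Taps required for an FIR filter to span the given echo duration."""
    return math.ceil(echo_span_ms * sample_rate_hz / 1000)


def macs_per_second(taps: int, sample_rate_hz: int = 8000) -> int:
    """Rough load: filtering plus coefficient update, about 2 MACs per tap."""
    return 2 * taps * sample_rate_hz


taps = fir_taps_needed(50)       # 50 ms span at 8 kHz gives 400 taps
load = macs_per_second(taps)     # 6,400,000 multiply-accumulates per second
```

Doubling the echo span doubles both figures, which is why long acoustic paths quickly strain a fixed processing budget.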
Because of these limitations, acoustic echo cancelers typically provide only limited enhancement in a hands-free vehicular radiotelephone. In most practical cases, an echo return loss enhancement of only 10-20 dB is provided. This is inadequate for echoes with lengthy delays such as experienced in cellular systems. Echo cancellation in vehicle radiotelephone systems is described in the following publications: Acoustic Echo Cancellation for Full-Duplex Voice Transmission of Fading Channels, S. Park et al., Proc. of International Mobile Satellite Conference, Jun. 18-20, 1990; Acoustic Echo Cancellation for Loudspeaker Telephones, W. Hsu et al., IEEE, 1987, pp. 1955-1959; Full-Duplex Speakerphone with Acoustic and Electric Echo-Canceler Utilizing the DSP56200 Cascadable Adaptive FIR Filter Chip, S. Park, Proc. of Midcon/90 Technical Conference on Electronic and Electrical Technology, Sept. 11-13, 1990, pp. 1-5; Echo Cancellation and Applications, K. Murano et al., IEEE Communication Magazine, January 1990, pp. 49-55; and Simulation of an Adaptive Echo Canceler for Carphone Hands-Free Units, J. Noble, UK IT 88 Conference Proceedings, July, 1988, pp. 456-459.
In order to further cancel acoustic echoes in a vehicular radiotelephone system, an echo canceler has been combined with a shallow voice switch of the type described in FIG. 4. Shallow refers to the level of attenuation being 20-30 dB instead of the 40-50 dB normally used with voice switching systems. Such a combination is described, for example, in a publication by Armbruster entitled High Quality Hands-Free Telephony Using Voice Switch Optimized with Echo Cancellation, Signal Processing IV; Theories and Applications, Elsevier Science Publishers B.V., 1988, pp. 495-498. Although an improvement over existing techniques, this solution falls well short of emulating full-duplex communication.