The present invention relates to a system for reducing the returned echo in speech communication links with significant signal path delay, such as satellite links. More particularly, it relates to reducing the returned echo in such links resulting from the loudspeaker-to-microphone acoustical coupling in a room teleconference installation.
Coupling from the receive signal path to the transmit signal path at the near end facility (e.g. a telephone network central office termination or a room teleconference installation) of a communication link causes a far-end talker's speech to be returned to him. If the link has significant delay, such as about one-half second round trip for a typical satellite connection, he will perceive this as annoying echo. This echo can make talking difficult or even impossible.
The usual cause of the unwanted receive path-to-transmit path signal coupling is the imperfect balancing of a four-wire (separate transmit and receive signal paths) to two-wire (combined transmit and receive signal paths) conversion hybrid against the two-wire line at the end-point facility in the telephone network. The signal is typically coupled through this transhybrid path with an average attenuation of 6 to 20 dB over the telephone bandwidth (300 Hz to 3 kHz).
Coupling from the receive path to the transmit path also occurs in four-wire-only installations such as satellite video-teleconferencing facilities. The coupling occurs acoustically in the room when loudspeaker sound produced by the receive signal reaches open (active) microphones, which responsively produce an unwanted transmit signal. Due to the practical difficulty of obtaining acoustical isolation of the microphones from the loudspeaker, the attenuation from receive to transmit is never more than the minimum required for stability of the overall feedback loop formed by both ends of the conference: 3 to 6 dB at the worst-case frequency and 6 to 9 dB averaged over the wider satellite audio bandwidth. Acoustical reflections in the room result in a longer impulse response for the loudspeaker-to-microphone coupling than is the case with the electrical transhybrid path.
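The stability requirement mentioned above can be illustrated with a small decibel calculation. The following Python sketch is illustrative only: it assumes that the round-trip loop attenuation is simply the sum of the two end-point coupling attenuations less any net gain in the link, and the function names and the `link_gain_db` parameter are hypothetical, not taken from any cited reference.

```python
def db_to_amplitude_ratio(attenuation_db):
    """Convert an attenuation in dB to a linear amplitude ratio (output/input)."""
    return 10.0 ** (-attenuation_db / 20.0)


def loop_attenuation_db(near_end_db, far_end_db, link_gain_db=0.0):
    """Round-trip attenuation of the loop formed by both conference ends.

    Assumption: attenuations add in dB and link_gain_db is the net gain
    of the link over the whole round trip (0 dB by default).
    """
    return near_end_db + far_end_db - link_gain_db


def loop_is_stable(near_end_db, far_end_db, link_gain_db=0.0):
    """The loop cannot howl if the round-trip attenuation exceeds 0 dB."""
    return loop_attenuation_db(near_end_db, far_end_db, link_gain_db) > 0.0


# Two rooms, each with 3 dB of worst-case receive-to-transmit attenuation.
margin = loop_attenuation_db(3.0, 3.0)
```

Under these assumptions, two rooms at the 3 dB worst-case figure leave a 6 dB round-trip margin, which is enough for stability but does little to reduce the echo heard by the far-end talker.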
Similar techniques have been employed in the prior art to reduce both the electrical coupling and the acoustical coupling and, therefore, the resultant echo perceived by the far-end talker. Most often an "echo suppressor" device has been used. Its basic operation in relation to reducing the hybrid-induced echo has been described by Fang ("Voice Channel Echo Cancellation", IEEE Communications Magazine, Dec. 1983): "An echo suppressor is essentially a voice-operated electronic switch that compares the voice signals traveling in both directions during a long-distance conversation . . . The suppressor decides which person is talking at any given time, and blocks the signal traveling in the opposite direction." These direction and suppression decisions are made with syllabic-based time constants of at least 50 milliseconds. While this technique can eliminate echo during single-talk (only one end talking), there is a problematic tradeoff during double-talk (both ends talking) between "chopping" speech and reducing the amount of echo suppression (attenuation) applied.
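The switching behavior described by Fang can be sketched in a few lines of Python. This is a simplified illustration, not the circuit of any particular suppressor: it assumes the signal levels have already been measured per frame, it models the syllabic time constant as a minimum hold time between direction changes, and the function name and all parameter names (`frame_ms`, `hold_ms`, `suppression_db`) are hypothetical.

```python
def echo_suppressor_gains(receive_levels, transmit_levels,
                          frame_ms=10.0, hold_ms=50.0, suppression_db=40.0):
    """Return a list of per-frame (receive_gain_db, transmit_gain_db) pairs.

    The louder direction is taken to be the active talker and the
    opposite direction is attenuated by suppression_db. A direction
    change is allowed only after hold_ms has elapsed since the last one.
    """
    hold_frames = max(1, round(hold_ms / frame_ms))
    direction = "receive"              # assumed initial state
    frames_since_switch = hold_frames  # allow an immediate first switch
    gains = []
    for rx, tx in zip(receive_levels, transmit_levels):
        wanted = "transmit" if tx > rx else "receive"
        if wanted != direction and frames_since_switch >= hold_frames:
            direction = wanted
            frames_since_switch = 0
        else:
            frames_since_switch += 1
        if direction == "receive":
            gains.append((0.0, -suppression_db))   # block the transmit path
        else:
            gains.append((-suppression_db, 0.0))   # block the receive path
    return gains
```

In this sketch, a far-end talker who starts speaking shortly after the near end has seized the channel stays attenuated until the hold time expires. That delay is the source of the "chopping" during double-talk; reducing `suppression_db` lessens the chopping at the cost of letting more echo through.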
The suppressor function has also been utilized in "speakerphone" devices to reduce the strong loudspeaker-to-microphone coupling and so maintain feedback stability (prevent howling) when connected to the telephone system. The high level of suppression typically required (40 dB or more), even during double-talk, results in choppiness and poor interactivity. These problems are minimized in the teleconference system of Julstrom (U.S. Pat. No. 4,712,231) in conjunction with the automatic microphone control techniques of Anderson et al. (U.S. Pat. No. 4,489,442) or Julstrom (U.S. Pat. No. 4,658,425), all owned by the same entity as the present application. Multiple directional microphones are automatically "gated ON" (made active) in response to local (near-end) speech, allowing optimized acoustical design, minimized loudspeaker-to-microphone coupling, and reduced amounts of suppression (conversational directional controlled attenuation), possibly down to 0 to 6 dB in ideal rooms. In installations needing higher suppression settings of up to 30 dB, subjective choppiness is minimized as much as possible by the interactive, interrupt-priority direction switching logic.
This system can also be used with delayed communication links, as are present in satellite videoconferences. The microphones remain gated OFF in the absence of local speech, breaking the acoustical coupling path and ensuring the absence of echo during single-talk. However, one or more microphones are gated ON during local speech and for about one-half second thereafter, allowing echo to be returned if far-end speech is received during this time. Even in an acoustically good room requiring only 0 to 6 dB of suppression for feedback loop stability, approximately 24 to 30 dB of total suppression is needed to reduce the returned echo during this extended (by one-half second) double-talk interval to an acceptable level. When combined with the communication link delay, this results in a generally unacceptable degree of choppiness.
Another method of reducing the returned transhybrid signal uses digital signal processing techniques and is known as an "echo canceler". Fang (ibid) describes, "An echo canceler synthesizes a replica of the echo and subtracts it from the returned signal." This typically results in a 20 dB reduction of the returned echo with no suppression of the outgoing local speech signal. This reduction is not adequate for extended single-talk, though, so the remaining echo is removed by a nonlinear processor called a center clipper. Again, Fang (ibid) describes, "The center clipper is an energy detector which, in the presence of far-end speech, will set signals below a certain threshold to zero. The threshold can be either fixed at a low level or variable according to the level of the received far-end speech. If received speech is not present, or if near-end speech is present, the center clipper is bypassed." The action of a center clipper with an adaptive threshold is inherently bypassed in the absence of received (far-end) speech. The center clipper causes audible distortion of transmitted local (near-end) speech if it is not defeated while that speech is present.
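The center clipper rule quoted from Fang can be expressed as a minimal Python sketch. It assumes a fixed threshold and assumes that the far-end and near-end speech presence decisions are supplied by separate detectors not shown here; the function and parameter names are hypothetical.

```python
def center_clip(transmit, threshold, far_end_active, near_end_active):
    """Zero every transmit sample whose magnitude is below threshold.

    Per the quoted description, the clipper is bypassed when no far-end
    speech is being received or when near-end speech is present.
    """
    bypass = (not far_end_active) or near_end_active
    if bypass:
        return list(transmit)
    return [0.0 if abs(s) < threshold else s for s in transmit]


# Residual echo after cancellation (small samples) mixed with larger peaks.
residual = [0.5, -0.05, 0.02, -0.3]
clipped = center_clip(residual, 0.1, far_end_active=True, near_end_active=False)
# clipped == [0.5, 0.0, 0.0, -0.3]
```

Samples above the threshold pass unchanged while smaller ones are removed outright, which is why any local speech that reaches the clipper undetected is distorted rather than uniformly attenuated.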
It is important to note that while the threshold of an adaptive center clipper typically has a rapid rise time followed by a much slower fall time which varies at a syllabic or slightly sub-syllabic rate, the attenuation of the transmit signal to zero occurs instantaneously, based on the magnitude of the transmit signal's waveform in relation to the threshold value. This results in harmonic and intermodulation distortion of any speech present in the transmit signal, but allows more local speech energy to be transmitted than if the signal were suppressed (attenuated with syllabic time constants), thus yielding less choppiness. Since the center clipper totally attenuates the residual transmit signal in the absence of local speech, and an attempt is made to disengage it when local speech is detected, it would appear that the center clipper could be replaced by a suppressor if the suppressor were also defeated in the presence of local speech. The action of the center clipper is subjectively preferable to that of the suppressor, however, during the speech presence decision transitions and when weaker local speech fails to be detected.
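The distinction between the slowly varying threshold and the instantaneous clipping decision can be seen in the following Python sketch. It is an assumption-laden illustration: the threshold is modeled as a peak follower on the receive signal with an instantaneous rise and an exponential fall, scaled by an assumed coupling gain. The names `coupling_gain` and `fall` are hypothetical, and a real device would choose the fall constant to give a syllabic-rate decay at its sample rate.

```python
def adaptive_thresholds(receive, coupling_gain=0.5, fall=0.999):
    """Per-sample clipping thresholds derived from the receive signal.

    The envelope jumps up to any larger receive magnitude immediately
    (rapid rise) and otherwise decays by the factor `fall` per sample
    (slow fall). coupling_gain scales it to the expected echo level.
    """
    envelope = 0.0
    thresholds = []
    for r in receive:
        magnitude = abs(r)
        envelope = magnitude if magnitude > envelope else envelope * fall
        thresholds.append(coupling_gain * envelope)
    return thresholds


def adaptive_center_clip(transmit, receive, coupling_gain=0.5, fall=0.999):
    """Clip each transmit sample against the threshold for that instant."""
    thresholds = adaptive_thresholds(receive, coupling_gain, fall)
    return [0.0 if abs(t) < th else t for t, th in zip(transmit, thresholds)]


# A deliberately fast fall (0.5 per sample) to make the decay visible.
out = adaptive_center_clip([0.4, 0.3, 0.1], [1.0, 0.0, 0.0], fall=0.5)
# thresholds are 0.5, 0.25, 0.125, so out == [0.0, 0.3, 0.0]
```

Because the comparison is made sample by sample, the portions of a local speech waveform that exceed the threshold are transmitted even while the threshold is high. This is the behavior that yields less choppiness than a suppressor, at the price of waveform distortion.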
Echo canceler techniques have also been applied to the loudspeaker-to-microphone acoustical coupling path. It has generally been difficult or impossible to achieve any useful degree of cancellation outside the laboratory, though, especially if a multiple gated microphone system is used. In practice, however, useful echo reduction is achieved from an echo canceler device through the action of the canceler's center clipper, albeit with the tradeoff of high intermittent distortion, particularly during double-talk. If the canceler is combined with a gated microphone system, such distortions are absent during single-talk. Due to the digital signal processing employed, such cancelers are narrow-bandwidth devices with an upper frequency limit of typically 3 to 7 kHz.
A multi-band center clipper intended to reduce transhybrid coupling as a stand-alone device (not necessarily as part of an echo canceler device) is described by Mitchell and Berkley ("A Full-Duplex Echo Suppressor Using Center-Clipping", The Bell System Technical Journal, May-June 1971). Their device incorporates four to six independent adaptive center clippers operating in as many frequency bands covering the telephone bandwidth. Three sets of bandpass filters are used to (1) develop the multiple thresholds from receive signal measurement, (2) separate the transmit signal into the frequency bands before multiple center clipping of each band, and (3) filter the distortion products from each band following center clipping. The gain constant for each adaptive threshold is manually adjusted for the measured transhybrid coupling in each band. The authors report a virtual elimination of subjective echo provided the thresholds are adjusted for a specific transhybrid coupling or a worst-case coupling. Subjective distortion is reported to be virtually unnoticeable when the thresholds are adjusted for transhybrid coupling attenuations of 15 dB or greater, but becomes increasingly noticeable as the thresholds are adjusted for lower transhybrid coupling attenuations. Additionally, a brief mention is made of the multi-band center clipper's experimental application to "acoustical echo generated in an idealized 4-wire speakerphone", but no results are reported.
None of the prior art discussed above discloses an adequate means of subjectively eliminating returned echo in a teleconference system while not substantially introducing choppiness, adding perceived distortion, or restricting bandwidth.