The invention relates generally to echo cancellers and relates, in particular, to echo cancellers having a wide frequency range.
Echo cancellers are well known in the telephone and communication industries for cancelling echoes created at an electronic junction between receiving and transmitting lines on one side and another set of communication lines on the other side. In practice, the transmitting line is incompletely isolated from the receiving line so that part of the received signal is unintentionally coupled into the transmitting line. The result is perceived as an echo of the received signal on the transmitting line. An example of an echo canceller is an adaptive finite impulse response filter (AFIRF) described by the present inventor in a technical article entitled "Echo Canceller with Adaptive Transversal Filter Utilizing Pseudo-logarithmic Coding" appearing in Comsat Technical Review, Volume 7, Fall 1977 at pages 393-428. The AFIRF is also disclosed in U.S. Pat. No. 4,064,379 to the present inventor.
The AFIRF is a digital filter and is represented schematically in FIG. 1. The signals are received on a receive line 10 and transmitted on a send line 12. A hybrid 14 both receives and transmits signals and, if operating perfectly, would completely separate the two signals. However, typically there exists an echo path from the receive line 10 to the send output 16 of the hybrid 14. In order to cancel this echo, an AFIRF 18 is coupled between the receive ends and the lines 10 and 12. An analog-to-digital converter 20 samples the receive line at a rate R, converts the analog signal to digital form, delays it in a delay line 21, and sends the corresponding digital signal through a switch 22 to a shift register 24, called the X register. The delay line 21 emulates the delay of the echo path. The X register contains n+1 shift locations. The X register 24 operates at a rate of nR and its output is fed back to its input through the switch 22. The switch 22 switches between its two positions so that the one sample of oldest information in the X register 24 is shifted out and lost on each complete recirculation and is replaced by a sample of freshly received data from the A/D converter 20.
Another n+1-long shift register 26, called the H register, contains a digital representation of the response of the echo path to an impulse signal. The H register 26 is also operated at the rate of nR and its output is fed back through circuitry, to be described later, to its input. The outputs of the X register 24 and H register 26 are multiplied together in a multiplier 28. The outputs of the multiplier 28 are summed by an accumulator 29 over each recirculation or sampling period and then converted to an analog form. As a result, the signal on the receive line 10 is convolved with the impulse response of the echo path to produce a predicted echo on the output of the accumulator 29. This predicted echo on the output of the accumulator 29 is connected to an inverting input of an adder 30. The output 16 of the hybrid 14 is connected to a non-inverting input of the adder 30. As a result, the predicted echo is subtracted from the transmitted signal and the output of the adder 30 should contain an echo-free transmitted signal.
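The multiply-and-accumulate operation described above can be illustrated by a minimal software sketch. It is not the hardware of FIG. 1: the function names `predicted_echo` and `cancel_echo` are hypothetical, the registers are modelled as plain Python lists, and the delay line 21 and the analog conversions are omitted for brevity.

```python
def predicted_echo(x_register, h_register):
    """Sum of products of stored receive samples and echo-path coefficients.

    Plays the role of the multiplier 28 and accumulator 29 over one
    recirculation of the X and H registers.
    """
    if len(x_register) != len(h_register):
        raise ValueError("X and H registers must have the same length")
    return sum(x * h for x, h in zip(x_register, h_register))


def cancel_echo(receive, hybrid_out, h_register):
    """Subtract the predicted echo from each hybrid output sample.

    receive    -- samples taken from the receive line
    hybrid_out -- samples taken from the send output of the hybrid
    h_register -- assumed (already known) impulse response of the echo path
    Returns the sequence that would appear at the output of the adder.
    """
    x_register = [0.0] * len(h_register)  # newest sample first
    residual = []
    for r, s in zip(receive, hybrid_out):
        # Shift in the fresh sample; the oldest sample is dropped.
        x_register = [r] + x_register[:-1]
        residual.append(s - predicted_echo(x_register, h_register))
    return residual
```

When the contents of the H register match the true echo path and nothing else is present at the hybrid output, the residual returned by `cancel_echo` is zero for every sample.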
If it is assumed for the purposes of this present discussion that no signal is being intentionally transmitted, then the output of the adder 30 should be zero. A non-zero output of the adder 30 when no signal is being intentionally transmitted indicates that the H register 26 contains an improper impulse response of the echo path. The output of the adder 30 is led to a correction processor 32 as an error signal which the correction processor 32 uses to correct the values of the coefficients being stored in the H register 26. Another stage 34, actually a part of the register 26, is used for the correct recirculation timing in the H register 26. An optional center-clipper 36 can be used to attenuate the low level components of the output of the adder 30 in order to suppress a residual echo that has not been otherwise cancelled. The center-clipper 36 does not significantly affect the higher level signals intended for transmission.
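The correction loop and the center-clipper can be sketched in the same way. The passage above does not state which rule the correction processor 32 applies, so the normalized least-mean-squares (NLMS) update below is a stand-in chosen only for illustration. The clipping characteristic (zero below a threshold, unchanged above it) is likewise an assumption, and all function names and the step size are hypothetical.

```python
import random


def white_noise(count, seed=0):
    """Reproducible test signal standing in for the receive line."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(count)]


def nlms_update(h, x, error, step=0.5, eps=1e-9):
    """Nudge the coefficients so as to reduce the error (assumed NLMS rule)."""
    power = sum(v * v for v in x) + eps  # eps avoids division by zero
    gain = step * error / power
    return [hi + gain * xi for hi, xi in zip(h, x)]


def adapt(receive, hybrid_out, taps, step=0.5):
    """Estimate the echo-path response while no near-end signal is sent."""
    h = [0.0] * taps  # H register contents
    x = [0.0] * taps  # X register contents, newest sample first
    for r, s in zip(receive, hybrid_out):
        x = [r] + x[:-1]
        # Adder output: hybrid output minus the predicted echo.
        error = s - sum(hi * xi for hi, xi in zip(h, x))
        h = nlms_update(h, x, error, step)
    return h


def center_clip(sample, threshold):
    """Zero low-level samples; pass higher-level samples unchanged."""
    return 0.0 if abs(sample) < threshold else sample
```

With an echo-only signal at the hybrid output, `adapt` drives the error toward zero, which corresponds to the condition described above in which the adder output vanishes once the H register holds the proper impulse response.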
The inventor has described in a technical article entitled "Cancellation of Acoustic Feedback" appearing in Comsat Technical Review, Volume 12, Fall 1982, at pages 319-333, the use of an echo canceller, such as that shown in FIG. 1, to cancel echoes in a room. Room echoes become a particular problem for teleconferencing in which, as illustrated in FIG. 2, a loudspeaker 40 and a microphone 42 are positioned within a room 44. An incoming signal is broadcast by the loudspeaker 40 so that all the people 46, 48 and 50 in the room 44 can hear. Any one of the participants 46-50 in the teleconference can talk. A microphone amplifier 52 connected to the microphone 42 is given sufficient gain so that an audible signal is transmitted regardless of the position of the speaker relative to the microphone 42. The problem is that the microphone 42 also picks up the signal from the loudspeaker 40, resulting in an echo.
The echo can be cancelled by an echo canceller 56 of the same form as that shown in FIG. 1. However, as described in the cited reference, an echo canceller used for an acoustic echo in a room should have different parameters from that used for echo cancelling in a normal telephone line. A teleconferencing network is expected to have a greater bandwidth than a normal telephone line, e.g., 5.2 kHz versus 3.3 kHz. Although the average speech level above 5.5 kHz is more than 26 dB below that at 750 Hz, these higher frequencies contribute substantially to the intelligibility and clarity of the voice signal. Thus it is desirable to extend the channel bandwidth even further to 7.5 kHz for a "commentator quality" channel. Also, the typical reverberation time for a room 44, that is, the time for a sound within a room to damp out, is considerably longer than the time constants associated with telephone sets and other electronic equipment. Room reverberation times are high because of the low sound velocity, which is approximately 1 foot per millisecond. The echo canceller 56 must be able to adequately emulate the response of the echo path over both its bandwidth and reverberation time.
A processing window is defined by the maximum length of the impulse response that an echo canceller can emulate. A 15 kHz sampling rate R is the minimum or Nyquist rate for a 7.5 kHz channel. The processing window is given by n/R where n+1 is the length of the X and H registers 24 and 26. With present technology and available hardware, the number n of coefficients or the length of the registers 24 and 26 has a practical limit of approximately 1000 because of the so-called processing noise (i.e., accumulation of round-off, quantization and sampling errors). This limitation is analyzed by Campanella in a technical article entitled "Analysis of an Adaptive Impulse Response Echo Canceller", appearing in Comsat Technical Review, Volume 2, Spring 1972, at pages 1-38. This limitation is not raised when using series-parallel organization for the convolution processor as described in U.S. Pat. No. 4,377,793. With a value of n=1000, the maximum processing window for a commentator quality channel is 66.5 ms. The reverberation time of a room is usually measured in terms of the time T.sub.60 that is required for a signal to damp out by 60 dB. For 20 dB cancellation of the echo, the processing window must be at least one third of T.sub.60. This means that the reverberation time T.sub.60 can be no larger than 220 ms. This condition is often not satisfied for typical rooms.
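The relationships among bandwidth, sampling rate, processing window and reverberation time can be checked with a short calculation. The sketch below assumes that the echo decays uniformly in dB over time, so that a decay of 20 dB takes one third of T.sub.60, and the function names are hypothetical. Straightforward arithmetic with n=1000 and R=15 kHz gives a window of about 66.7 ms and a limit on T.sub.60 of about 200 ms, which is close to, though not identical with, the figures quoted above.

```python
def nyquist_rate_hz(bandwidth_hz):
    """Minimum sampling rate R for a channel of the given bandwidth."""
    return 2.0 * bandwidth_hz


def processing_window_ms(n_coefficients, sample_rate_hz):
    """Processing window n/R, expressed in milliseconds."""
    return 1000.0 * n_coefficients / sample_rate_hz


def max_t60_ms(window_ms, cancellation_db):
    """Largest T60 for which the window still covers the required decay.

    Assumes a uniform decay in dB, so a decay of cancellation_db takes
    T60 * cancellation_db / 60.
    """
    return window_ms * 60.0 / cancellation_db


if __name__ == "__main__":
    rate = nyquist_rate_hz(7500.0)               # 7.5 kHz commentator channel
    window = processing_window_ms(1000, rate)    # n = 1000 coefficients
    print(f"R = {rate:.0f} Hz, window = {window:.1f} ms")
    print(f"T60 limit for 20 dB = {max_t60_ms(window, 20.0):.0f} ms")
```

The same functions show the trade-off between bandwidth and window length: halving the sampling rate for a fixed n doubles the processing window.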
Thus with prior art techniques and practical technology, either bandwidth or complete echo cancellation must be sacrificed. One approach for obtaining acoustic echo cancellation of a commentator quality channel is to rely upon the power density spectrum of typical speech, which is illustrated in FIG. 3. Curve 60 gives the human threshold of audibility for continuous spectra sound. Three generally parallel curves 62, 64 and 66 give the levels of speech minimums, the average level of speech and the level of speech peaks, respectively, as a function of frequency. Thus it is seen that higher frequency components of speech are at a relatively low level. Nonetheless, they must be retained and some form of echo cancellation provided. The technique that relies upon the human speech spectrum provides digital echo cancelling only for the lower frequency components, that is, the sampling rate is relatively low so that 1000 coefficients provide a relatively long processing window. As a result, the high frequency components of the echo are incompletely cancelled. However, FIG. 3 indicates that the high frequency echo is of relatively low level. The center-clipper 36, shown in FIG. 1, is used to remove the residual echo, particularly the high frequency echo which the digital AFIRF 18 does not remove. In order to remove the distortion that the center-clipper 36 would introduce into transmitted speech, a double-talk detector is used to detect when a signal is being intentionally transmitted on the send line 16. The double-talk detector linearizes the response of the center-clipper 36. While a linearized center-clipper 36 does not attenuate the high frequency components of simultaneous echo, these high frequency components are of sufficiently low level compared to the transmit signal that they are not noticeable. That is, the high frequency components of the echo are masked by the transmitted signal.
It should be noted that Brooks in two U.S. Pat. Nos. 3,941,948 and 3,946,170 discloses systems involving center-clipping of one of two bands of a transmission signal and that Bendel in U.S. Pat. No. 3,900,708 discloses separately suppressing echoes in different bands. However, the present invention differs in that the more accurate but complex echo cancelling is performed on a low frequency band while echo suppression is performed on the high frequency band, which typically has a substantially lower signal level.
A further problem with a full-bandwidth AFIRF is that the levels of the high frequency components are so small that they approach the quantization noise of eight-bit processing, whether A-law or μ-law quantization of the analog signal is used. The spectral distribution of the speech can effectively be changed by pre-emphasis or de-emphasis, which improves the quantization resolution at the higher frequencies. However, these techniques, besides conflicting with the use of a center-clipper, cannot completely solve the problem of limited precision of quantization at speech frequencies above 3.5 kHz.