1. Field of the Invention
The present invention relates to echo cancellation in telecommunication systems. More particularly, the present invention relates to a novel apparatus and method for echo cancellation that incorporates a psychoacoustic masking effect.
2. Description of the Related Art
An increasingly popular form of telecommunication is one that allows for "hands-free" operation, such as a speakerphone or teleconferencing equipment. However, these types of telecommunication equipment are susceptible to interference, particularly echoes.
In general, there are two types of echoes, electrical and acoustical. An electrical echo is generated when a portion of an electrical signal that represents acoustical information is reflected and returned to its source because of impedance mismatch or discontinuity between a signal source and a transmission line. A typical example of an electrical echo is one that occurs between a four-wire telecommunication circuit, which has two wires for each of a loudspeaker and a microphone, and a two-wire transmission cable. Impedance mismatch often occurs where the two-wire cable connects to the four-wire unit.
An acoustic echo, meanwhile, is generated largely due to the close proximity of a loudspeaker and a microphone of a telecommunication system, as in a speakerphone. In general, acoustic echoes are created when an audio signal from a far-end user of the system is broadcast by the loudspeaker and picked up by the microphone, either directly or indirectly through acoustic reflections off the walls of the room in which the loudspeaker and microphone are located. Echoes of the audio signal of the far-end user are combined with an audio signal of a near-end user of the system, also picked up by the microphone, and returned to the far-end user. Echoes of the audio signal of the near-end user may similarly be created at the far end of the system and returned to the near-end user.
Echoes are annoying to users of the telecommunication system because reverberations of a previously uttered phrase arrive as a new phrase is being uttered. Therefore, echo cancelers are used to remove the echoes or reduce them to an acceptable level. A common type of echo canceler is a subband echo canceler that divides a wideband signal into discrete subbands.
A subband echo canceler removes or reduces echoes in each subband and generally includes a plurality of compensators, also known as adaptive filters, or variable coefficient filters, wherein one or more of the plurality of compensators may be allocated to each of the subbands. The compensators or filters may be digital finite impulse response filters and the filter coefficients are generally updated for each subband by a number of known algorithms, such as the normalized least mean square ("NLMS") algorithm. The plurality of compensators generate a plurality of audio signals, illustratively as artificial acoustic echoes, such that at least one artificial acoustic echo is generated for each of the subbands to cancel the echo therein. A synthesizer recombines the subbands, having echoes reduced or removed therefrom, into a wideband signal and the recombined signal is transmitted to another part of the system.
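The coefficient update mentioned above can be illustrated with a minimal sketch of an NLMS-adapted finite impulse response compensator for a single subband. The function names, the step size mu, and the regularization constant eps are assumptions chosen for illustration and are not taken from any particular echo canceler; real-valued samples are assumed for simplicity, although subband signals are often complex-valued in practice.

```python
def nlms_step(weights, history, mic_sample, mu=0.5, eps=1e-8):
    """One NLMS update for a single subband (illustrative sketch).

    weights    : current FIR coefficients, one per tap
    history    : most recent far-end samples, newest first, same length as weights
    mic_sample : microphone sample in the same subband
    Returns (error, new_weights); error is the echo-compensated sample.
    """
    # Artificial echo: FIR filter output for the current far-end history.
    echo_estimate = sum(w * x for w, x in zip(weights, history))
    # Residual after removing the artificial echo from the microphone sample.
    error = mic_sample - echo_estimate
    # Normalize the step by the input energy; eps guards against division by zero.
    energy = sum(x * x for x in history) + eps
    new_weights = [w + mu * error * x / energy for w, x in zip(weights, history)]
    return error, new_weights


def run_nlms(far_end, mic, num_taps, mu=0.5):
    """Adapt one subband compensator over a whole signal (hypothetical helper)."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps
    errors = []
    for x, d in zip(far_end, mic):
        # Shift the newest far-end sample into the tap delay line.
        history = [x] + history[:-1]
        e, weights = nlms_step(weights, history, d, mu)
        errors.append(e)
    return errors, weights
```

In a subband echo canceler, one such filter would typically run independently in each subband, with the number of taps chosen per subband.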
FIG. 1 shows a block diagram of a conventional subband acoustic echo canceler 1, which includes an input line 22, a first analyzer 13, a plurality of compensators 16-1, 16-2 . . . 16-n, a second analyzer 14, a plurality of subtractors 18-1, 18-2 . . . 18-n, a synthesizer 19, and an output line 24. A first audio signal from a far-end user is transmitted through a transmission line 20 and broadcast through a loudspeaker 10 of a near-end user receiving the transmission. The first audio signal enters a microphone 12, either directly or indirectly with various time delays from reflecting off walls (not shown) in a room where loudspeaker 10 and microphone 12 are located. Input line 22 is connected to transmission line 20 and transmits the first audio signal to first analyzer 13, which divides the first audio signal into a predetermined number of subbands. First analyzer 13 is connected to the plurality of compensators 16-1, 16-2 . . . 16-n, where n is equal to the number of subbands. Each of the compensators generates an artificial echo in each subband based on a weighted transfer function therein and the corresponding subband signal of the first audio signal. Factors that contribute to the "weights" of the transfer function include the number of subbands, known gross characteristics of acoustic impulse response functions, a number of taps allocated for each subband, and a masking effect of a psychoacoustic model. Each of the plurality of compensators is connected to a corresponding one of the plurality of subtractors 18-1, 18-2 . . . 18-n, where n is equal to the number of subbands.
Microphone 12 also picks up a second audio signal, which includes echoes of the first audio signal and audio signals from the near-end user.
The second audio signal is transmitted to second analyzer 14, which is coupled to the plurality of subtractors 18-1, 18-2 . . . 18-n. Analyzer 14 divides the second audio signal into n subbands in the frequency domain according to predetermined subband intervals for echo canceler 1. Each second audio signal subband is provided to one of the plurality of subtractors, and the subband is subtracted from one of the artificial echoes generated by the plurality of compensators 16-1, 16-2 . . . 16-n at the corresponding subband frequency. Outputs of the plurality of subtractors, preferably having echoes of the first audio signal removed or greatly reduced, are also known as compensated subband signals. They are provided to synthesizer 19, which recombines the compensated subband signals of the second audio signal. The recombined second audio signal is then transmitted to the far-end user through output line 24.
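The analyze, subtract, and synthesize signal flow of FIG. 1 can be sketched as follows. This is a toy illustration only: it assumes a two-band Haar split as a stand-in for analyzers 13 and 14 and synthesizer 19, and it models each compensator as a single fixed gain rather than an adaptive filter. All function names are hypothetical. The sketch takes the microphone subband minus the artificial echo; the opposite order differs only in the sign of the output.

```python
def analyze(signal):
    """Split an even-length signal into two subbands (Haar low and high)."""
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return [low, high]


def synthesize(subbands):
    """Recombine the two Haar subbands into a wideband signal."""
    low, high = subbands
    out = []
    for l, h in zip(low, high):
        out.extend([l + h, l - h])
    return out


def cancel_echo(far_end, mic, gains):
    """Remove a per-subband artificial echo from the microphone signal."""
    far_bands = analyze(far_end)   # role of the first analyzer
    mic_bands = analyze(mic)       # role of the second analyzer
    compensated = []
    for far_band, mic_band, g in zip(far_bands, mic_bands, gains):
        # Compensator: artificial echo for this subband (fixed gain here).
        artificial_echo = [g * x for x in far_band]
        # Subtractor: compensated subband signal.
        compensated.append([m - e for m, e in zip(mic_band, artificial_echo)])
    # Synthesizer: recombine the compensated subbands.
    return synthesize(compensated)
```

For example, if the microphone signal is the near-end signal plus the far-end signal scaled by 0.5, calling cancel_echo with gains of 0.5 in both subbands returns the near-end signal alone, up to floating-point rounding.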
The effectiveness of an echo canceler may be improved by increasing the number of subbands into which a signal is divided. Alternatively, there have been various attempts to incorporate the psychoacoustic masking effect into the considerations for updating compensator coefficients. The psychoacoustic masking effect is based on a mathematical model of the masking behavior of the human auditory system, wherein an audio signal is imperceptible to human ears when its signal level is below a variable threshold level. This threshold level is frequency dependent and varies according to the presence or absence of other audio signals. In other words, an audio signal may be perceptible in one noise environment and imperceptible in another. This phenomenon has been incorporated in many audio and speech compression techniques, including the MPEG-2 standard.
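The threshold behavior described above can be illustrated with a deliberately simplified sketch. It assumes that the threshold at a given frequency is the larger of a threshold in quiet and each masker level reduced by a fixed offset. The function names, the 10 dB offset, and the levels used in the example are placeholders chosen for illustration, not values from any psychoacoustic model or standard.

```python
def masked_threshold_db(quiet_threshold_db, masker_levels_db, masking_offset_db=10.0):
    """Threshold level at one frequency, given the maskers present (toy model)."""
    # Each masker raises the threshold to its own level minus an assumed offset.
    candidates = [quiet_threshold_db] + [m - masking_offset_db for m in masker_levels_db]
    return max(candidates)


def is_perceptible(signal_level_db, quiet_threshold_db, masker_levels_db,
                   masking_offset_db=10.0):
    """True when the signal level exceeds the masked threshold."""
    threshold = masked_threshold_db(quiet_threshold_db, masker_levels_db,
                                    masking_offset_db)
    return signal_level_db > threshold
```

With these placeholder numbers, a 30 dB component against a 20 dB threshold in quiet is perceptible when no masker is present, but becomes imperceptible when a 50 dB masker raises the threshold to 40 dB.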
Diethorn describes the incorporation of the psychoacoustic masking effect in U.S. Pat. No. 5,548,642, entitled "Optimization of Adaptive Filter Tap Settings for Subband Acoustic Echo Cancelers in Teleconferencing" and issued on Aug. 20, 1996. Diethorn describes a subband acoustic echo canceler that incorporates indicia of human perceptual phenomena, including predetermined speech power spectra for male and female speakers, into an adaptive filter tap allocation table. The number of taps is fixed, but the allocation of taps among the subbands is "weighted", depending on factors such as the acoustic impulse response of the room in which the echoes are generated and weighting adjustments based on one or more measures of perceived human acoustic sensitivity.
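The idea of distributing a fixed number of taps among subbands according to weights can be sketched as below. This is not Diethorn's allocation table or procedure. It is a generic proportional split using the largest-remainder method, and the function name and the example weights are hypothetical.

```python
def allocate_taps(total_taps, weights):
    """Split a fixed tap budget among subbands in proportion to their weights."""
    total_weight = sum(weights)
    # Ideal (fractional) share of the tap budget for each subband.
    shares = [total_taps * w / total_weight for w in weights]
    taps = [int(s) for s in shares]
    remainders = [s - t for s, t in zip(shares, taps)]
    # Hand out the leftover taps to the subbands with the largest remainders;
    # ties go to the lower-numbered subband because the sort is stable.
    leftover = total_taps - sum(taps)
    order = sorted(range(len(weights)), key=lambda i: -remainders[i])
    for i in order[:leftover]:
        taps[i] += 1
    return taps
```

For instance, a budget of 10 taps with weights 3, 1, and 1 yields 6, 2, and 2 taps, so the total stays fixed while the more heavily weighted subband receives a longer filter.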