Increasing interest in communication media, such as the Internet, electronic presentations, voice mail, and audio-conference communication systems, is increasing the demand for high-fidelity audio and communication technologies. Currently, individuals and businesses are using these communication media to increase efficiency and productivity, while decreasing cost and complexity. For example, audio-conference communication systems allow one or more individuals at a first location to simultaneously converse with one or more individuals at other locations through full-duplex communication lines in real time, without wearing headsets or using handheld communication devices. Typically, audio-conference communication systems include a number of microphones and loudspeakers, at each location, that can be used by multiple individuals for sending and receiving audio signals to and from other locations.
In many audio-conference communication systems, audio signals carry a large amount of data, and employ a broad range of frequencies. Modern audio-conference communication systems attempt to provide clear transmission of audio signals, free from perceivable distortion, background noise, and other undesired audio artifacts. One common type of undesired audio artifact is an acoustic echo. Acoustic echoes can occur when a transmitted audio signal loops through an audio-conference communication system due to the coupling of microphones and speakers at a location.
FIG. 1 shows a schematic diagram of an exemplary, two-location, single channel communication system 100. The communication system 100 includes a near room 102 and a far room 104. Sounds, such as voices, produced in the near room 102 are detected by a microphone 106, and sounds produced in the far room 104 are detected by a microphone 108. The microphones 106 and 108 are transducers that convert the sounds into continuous analog signals that are represented by y(t) and x(t), respectively, where t is time. The microphone 106 can detect many different sounds produced in the near room 102, including sounds output by the loudspeaker 114. An analog return signal produced by the microphone 106 is represented by:
y(t) = s(t) + e(x(t)) + v(t)
where
s(t) is an analog signal representing sounds produced in the near room 102,
v(t) is an analog signal representing noise, or extraneous signals created by disturbances in the microphone or communication channel 110, that, for example, may produce an annoying buzzing sound output from the loudspeaker 116, and
e(x(t)) is an analog signal that represents an acoustic echo.
The acoustic echo e(x(t)) is due to both acoustic propagation delay in the near room 102 and a round-trip transmission delay of the analog sent signal x(t) over the communication channels 110 and 112. Sounds represented by the analog signal y(t) are output from loudspeaker 116 in the far room 104. Depending on the amplification, or gain, in the amplitude of the signal y(t) and the magnitude of the acoustic echo e(x(t)), a person speaking into the microphone 108 in the far room 104 may also hear an annoying, high-pitched, howling sound emanating from loudspeaker 116 as a result of the sound generated by the acoustic echo e(x(t)).
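The return-signal model above can be illustrated in discrete time. The following is a minimal sketch, assuming that the echo e(x(t)) can be modeled as the sent signal convolved with a short, delayed and attenuated room impulse response. The names convolve, return_signal, and echo_path, and all sample values, are hypothetical and chosen only for illustration.

```python
def convolve(h, x):
    """Return the first len(x) samples of the convolution of h with x."""
    out = [0.0] * len(x)
    for n in range(len(x)):
        for k in range(len(h)):
            if n - k >= 0:
                out[n] += h[k] * x[n - k]
    return out


def return_signal(s, x, v, echo_path):
    """Sampled form of y = s + e(x) + v, with the echo modeled as echo_path * x."""
    echo = convolve(echo_path, x)
    return [s_n + e_n + v_n for s_n, e_n, v_n in zip(s, echo, v)]


# Hypothetical sample values (assumptions, not taken from the text).
x = [1.0, 0.0, 0.0, 0.0, 0.0]   # sent signal from the far room: a unit impulse
s = [0.0, 0.0, 0.5, 0.0, 0.0]   # sound produced in the near room
v = [0.0, 0.0, 0.0, 0.0, 0.0]   # noise, set to zero for clarity
echo_path = [0.0, 0.6, 0.3]     # assumed delayed, attenuated echo path

y = return_signal(s, x, v, echo_path)
print(y)
```

In this sketch the far-room impulse reappears in y one and two samples later, overlapping the near-room sound, which is the component an echo canceller would need to remove.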
In recent years there has been an increasing interest in developing multichannel audio communication systems in an effort to enhance the audio-conference experience. Multichannel systems employ a plurality of microphones and loudspeakers in the near and far rooms, creating a plurality of acoustic echoes, which can be a significant obstacle to effectively deploying multichannel audio-conference communication systems. Methods for cancelling these echoes typically send excitation signals to the loudspeakers in order to obtain impulse response estimates for the room. These impulse responses are convolved with the sent signals to produce approximate acoustic echoes that are subtracted from the return signals. A significant challenge in these multichannel systems is the temporal correlation of the excitation signals sent to the loudspeakers to approximate the echo paths. The challenge often manifests itself as an ill-conditioned search for echo path approximations. This is called the “non-uniqueness problem,” and it can result in unstable control algorithms.
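The non-uniqueness problem can be illustrated with a small two-loudspeaker example. The following is a minimal sketch, assuming the extreme case in which both loudspeakers receive the same excitation signal. The helper names (convolve, total_echo, residual) and all path and signal values are hypothetical. It shows that an incorrect pair of echo path estimates can cancel the echo exactly while the excitations are correlated, and then fail once the excitations change.

```python
def convolve(h, x):
    """Return the first len(x) samples of the convolution of h with x."""
    out = [0.0] * len(x)
    for n in range(len(x)):
        for k in range(len(h)):
            if n - k >= 0:
                out[n] += h[k] * x[n - k]
    return out


def total_echo(paths, excitations):
    """Sum of each loudspeaker signal convolved with its echo path."""
    total = [0.0] * len(excitations[0])
    for h, x in zip(paths, excitations):
        for n, value in enumerate(convolve(h, x)):
            total[n] += value
    return total


def residual(true_paths, estimated_paths, excitations):
    """Largest echo sample left after subtracting the approximate echo."""
    true_echo = total_echo(true_paths, excitations)
    approx_echo = total_echo(estimated_paths, excitations)
    return max(abs(a - b) for a, b in zip(true_echo, approx_echo))


# Hypothetical echo paths from two loudspeakers to one microphone.
true_paths = [[0.5, 0.25], [0.25, 0.125]]
# A wrong estimate whose paths merely sum to the same total response.
wrong_paths = [[0.75, 0.375], [0.0, 0.0]]

x1 = [1.0, -1.0, 2.0, 0.5, -0.5, 1.5]
x2 = [0.5, 2.0, -1.0, 1.0, 0.0, -2.0]

# Fully correlated excitations: the wrong estimate cancels the echo anyway.
residual_correlated = residual(true_paths, wrong_paths, [x1, x1])
# Once the excitations differ, the same estimate leaves residual echo.
residual_changed = residual(true_paths, wrong_paths, [x1, x2])

print(residual_correlated, residual_changed)
```

Because many different path estimates explain the correlated data equally well, a search driven only by the residual has no unique solution to converge to, which is one way to read the ill-conditioning described above.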
A variety of algorithms have been developed to address the non-uniqueness problem. For example, designers and manufacturers have developed methods that employ nonlinear or time-variant functions to decorrelate excitation signals prior to exciting the loudspeakers. However, these methods often lead to distortions of temporal attributes of the audio signals that ultimately diminish the spatial audio experience. Other methods attempt to approximate the space of echo paths by a finite number of set-theoretic constraints. In general, these methods do not distort the excitation signals, but they do not resolve the non-uniqueness problem, and as a result, these methods are slower to converge and have higher levels of residual echoes.
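As an illustration of the nonlinear approach, the following is a minimal sketch using a half-wave rectifier nonlinearity, which is one choice discussed in the echo-cancellation literature. The text does not say which function is used, so the nonlinearity, the function names half_wave and correlation, and the value of alpha are all assumptions. Two identical channels are processed with opposite-polarity rectifiers, and their correlation coefficient is compared before and after.

```python
import math


def half_wave(x, alpha, positive=True):
    """Add a scaled half-wave rectified copy of each sample to itself."""
    out = []
    for sample in x:
        if positive:
            rectified = (sample + abs(sample)) / 2.0   # keeps positive part
        else:
            rectified = (sample - abs(sample)) / 2.0   # keeps negative part
        out.append(sample + alpha * rectified)
    return out


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical test signal: the same sine wave sent to both loudspeakers.
x = [math.sin(2.0 * math.pi * n / 16.0) for n in range(64)]
alpha = 0.5  # assumed strength, exaggerated here to make the effect visible

left = half_wave(x, alpha, positive=True)
right = half_wave(x, alpha, positive=False)

before = correlation(x, x)
after = correlation(left, right)
print(before, after)
```

The processed channels are no longer perfectly correlated, but neither matches the original waveform, which reflects the trade-off noted above: the decorrelation is obtained by distorting the audio signals.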
Although there have been a number of advances in multichannel communications in recent years, designers, manufacturers, and users of multichannel audio-conference communication systems continue to seek enhancements that reliably remove acoustic echoes from audio signals in real time and rapidly adapt to the changing conditions at audio-signal-receiving locations.