The present invention relates to acoustic echo cancelling technology for a telephone conference system or a TV conference system provided with speakers and microphones.
Telephone conference systems and TV conference systems, provided with speakers and microphones at both sites and connected over a network, allow persons at remote locations to converse by voice. Such a system has the problem that the voice output from the speaker is picked up by the microphone. Conventionally, therefore, the speaker output sound picked up by the microphone (acoustic echo) has been removed using acoustic echo canceller technology. In the case where the acoustic environment of a conference room does not change, the acoustic echo can be removed completely by learning the sound transmission path (impulse response) in the space only once at the start and then using this impulse response. However, when conference participants change their seats or the like, the acoustic paths of the acoustic echo vary, resulting in a mismatch between the learned impulse response and the actual impulse response, and thus complete removal of the acoustic echo becomes impossible. In the worst case, the residual echo circulates repeatedly between the two sites and gradually increases in sound level, generating howling, so that conversation becomes completely impossible.
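The effect of a path mismatch can be illustrated with a minimal sketch. It assumes a white-noise far-end signal and two short hypothetical impulse responses (`path_at_start`, `path_after_move`); all names and values are illustrative and are not taken from any cited system. The impulse response is learned once and then frozen, and the residual echo power is compared before and after the room path changes.

```python
import random

def convolve(signal, impulse_response):
    """Echo reaching the microphone: speaker signal filtered by the room path."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(impulse_response):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out

def residual_power(mic, far_end, learned_ir):
    """Mean power left after subtracting the echo predicted from learned_ir."""
    predicted = convolve(far_end, learned_ir)
    return sum((m - p) ** 2 for m, p in zip(mic, predicted)) / len(mic)

rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(500)]

# Hypothetical room paths (assumptions for illustration only).
path_at_start = [0.6, 0.3, 0.1]     # path when the one-time learning took place
path_after_move = [0.2, 0.5, 0.25]  # path after participants change seats

learned_ir = list(path_at_start)    # learned once at the start, then frozen

matched = residual_power(convolve(far_end, path_at_start), far_end, learned_ir)
mismatched = residual_power(convolve(far_end, path_after_move), far_end, learned_ir)
```

In this sketch `matched` is zero because the frozen impulse response equals the actual path, whereas `mismatched` is clearly non-zero, which corresponds to the residual echo described above.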
Consequently, a method has been proposed that aims to remove the acoustic echo always and completely by sequentially learning the impulse response so as to follow variations of the acoustic paths (for example, see Peter Heitkamper, “An Adaptation Control for Acoustic Echo Cancellers”, IEEE Signal Processing Letters, Vol. 4, No. 6, 1997/6).
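Sequential learning of the impulse response can be sketched with a normalized least-mean-squares (NLMS) adaptive filter. NLMS is used here only as a common, generic example of sequential learning; it is not claimed to be the adaptation control of the cited reference. The filter length, step size, signals, and room paths below are assumptions chosen for illustration, and no near-end speech or noise is present.

```python
import random

def nlms_echo_canceller(far_end, mic, taps=4, step=0.5, eps=1e-8):
    """Update the echo-path estimate at every sample and return the residual."""
    weights = [0.0] * taps   # current estimate of the impulse response
    history = [0.0] * taps   # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        e = d - estimate     # microphone signal minus predicted echo
        norm = sum(h * h for h in history) + eps
        weights = [w + step * e * h / norm for w, h in zip(weights, history)]
        residual.append(e)
    return residual, weights

def room_echo(far_end, path_before, path_after, change_at):
    """Simulated microphone signal; the room path switches at sample change_at."""
    out = []
    for n in range(len(far_end)):
        path = path_before if n < change_at else path_after
        out.append(sum(h * far_end[n - k] for k, h in enumerate(path) if n - k >= 0))
    return out

def mean_power(samples):
    return sum(s * s for s in samples) / len(samples)

rng = random.Random(1)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
# Hypothetical paths before and after participants change seats.
mic = room_echo(far_end, [0.6, 0.3, 0.1], [0.2, 0.5, 0.25], change_at=2000)

residual, weights = nlms_echo_canceller(far_end, mic)
just_after_change = mean_power(residual[2000:2020])
after_relearning = mean_power(residual[3900:])
```

Under these idealized conditions the residual rises when the path changes and then decays again as the filter re-learns the new path. With simultaneous near-end speech the update would be disturbed, which is why adaptation control is needed in practice.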
In addition, a method for eliminating acoustic echo using a microphone array has been proposed (for example, see JP-A-2005-136701). In the conventional technology, because the performance of the echo canceller is insufficient, when a near-end talker and a far-end talker speak at the same time, howling is prevented by setting a one-way communication state in which the voice of the talker with the lower sound level is completely shut out. However, this one-way communication makes conversation difficult.
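The one-way communication state described above can be sketched as a level-based switch that works frame by frame. The function names, the frame-power measure, and the tie-breaking rule in favor of the near end are assumptions for illustration, not details of any cited system.

```python
def frame_power(frame):
    """Mean power of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)

def half_duplex_switch(near_frame, far_frame):
    """Pass the louder talker's frame and completely mute the quieter one."""
    if frame_power(near_frame) >= frame_power(far_frame):  # tie goes to near end (assumption)
        return near_frame, [0.0] * len(far_frame)
    return [0.0] * len(near_frame), far_frame

near = [0.5, -0.4, 0.6, -0.5]    # louder near-end talker
far = [0.1, -0.1, 0.05, -0.1]    # quieter far-end talker
near_out, far_out = half_duplex_switch(near, far)
```

Here the far-end frame is replaced entirely by silence, which breaks the echo loop but also removes the quieter talker's voice during simultaneous speech.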
Reference may further be made to R. O. Schmidt, “Multiple Emitter Location and Signal Parameter Estimation”, IEEE Trans. Antennas and Propagation, Vol. 34, No. 3, pp. 276 to 280, 1986; and Masahito Togami, Akio Amano, Hiroshi Shinjo, Ryota Kamoshida, Junichi Tamamoto, Saku Egawa, “Auditory Ability of Human Symbiosis Robots ‘EMIEW’”, JSAI Technical Report SIG-Challenge-0522-10(10/14), pp. 59 to 64, 2005.