1. Field of the Invention
The present invention relates to an echo canceller capable of solving the problem of an echo or howling that arises during communication in a sound reinforcement communication system, such as a hands-free cellular phone system or a video conference system, and to a communication audio processing apparatus using the same.
2. Description of the Related Art
Heretofore, in a sound reinforcement communication system such as a video conference system, a sound picked up by a microphone of a far end apparatus is sent to a near end apparatus and is then outputted from a speaker of the near end apparatus. The near end apparatus is also equipped with a microphone, so that a voice uttered by a talker on the near end side is sent to the far end apparatus. For this reason, the voices outputted from the speakers on the far end side and the near end side are inputted to the respective microphones. When no processing is executed for such a voice, the voice is sent back to the other party's apparatus. This causes a phenomenon called “an echo”, in which a voice uttered by a communication party is heard from that party's own speaker after a delay. When the echo (feedback component) becomes large, it is inputted to the microphone again and circulates through the system, thereby causing “howling”.
An echo canceller is known as a device for preventing the echo or howling described above. In general, in the echo canceller, an impulse response of a feedback path (echo path) formed by acoustic coupling or the like between the speaker and the microphone is estimated by using an adaptive filter. The impulse response is convolved with a received signal (reference signal) outputted from the speaker, thereby generating a pseudo-echo. The echo or howling is then removed by subtracting the resulting pseudo-echo from an audio signal picked up by the microphone.
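The pseudo-echo generation and subtraction described above can be read as a convolution of the estimated impulse response with the reference signal, followed by a sample-wise subtraction from the microphone signal. The following is only a minimal sketch under that reading. The function names `convolve` and `cancel_echo`, the echo path `h_true`, and all signal values are hypothetical and chosen purely for illustration, and the estimate is assumed to match the true echo path exactly.

```python
def convolve(h, x):
    """Return the first len(x) samples of the convolution of h with x."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for i, h_i in enumerate(h):
            if n - i >= 0:
                acc += h_i * x[n - i]
        y.append(acc)
    return y


def cancel_echo(mic, reference, h_est):
    """Subtract the pseudo-echo (h_est convolved with reference) from mic."""
    pseudo_echo = convolve(h_est, reference)
    return [m - e for m, e in zip(mic, pseudo_echo)]


# Hypothetical echo path and signals (assumptions, not taken from the source).
h_true = [0.0, 0.5, 0.25]                     # speaker-to-microphone impulse response
reference = [1.0, 0.0, -1.0, 0.5, 0.0, 0.0]   # received signal sent to the speaker
near_end = [0.1, 0.2, 0.0, -0.1, 0.3, 0.0]    # near end talker's voice

# The microphone picks up the near end voice plus the echo of the reference.
echo = convolve(h_true, reference)
mic = [s + e for s, e in zip(near_end, echo)]

# With a perfect estimate of the echo path, only the near end voice remains.
residual = cancel_echo(mic, reference, h_true)
```

In practice the estimate differs from the true echo path, so the output also contains whatever part of the echo the estimate fails to reproduce.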
The adaptive filter in the related art is composed of a processor having variable coefficients and an algorithm by which the coefficients are determined as needed. That is to say, in the adaptive filter, the variable filter coefficients are adaptively updated in accordance with an algorithm for minimizing the mean square value of an output signal from a subtracter, for example, a least-mean-square (LMS) algorithm. As a result, the echo component produced by the feedback path (the component of the received signal fed back through the feedback path) is estimated. The echo component estimated by the adaptive filter is then subtracted from a transmission signal in the subtracter, thereby canceling only the echo component contained in the transmission signal. As a result, the components other than the echo component that are picked up by the microphone (a voice uttered by the communication party into the microphone, ambient noise of the surroundings, and the like) do not suffer any loss.
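As an illustration of the coefficient update described above, the following minimal sketch applies a textbook LMS update to a synthetic echo. It is not the implementation of any particular apparatus. The function name `lms_echo_canceller`, the step size `mu`, the tap count, the echo path `h_true`, and the random reference signal are all assumptions made for this example, and the microphone signal is assumed to contain the echo only.

```python
import random


def lms_echo_canceller(reference, mic, num_taps, mu):
    """Textbook LMS: adapt coefficients w to minimize the mean square of the subtracter output."""
    w = [0.0] * num_taps       # variable filter coefficients
    buf = [0.0] * num_taps     # most recent reference samples, newest first
    out = []
    for x, d in zip(reference, mic):
        buf = [x] + buf[:-1]
        pseudo_echo = sum(w_i * b_i for w_i, b_i in zip(w, buf))
        e = d - pseudo_echo    # subtracter output (error signal)
        # Move each coefficient along the negative gradient of e squared.
        w = [w_i + mu * e * b_i for w_i, b_i in zip(w, buf)]
        out.append(e)
    return w, out


# Hypothetical echo path and reference signal (assumptions for illustration).
rng = random.Random(0)
h_true = [0.6, -0.3, 0.1]
reference = [rng.uniform(-1.0, 1.0) for _ in range(5000)]

# Microphone signal containing only the echo (no near end voice, no noise).
mic = [
    sum(h_true[i] * reference[n - i] for i in range(len(h_true)) if n - i >= 0)
    for n in range(len(reference))
]

w, residual = lms_echo_canceller(reference, mic, num_taps=3, mu=0.05)
```

Under these idealized conditions the coefficients `w` approach `h_true` and the subtracter output decays toward zero. With near end speech or noise present, the coefficients fluctuate around the echo path and some echo remains.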
However, the echo may not be completely removed by such an echo canceller, and the echo that remains after the removal is heard by the talker. This echo is called “a residual echo”. Suppression of this residual echo is desired so that the sound reinforcement communication system, such as the video conference system, can be used without a sense of unnaturalness. Thus, a technique has heretofore been proposed that executes echo suppressing processing to suitably adjust the gain applied to the residual echo depending on the circumstances, thereby making the residual echo less noticeable.
According to this technique, a revaluation amount ε is given by the following expression (1):
$$\varepsilon = E\left[\{S(k) - G(k)\,Y(k)\}^{2}\right] \tag{1}$$
where Y(k) represents an echo cancellation output signal (residual signal) outputted from the echo canceller after execution of the echo removing processing, Er(k) represents a residual echo signal that the echo canceller could not remove in the adaptive processing, S(k) represents a transmission sound (disturbing signal) picked up by the microphone, E[ ] denotes a short-time mean, and k represents a frequency. A filter G(k) that minimizes the revaluation amount ε is obtained based on the above expression (1), so that the transmission sound is emphasized by suppressing the echo.
According to a Wiener filtering method, which is one technique for estimating a short-time spectral amplitude (STSA), the filter G(k) for minimizing the revaluation amount ε expressed by the above expression (1) is given by the following expression (2):
$$G(k) = \frac{E\left[|S(k)|^{2}\right]}{E\left[|S(k)|^{2}\right] + E\left[|\mathrm{Er}(k)|^{2}\right]} \tag{2}$$
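The gain of expression (2) can be evaluated per frequency once the two short-time powers are available. The following is only a minimal sketch: the function name `wiener_gain`, the `floor` guard against division by zero, and the power values are assumptions made for illustration, and the way the short-time means themselves are estimated is outside the scope of this example.

```python
def wiener_gain(s_power, er_power, floor=1e-12):
    """Per-frequency gain G(k) = E[|S(k)|^2] / (E[|S(k)|^2] + E[|Er(k)|^2]).

    s_power and er_power hold the short-time mean powers for each frequency k.
    floor is a hypothetical guard that avoids dividing by zero in empty bins.
    """
    return [ps / max(ps + pe, floor) for ps, pe in zip(s_power, er_power)]


# Hypothetical short-time powers for three frequencies.
s_power = [4.0, 1.0, 0.0]    # transmission sound power E[|S(k)|^2]
er_power = [0.0, 1.0, 2.0]   # residual echo power E[|Er(k)|^2]

gains = wiener_gain(s_power, er_power)
# gains == [1.0, 0.5, 0.0]
```

A frequency dominated by the transmission sound passes almost unchanged (gain near 1), whereas a frequency dominated by the residual echo is attenuated (gain near 0).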
In many cases, this echo suppressing processing is used together with the adaptive filter.
This echo suppressing processing, for example, is described in a Non-Patent Document of Sumitaka Sakauchi, and Yoichi Haneda; “Study about Non-linear Echo Suppressing Processing Based on Short-Time Spectral Amplitude Estimation”, Proceeding of the 1998 Spring Meeting of The ASJ, The Acoustical Society of Japan, Mar., 1998, pp. 551 to 552,