In physics, an echo may be defined as the replica of a wave produced by its reflection in the surrounding environment. Such a phenomenon may occur in speech telecommunications. In a telephone terminal, acoustic echo is due to the coupling between the loudspeaker and the microphone of the terminal. As a consequence, the microphone signal contains not only the useful near-end speech signal but also echo. If no processing is done on the microphone path, both the echo signal and the near-end speech signal are transmitted to the far-end speaker, who then hears a delayed version of his/her own voice. The annoyance caused by hearing one's own voice increases with the level of the echo signal and with the delay between the original signal and its echo.
In order to guarantee good speech quality, some processing may be implemented on the microphone path before transmission takes place. Acoustic echo cancellation algorithms have been extensively investigated in recent years. Approaches to acoustic echo cancellation may include an adaptive filter followed by an echo postfilter. The adaptive filter produces an estimate of the acoustic echo path. This echo path estimate is then used to estimate the echo signal that is picked up by the microphone, and the estimated echo is subtracted from the microphone signal. In practice, because of mismatch between the echo path and its estimate, some residual echo typically remains at the output of the adaptive filter. A postfilter is often used to render this residual echo inaudible. Echo postfilters may apply an attenuation gain to the error signal from the adaptive echo canceller. For better double-talk performance, this attenuation can be computed in the subband or frequency domain. Nevertheless, the performance of single-channel echo cancellation may still be limited, as there is typically a trade-off between echo suppression during echo-only periods and low distortion of near-end speech during double-talk periods.
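The adaptive-filter-plus-postfilter structure described above can be illustrated with a minimal sketch. The text does not name a particular adaptation rule or gain rule, so the sketch assumes a normalized LMS (NLMS) update and a generic Wiener-style attenuation gain. The function names, the tap count, the step size, the gain floor, and the three-tap echo path used in the demo are all hypothetical choices made for illustration only.

```python
import random


def nlms_echo_canceller(far_end, mic, num_taps=8, step=0.5, eps=1e-8):
    """Return (error, echo_estimate) from an NLMS adaptive filter.

    NLMS is an assumed adaptation rule; the text only says "adaptive filter".
    """
    weights = [0.0] * num_taps   # current echo path estimate
    history = [0.0] * num_taps   # recent far-end samples, newest first
    error, echo_estimate = [], []
    for x, y in zip(far_end, mic):
        history = [x] + history[:-1]
        # Estimated echo: far-end signal filtered by the echo path estimate.
        d_hat = sum(w * h for w, h in zip(weights, history))
        # Error signal: microphone signal minus estimated echo.
        e = y - d_hat
        norm = sum(h * h for h in history) + eps
        weights = [w + step * e * h / norm for w, h in zip(weights, history)]
        error.append(e)
        echo_estimate.append(d_hat)
    return error, echo_estimate


def postfilter_gain(error_power, residual_echo_power, floor=0.1):
    """Wiener-style attenuation gain for the error signal (assumed rule).

    The gain is clamped to [floor, 1]; how the residual echo power is
    estimated is outside the scope of this sketch.
    """
    if error_power <= 0.0:
        return floor
    gain = 1.0 - residual_echo_power / error_power
    return max(floor, min(1.0, gain))


def demo(num_samples=2000, seed=0):
    """Echo-only scenario: return (mic_power, error_power) over the last 200 samples."""
    rng = random.Random(seed)
    echo_path = [0.5, -0.3, 0.1]  # hypothetical loudspeaker-to-microphone response
    far_end = [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]
    mic = [
        sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
        for n in range(num_samples)
    ]
    error, _ = nlms_echo_canceller(far_end, mic)
    tail = slice(num_samples - 200, num_samples)
    mic_power = sum(v * v for v in mic[tail]) / 200
    error_power = sum(v * v for v in error[tail]) / 200
    return mic_power, error_power
```

In this idealized demo there is no near-end speech or noise and the filter is longer than the echo path, so the error power decays far below the microphone power. In a real terminal the mismatch mentioned above leaves residual echo, which is where a gain such as `postfilter_gain` would be applied, per frame or per frequency band.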
Mobile terminals have historically been designed with one microphone. Hence, echo postfiltering solutions used in mobile terminals have been designed and optimized on the basis of a single microphone observation. Additionally, these solutions may have limited performance in the case of a low near-end signal-to-echo ratio (i.e., high echo compared to near-end speech). This limited performance may result in high distortion of the processed near-end speech signal during double-talk periods and therefore in poor communication quality.
Moreover, the single-channel echo postfiltering problem has been tackled for decades, and there appears to be little room left for major improvements to echo postfilter solutions, especially in the case of mobile terminals, where the available computational complexity is limited (in comparison to video conferencing terminals, for example).
Thus, efficient methods of echo postfiltering or echo suppression are desirable.