The invention concerns a technical solution for communication systems in which there is feedback between outgoing and incoming signals, for example a conference telephone where the outgoing loudspeaker signal is picked up by the microphone. The invention is intended to reduce the negative effects of such feedback.
This type of unwanted feedback may occur in different types of communication devices, particularly in hands-free full-duplex communication devices in which the microphone is or may be positioned such that signals output by the loudspeaker are easily picked up by the microphone. Examples of such communication devices are hands-free conference telephones, hands-free car telephone systems, installed room systems using ceiling speakers and table microphones, conventional telephones or mobile telephones in speakerphone or hands-free mode, etc.
When describing, e.g., a hands-free conference telephone during an active conference call, the A-side (near-end side) commonly denotes the physical room in which the conference telephone is placed, and the B-side (far-end side) commonly denotes the physical location of the other participating party of the conference call. The A-talker is located at the A-side and the speech of the A-talker is picked up by the microphone of the conference telephone, processed and then sent through the telephone network to the B-side and the B-talker. The speech of the B-talker is sent over the telephone network and received by the conference telephone, which processes the received B-talker speech and plays it through the loudspeaker on the A-side.
In a scenario like the one described above, two types of echoes are present. First, in addition to the B-talker signal, the conference telephone may receive a delayed line echo of the A-talker speech, possibly due to echoes generated in the telephone network. Second, due to room acoustics, there will be an acoustic echo present on the microphone when the speech of the B-talker is present on the loudspeaker. Removing the echoes is of utmost importance for both listening comfort and system stability (to prevent so-called howling).
The echoes are typically removed through damping, cancellation or a combination of both. The damping solution is relatively simple, but when the A-talker and the B-talker speak simultaneously it lets only one talker through. This is called a half-duplex solution. Echo cancellation, on the other hand, typically uses one or more adaptive filters to model the echo, which is then subtracted from the microphone signal without disturbing the desired speech. This allows speech from both the A-talker and the B-talker simultaneously, denoted full-duplex. In practice, however, echo cancellation will not completely remove the echo. Thus, a combination of echo cancellation and damping (to remove the non-cancelled residual echo) is frequently used.
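As an illustration of the cancellation step, the following is a minimal sketch of an adaptive echo canceller using a normalized least-mean-squares (NLMS) update. NLMS is chosen here only as a common example of an adaptive filter; the text above does not prescribe a particular algorithm. The function name, tap count, step size and the toy echo path are all assumptions made for the example.

```python
import random

def nlms_echo_canceller(far_end, mic, num_taps=8, step=0.5, eps=1e-8):
    """Return the echo-cancelled signal and the final filter taps.

    far_end: loudspeaker samples, mic: microphone samples (same length).
    All parameter values are illustrative assumptions.
    """
    w = [0.0] * num_taps        # adaptive model of the echo path
    x_buf = [0.0] * num_taps    # most recent loudspeaker samples, newest first
    out = []
    for x, d in zip(far_end, mic):
        x_buf = [x] + x_buf[:-1]
        echo_est = sum(wi * xi for wi, xi in zip(w, x_buf))
        e = d - echo_est        # microphone signal minus modelled echo
        norm = sum(xi * xi for xi in x_buf) + eps
        # NLMS update: step along the input vector, normalized by its energy
        w = [wi + step * e * xi / norm for wi, xi in zip(w, x_buf)]
        out.append(e)
    return out, w

# Toy far-end single-talk case: the microphone picks up only a filtered
# copy of the loudspeaker signal (hypothetical 3-tap echo path).
random.seed(0)
echo_path = [0.6, -0.3, 0.1]
far_end = [random.uniform(-1.0, 1.0) for _ in range(4000)]
mic = [sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
       for n in range(len(far_end))]
residual, taps = nlms_echo_canceller(far_end, mic)
```

In this noise-free toy case the residual decays towards zero. With near-end speech, noise or a changing echo path, a residual echo remains, which is what the additional damping stage is meant to handle.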
How much the residual echo should be damped depends on the situation, but is generally a function of the speech-to-echo ratio. A signal containing strong speech combined with weak echo should not be damped as much as a signal with weak speech combined with strong echo, since the speech will in a sense mask the echo. Moreover, to achieve high listener comfort, the speech should be as unaffected by the damping as possible. Determining the speech-to-echo ratio in a signal is a non-trivial problem.
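The dependence of the damping on the speech-to-echo ratio can be sketched as a simple gain rule. The mapping below (a linear ramp in dB between two thresholds) and all numeric values are assumptions chosen for illustration, not values taken from the text. The sketch also assumes that estimates of the speech and echo power are already available, which is precisely the non-trivial part.

```python
import math

def residual_damping_gain(speech_power, echo_power,
                          max_atten_db=30.0, ser_low_db=-10.0, ser_high_db=20.0):
    """Map a speech-to-echo ratio (SER) to a linear damping gain in (0, 1].

    Hypothetical rule: full attenuation below ser_low_db, no attenuation
    above ser_high_db, linear interpolation (in dB) in between.
    """
    eps = 1e-12  # avoids division by zero and log of zero
    ser_db = 10.0 * math.log10((speech_power + eps) / (echo_power + eps))
    t = (ser_db - ser_low_db) / (ser_high_db - ser_low_db)
    t = min(1.0, max(0.0, t))              # clamp to [0, 1]
    atten_db = max_atten_db * (1.0 - t)    # strong speech -> little damping
    return 10.0 ** (-atten_db / 20.0)

strong_speech = residual_damping_gain(1.0, 1e-4)   # speech masks the echo
balanced = residual_damping_gain(1.0, 1.0)
strong_echo = residual_damping_gain(1e-4, 1.0)     # echo dominates
```

The gain would be applied to the output of the echo canceller, so a signal dominated by speech passes nearly unchanged while a signal dominated by residual echo is attenuated.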
The problem can also be formulated as differentiating between double-talk and an echo path change. A double-talk situation occurs when both A-side and B-side speakers are active simultaneously. In the double-talk situation, the signal after echo cancellation will be a combination of residual echo and speech, i.e. the signal will contain more energy than a signal with pure residual echo. In an echo-path change situation, the feedback properties will change. This can occur due to changes in the acoustic environment (e.g. people or objects are moving on the A-side) or changes in the telephone network (e.g. a call is being set up). The adaptive echo cancelling filter will then produce a larger residual echo until it has had time to adapt to the change. Hence, in both the double-talk and the echo-path change case the output energy from the echo canceller will increase. In the double-talk situation the damping should be restricted, whereas significant damping should be applied in the echo-path change situation. One problem is thus how to distinguish a double-talk situation from an echo-path change situation. Another problem is that adaptive echo cancelling filters sometimes act unpredictably in, and immediately after, double-talk situations. This makes it difficult to assess the correct amount of echo, potentially leading to underestimation of the echo present in these situations. The risk of underestimating the amount of echo present calls for a safety margin when calculating the speech-to-echo ratio in order to minimize the risk of detecting the echo as near-end speech. A drawback of the safety margin is of course that it complicates the detection of true near-end speech.
Significant for distinguishing double-talk from echo-path change, and also for other applications, is the ability to estimate the stationary noise level and the coupling (feedback) factor (i.e. the strength of the echo). A common method to achieve noise estimation is based on minimum statistics, as described in e.g. “Acoustic Echo and Noise Control: A Practical Approach” by E. Hänsler and G. Schmidt, Wiley, 2004, and in “A Combined Implementation of Echo Suppression, Noise Reduction and Comfort Noise in a Speaker Phone Application” by C. Schüldt, F. Lindstrom and I. Claesson, In Proceedings of IEEE International Conference on Consumer Electronics, Las Vegas, Nev., Jan. 2007. Estimation of the coupling factor can be achieved through e.g. the ratio of the estimated loudspeaker and microphone power, or be extracted from the near-end part of the adaptive filter coefficients. More details of how to estimate the coupling factor can be found in e.g. “Step-size control for acoustic echo cancellation filters—an overview” by A. Mader, H. Puder, G. U. Schmidt, Signal Processing, vol. 80, no. 9, pp. 1697-1719, 2000.
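A much simplified sketch of the two estimates mentioned above follows. The noise estimate tracks the minimum of a smoothed frame power over a sliding window, which is the core idea of minimum statistics. The bias compensation and adaptive smoothing used in published methods are omitted. The coupling factor is computed as microphone power divided by loudspeaker power, which is an assumed direction for the ratio and is only meaningful while the far end alone is active. All names and parameter values are hypothetical.

```python
from collections import deque

def min_statistics_noise(frame_powers, window=20, smooth=0.9):
    """Per-frame noise floor estimate: minimum of the recursively smoothed
    frame power over the last `window` frames (simplified, no bias correction)."""
    history = deque(maxlen=window)
    smoothed = frame_powers[0] if frame_powers else 0.0
    estimates = []
    for p in frame_powers:
        smoothed = smooth * smoothed + (1.0 - smooth) * p
        history.append(smoothed)
        estimates.append(min(history))
    return estimates

def coupling_factor(mic_powers, spk_powers, eps=1e-12):
    """Ratio of average microphone power to average loudspeaker power.
    Assumes the frames contain far-end speech only (no near-end talker)."""
    return sum(mic_powers) / (sum(spk_powers) + eps)

# Stationary noise at power 0.01 with a short speech burst at power 1.0.
powers = [0.01] * 30 + [1.0] * 5 + [0.01] * 30
noise_est = min_statistics_noise(powers)

# Loudspeaker frames and a microphone that picks them up 6 dB weaker in power.
spk = [0.5, 1.0, 0.8, 0.7]
coupling = coupling_factor([0.25 * p for p in spk], spk)
```

Because the burst is shorter than the window, the estimate stays at the noise floor while speech is present, which is the property that makes a minimum-based tracker useful for stationary noise.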
The differentiation between double-talk and echo-path change is also crucial for avoiding divergence of adaptive echo cancelling filters, which can occur during double-talk. Thus, the filter adaptation should be halted during double-talk. If a single adaptive filter is used and an echo-path change is mistaken for a double-talk situation, the adaptive filter will not update, leading to a deadlock situation. A structure for avoiding the deadlock problem is the so-called two-path algorithm, where two adaptive echo cancelling filters are used in parallel. This structure is described in more detail in “Echo canceller with two echo path models” by K. Ochiai, T. Araseki, and T. Ogihara, IEEE Transactions on Communications, vol. COM-25, no. 6, pp. 8-11, June 1977. One filter, often referred to as the background filter, is continuously (i.e. very frequently) adapted, whereas the other filter, often referred to as the foreground filter, is adapted much less frequently. For this reason, the foreground filter is sometimes referred to as a “fixed” filter. The foreground filter produces the output used for echo cancellation. It is adapted by copying the frequently adapting background filter into it whenever the background filter is considered to perform better in terms of echo cancellation. This is what happens in an echo-path change situation. In a double-talk situation, on the other hand, the background filter will diverge. However, this will not affect the system output, since the fixed foreground filter is providing the output.
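The two-path structure can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm of the cited paper: the background filter uses an NLMS update, and the copy decision compares the error power of the two filters over fixed-length blocks with a hypothetical margin. The function name and all parameter values are chosen for the example.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def two_path_canceller(far_end, mic, num_taps=4, step=0.5,
                       block=64, margin=0.5, eps=1e-8):
    """Return (output, foreground taps, number of background-to-foreground copies)."""
    fg = [0.0] * num_taps    # foreground ("fixed") filter: produces the output
    bg = [0.0] * num_taps    # background filter: adapted on every sample
    x_buf = [0.0] * num_taps
    out, copies = [], 0
    fg_pow = bg_pow = 0.0
    for n, (x, d) in enumerate(zip(far_end, mic), start=1):
        x_buf = [x] + x_buf[:-1]
        e_fg = d - dot(fg, x_buf)
        e_bg = d - dot(bg, x_buf)
        out.append(e_fg)     # only the foreground error reaches the output
        norm = dot(x_buf, x_buf) + eps
        bg = [b + step * e_bg * xi / norm for b, xi in zip(bg, x_buf)]
        fg_pow += e_fg * e_fg
        bg_pow += e_bg * e_bg
        if n % block == 0:
            # Copy only if the background filter cancelled clearly better
            # over the whole block (assumed criterion).
            if bg_pow < margin * fg_pow:
                fg = list(bg)
                copies += 1
            fg_pow = bg_pow = 0.0
    return out, fg, copies

# Toy echo-path change: the (hypothetical) echo path switches halfway through.
random.seed(1)
far_end = [random.uniform(-1.0, 1.0) for _ in range(4000)]
path_a, path_b = [0.5, -0.2], [-0.4, 0.3]
mic = []
for n in range(len(far_end)):
    h = path_a if n < 2000 else path_b
    mic.append(sum(hk * far_end[n - k] for k, hk in enumerate(h) if n - k >= 0))
out, fg_taps, copies = two_path_canceller(far_end, mic)
```

After the path change the background filter re-converges and is copied into the foreground filter, so no deadlock occurs. During double-talk the background filter would typically perform worse than the foreground filter, the copy condition would fail, and the output would remain unaffected.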
The conventional solutions discussed above suffer from drawbacks which, in situations that depend on the particular solution, make it difficult to determine which level of damping should be applied to the residual echo in communication devices. There is thus a need for an alternative solution for controlling the damping of residual echo in communication devices.