In telephony, audio signals (including voice signals, for example) are transmitted between a near-end and a far-end. Far-end signals which are received at the near-end may be outputted from a loudspeaker at the near-end. A microphone at the near-end may be used to capture a near-end signal to be transmitted to the far-end (such as the voice of a speaker at the near-end). An “echo” occurs when at least some of the far-end signal outputted at the near-end is included in the near-end signal which is transmitted back to the far-end. In this sense the echo may be considered to be a reflection of the far-end signal.
An example scenario is illustrated in FIG. 1, which shows a signal being captured by a far-end microphone and output by a near-end loudspeaker. The echo is a consequence of acoustic coupling between the near-end loudspeaker and the near-end microphone; the microphone captures the signal originating from its own loudspeaker in addition to the voice of the near-end speaker and any near-end background noise. The result is an echo at the far-end loudspeaker. Echo cancellation is an important feature of telephony. Hands-free devices and teleconferencing, in particular, require echo cancellation that can adapt to environments having a wide range of acoustic characteristics. In these applications, a combination of factors contributes to echo being more of an issue. First, the volume at which the far-end signal is outputted from the near-end loudspeaker is typically loud enough that the far-end signal is a significant part of the signal captured by the near-end microphone. Second, the physical arrangement of the loudspeaker and microphone in these types of devices tends to result in a strong acoustic coupling between the two.
Acoustic echo cancellers can be used to remove echo from a microphone signal. They typically model the acoustic echo path and use that model to synthesise an estimate of the echo from the far-end signal. Often, an adaptive filter is used to model the impulse response of the acoustic echo path. The estimated echo is subtracted from the microphone signal to produce a substantially echo-free signal for transmission to the far-end. This technique requires adaptive signal processing to generate a signal accurate enough to cancel the echo effectively.
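By way of illustration only, the synthesis and subtraction described above may be sketched as follows. The sketch assumes the echo path can be represented by a short finite impulse response; the function names, the toy signals and the impulse response values are hypothetical and are chosen only to show the principle.

```python
def estimate_echo(far_end, echo_path_model):
    """Synthesise an echo estimate by convolving the far-end signal
    with the modelled impulse response of the acoustic echo path."""
    taps = len(echo_path_model)
    estimate = []
    for n in range(len(far_end)):
        acc = 0.0
        for k in range(taps):
            if n - k >= 0:
                acc += echo_path_model[k] * far_end[n - k]
        estimate.append(acc)
    return estimate


def cancel_echo(mic, far_end, echo_path_model):
    """Subtract the estimated echo from the microphone signal."""
    echo_hat = estimate_echo(far_end, echo_path_model)
    return [m - e for m, e in zip(mic, echo_hat)]


# Toy example (hypothetical values): the true echo path delays and
# attenuates the far-end signal, and the microphone captures echo only.
far_end = [1.0, 0.0, -1.0, 0.5, 0.0, 0.0]
true_path = [0.0, 0.5, 0.25]
mic = estimate_echo(far_end, true_path)

# With a perfect model of the echo path, the residual is echo-free.
residual = cancel_echo(mic, far_end, true_path)
```

In this idealised case the model matches the echo path exactly, so the residual is zero. In practice the model is only an approximation and some residual echo remains.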
An environment's acoustic response tends to vary with time, so the adaptive filter needs to change its model to mimic changes in the real environment; otherwise the adaptive filter's estimate of the echo is likely to differ from the real echo, leading to imperfect echo cancellation. This is usually achieved by updating the adaptive filter's coefficients to take account of any differences between the estimated echo that the adaptive filter is synthesising and the real echo detected by the microphone. The “real echo” is often not available in isolation, as it is just one of several signal components in the microphone signal. To get around this problem, the microphone signal is normally taken to represent the echo during so-called “echo-alone” regions. These are regions in which no other significant signal component (such as near-end speech) is detected in the microphone signal. Ambient background noise, however, is typically present even during “echo-alone” regions, so the error that is fed back to control the adaptation of the adaptive filter will almost always be at least partially influenced by noise. Therefore, there is a need for an improved mechanism for controlling an acoustic echo canceller (AEC).
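By way of illustration only, the conventional coefficient update described above may be sketched using a normalised least mean squares (NLMS) rule. NLMS is assumed here as one common choice and is not mandated by the foregoing description. The function name, the `echo_alone` flags, the step size and the signal values are all hypothetical.

```python
import random


def nlms_cancel(mic, far_end, taps=4, step=0.5, eps=1e-8, echo_alone=None):
    """Cancel echo with an adaptive FIR filter updated by NLMS.

    echo_alone is an optional list of booleans (hypothetical detector
    output); coefficients are updated only where it is True.
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent far-end samples, newest first
    errors = []
    for n, (d, x) in enumerate(zip(mic, far_end)):
        history = [x] + history[:-1]
        echo_hat = sum(w * h for w, h in zip(weights, history))
        e = d - echo_hat  # error fed back to control adaptation
        errors.append(e)
        if echo_alone is None or echo_alone[n]:
            power = sum(h * h for h in history) + eps
            weights = [w + step * e * h / power
                       for w, h in zip(weights, history)]
    return errors, weights


# Toy echo-alone scenario: echo plus a small amount of ambient noise.
random.seed(0)
far_end = [random.uniform(-1.0, 1.0) for _ in range(2000)]
true_path = [0.0, 0.6, 0.3, -0.1]
mic = []
for n in range(len(far_end)):
    echo = sum(true_path[k] * far_end[n - k]
               for k in range(len(true_path)) if n - k >= 0)
    mic.append(echo + random.gauss(0.0, 0.001))

errors, weights = nlms_cancel(mic, far_end)
```

Because the error term contains the ambient noise as well as the residual echo, the coefficients in such a sketch settle near, but not exactly at, the true echo path.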