Echo can result from two different kinds of phenomenon. The first, known as “line echo”, is confined to the transmission path, and various filtering methods are known for dealing with it. The second, known as “acoustic” echo, is the echo that is actually picked up by the microphone; it is due to reverberation in the environment of the speaker, typically the room the speaker is in or the cab of a vehicle.
Acoustic echo constitutes a major disturbance for the device, and it can often go so far as to make the speech of the near speaker (the speaker whose speech is embedded in the acoustic echo) incomprehensible to the remote speaker (the speaker at the other end of the telephone signal transmission channel).
These appliances include a sensitive microphone for picking up the speech of the near speaker, and a relatively powerful loudspeaker for reproducing the speech of the remote speaker while a telephone conversation is taking place. Nevertheless, because of acoustic coupling between those two transducers, the microphone picks up not only the voice of the near speaker, but also ambient noise and, above all, the acoustic echo, i.e. the reverberation of the sound reproduced by the loudspeaker. The closer together the microphone and the loudspeaker are, and the higher the acoustic power played back by the loudspeaker, the higher the level of that echo. This applies typically to systems on board a motor vehicle, where the sound level of the loudspeaker is relatively high in order to cover ambient noise.
In addition, because of the considerable distance between the microphone and the near speaker, the relative level of the noise is high, thus making it difficult to extract the useful signal that is embedded in the echo and in the noise. Furthermore, the spectral characteristics of the noise are not steady, i.e. they vary unpredictably as a function of driving conditions (driving over deformed or cobbled roads, car radio in operation, etc.), making it even more difficult to develop algorithms suitable for processing the signal.
In addition, many such devices are made in the form of independent appliances that are removable, comprising in a common box both the microphone and the loudspeaker, together with control buttons: the proximity (a few centimeters) of the loudspeaker and the microphone then gives rise to acoustic echo at a level that is considerable, typically of the order of twenty times the level of the speech signal produced by the near speaker.
Eliminating this acoustic echo is particularly difficult, especially in the very noisy environments that are typical of motor vehicles, where the ambient noise is added to the speech signal and the echo signal as picked up by the microphone.
Under such circumstances, prior art devices with the best performance implement: i) an echo cancellation stage; ii) an echo suppression stage; and iii) a noise reduction stage.
The echo cancellation stage models a linear transformation for converting the signal from the remote speaker (i.e. the signal that is to be reproduced by the loudspeaker) to the echo picked up by the microphone, so as to define an adaptive filter dynamically for application to the signal from the remote speaker. The result of the filtering is then subtracted from the signal picked up by the microphone, thereby having the effect of canceling the major portion of the acoustic echo.
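By way of illustration only, the principle of such a stage can be sketched with a normalized least-mean-squares (NLMS) adaptive filter. NLMS is chosen here as one common adaptation rule and is not stated to be the one used in the prior art devices; the function name, the number of taps, the step size, and the three-tap echo path of the demonstration are all hypothetical.

```python
import random


def nlms_echo_canceller(far_end, mic, num_taps=8, step=0.5, eps=1e-8):
    """Subtract an adaptively filtered copy of far_end from mic.

    far_end: samples sent to the loudspeaker (remote speaker signal).
    mic:     samples picked up by the microphone (echo plus anything else).
    Returns the residual signal and the final filter weights.
    """
    weights = [0.0] * num_taps   # current estimate of the echo path
    history = [0.0] * num_taps   # latest far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate    # subtracter stage
        # NLMS update: step normalized by the energy of the filter input.
        norm = sum(h * h for h in history) + eps
        weights = [w + step * e * h / norm for w, h in zip(weights, history)]
        residual.append(e)
    return residual, weights


# Demonstration with a made-up, purely linear echo path and no near speech.
rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
true_path = [0.6, -0.3, 0.1]     # hypothetical loudspeaker-to-microphone path
mic = [
    sum(true_path[k] * far_end[n - k] for k in range(len(true_path)) if n - k >= 0)
    for n in range(len(far_end))
]
residual, weights = nlms_echo_canceller(far_end, mic)
tail_energy = sum(e * e for e in residual[-100:])
```

In this idealized case the weights converge toward the hypothetical echo path and the residual becomes very small. With noise, near-end speech, or non-linear distortion in the microphone signal, a residual echo remains.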
After processing by the echo cancellation stage, the echo suppression stage serves to suppress the residual echo that remains present by attenuating the residual echo down to the level of the background noise. Whereas echo canceling is implemented essentially by a subtracter stage, echo suppression is performed by controlling gain, so it also acts on the useful signal picked up by the microphone (speech signal from the near speaker).
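The gain-control principle can be illustrated by the following minimal sketch, which works frame by frame. It assumes that an estimate of the background noise level is already available and that activity of the remote speaker is detected with a simple level threshold; the function names and the threshold value are hypothetical and do not describe any particular prior art device.

```python
import math


def rms(frame):
    """Root-mean-square level of a frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def suppress_residual_echo(frame, far_end_frame, noise_floor,
                           far_end_threshold=0.01):
    """Return (gain, attenuated frame).

    frame:         output of the echo cancellation stage for this frame.
    far_end_frame: signal reproduced by the loudspeaker during the same frame.
    noise_floor:   estimated background noise level (assumed to be known).
    """
    if rms(far_end_frame) < far_end_threshold:
        # Remote speaker silent: no echo expected, leave the frame untouched.
        return 1.0, list(frame)
    level = rms(frame)
    # Attenuate so that the frame level does not exceed the noise floor.
    gain = 1.0 if level <= noise_floor else noise_floor / level
    return gain, [gain * s for s in frame]


# Demonstration: residual echo five times above the noise floor.
residual_frame = [0.05, -0.05] * 80
loudspeaker_frame = [0.5, -0.5] * 80
gain, quiet_frame = suppress_residual_echo(residual_frame, loudspeaker_frame,
                                           noise_floor=0.01)
idle_gain, _ = suppress_residual_echo(residual_frame, [0.0] * 160,
                                      noise_floor=0.01)
```

Because a single gain multiplies the whole frame, any speech from the near speaker present in that frame is attenuated by the same factor as the residual echo.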
Finally, the noise reduction stage seeks to reduce the background noise picked up by the microphone, while preserving the speech from the near speaker. This noise reduction is advantageously performed dynamically and adaptively, by discriminating between periods of silence and of conversation in order to identify the noise, and then performing selective de-noising with appropriate attenuation.
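A deliberately simplified sketch of this principle is given below. It works on the broadband frame level only, whereas practical noise reduction generally operates per frequency band. It assumes that the first frame contains noise only, and it uses an energy threshold as a crude silence detector; all names and parameter values are hypothetical.

```python
import math


def frame_rms(frame):
    """Root-mean-square level of a frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def reduce_noise(frames, vad_ratio=2.0, smoothing=0.9, min_gain=0.1):
    """Return (de-noised frames, speech/silence decisions).

    Assumption: the first frame is noise only and seeds the noise estimate.
    """
    noise = frame_rms(frames[0])
    output, decisions = [], []
    for frame in frames:
        level = frame_rms(frame)
        # Crude silence/conversation discrimination on the frame level.
        is_speech = level > vad_ratio * noise
        if not is_speech:
            # Update the noise estimate during periods of silence only.
            noise = smoothing * noise + (1.0 - smoothing) * level
        # Selective attenuation: strong on noise-like frames, mild on speech.
        gain = max(min_gain, 1.0 - noise / level) if level > 0.0 else min_gain
        output.append([gain * s for s in frame])
        decisions.append(is_speech)
    return output, decisions


# Demonstration: two noise frames, one speech frame, one noise frame.
noise_frame = [0.01, -0.01] * 80
speech_frame = [0.2, -0.2] * 80
frames = [noise_frame, noise_frame, speech_frame, noise_frame]
denoised, decisions = reduce_noise(frames)
```

In the demonstration, the noise-only frames are attenuated to the minimum gain while the speech frame is almost unchanged.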
JP-A-60 102052 and WO-A-96/26592 describe such circuits that are designed to reduce the incidence of the disturbing acoustic echo.
Nevertheless, those devices do not give complete satisfaction, in particular in appliances where the distance between the loudspeaker and the microphone is very small compared with the distance between the near speaker and the microphone: as a result, when the remote speaker is speaking, that speech is reproduced by the loudspeaker and is picked up in return by the microphone, typically with an echo level that may be as much as twenty times the mean level of the speech from the near speaker.
Furthermore, in particular because of the mobility of present cell phones, it frequently happens that the remote speaker is in an environment that is relatively noisy (street, office, restaurant, train, etc.), where the level of noise may be as much as one-tenth the level of the remote speaker's speech. This noise signal will itself be reproduced by the loudspeaker of the device and it will contribute to the acoustic echo. As a result, the level of such remote noise in the echo is of the same order as the level of speech from the near speaker, or is even higher.
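The orders of magnitude quoted above can be checked with a short calculation. The sketch below assumes that the quoted levels are amplitude ratios (hence 20·log10 for the conversion to decibels) and that the echo path treats the remote noise in the same way as the remote speech; the variable names are illustrative only.

```python
import math


def to_db(amplitude_ratio):
    """Convert an amplitude ratio to decibels (assumes amplitude, not power)."""
    return 20.0 * math.log10(amplitude_ratio)


# Upper-bound figures taken from the text.
echo_to_near_speech = 20.0            # echo up to twenty times the near speech
remote_noise_to_remote_speech = 0.1   # remote noise up to one-tenth of remote speech

# Echo of the remote noise relative to the speech of the near speaker.
noise_echo_to_near_speech = echo_to_near_speech * remote_noise_to_remote_speech

print("echo vs near speech: %.1f dB" % to_db(echo_to_near_speech))
print("remote-noise echo vs near speech: x%.1f (%.1f dB)"
      % (noise_echo_to_near_speech, to_db(noise_echo_to_near_speech)))
```

Under these assumptions the echo of the remote noise is about twice the level of the speech from the near speaker, i.e. roughly 6 dB above it, while the echo of the remote speech itself is roughly 26 dB above it.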
Consequently, even after echo canceling, the residual echo coming from the remote noise (noise picked up beside the remote speaker) is no longer negligible and the echo suppression stage then applies considerable attenuation to the speech signal from the near speaker that is being transmitted to the remote speaker.
It should be observed that unlike speech, noise is present continuously beside the remote speaker (i.e. even when the remote speaker is not speaking), thereby giving rise to quasi-permanent attenuation of the speech signal transmitted from the near speaker to the remote speaker. The result can be improved only when the remote speaker remains silent for long enough to allow the echo cancellation stage to model a linear transformation of the noise signal from the surroundings of the remote speaker.
Furthermore, the echo cancellation stage, which is based on a linear filter, does not model any non-linear phenomena that might occur in the transmission system, in particular in the amplifier and the loudspeaker, nor does it model the electrical noise generated by the analog-to-digital converter circuit. Unfortunately those are phenomena that are not negligible in consumer products of low cost and small size.
Those non-linearities give rise to instability in the echo cancellation algorithm, which then needs to re-adapt within a very short time.
The components that result from those non-linearities cannot be attenuated by the echo cancellation stage since they are not modeled. They can be reduced only by the echo suppression stage, which degrades the behavior of the device during double talking because the non-echo signals, i.e. the voice of the near speaker, are attenuated as well.