Such appliances include a sensitive microphone that picks up not only the user's voice but also the surrounding noise, which constitutes a disturbing element that, under certain circumstances, can go so far as to make the speaker's speech incomprehensible. The same applies if it is desired to implement voice recognition techniques, since it is difficult to perform shape recognition on words that are buried in a high level of noise.
This difficulty, which is associated with surrounding noise, is particularly constraining with “hands-free” devices. In particular, the large distance between the microphone and the speaker gives rise to a relatively high level of noise that makes it difficult to extract the useful signal buried in the noise.
Furthermore, the very noisy surroundings typical of the motor car environment present spectral characteristics that are not steady, i.e. that vary in an unforeseeable manner as a function of driving conditions: driving over deformed surfaces or cobblestones, car radio in operation, etc.
Some such devices provide for using a plurality of microphones, generally two microphones, and they obtain a signal with a lower level of disturbances by taking the average of the signals that are picked up, or by performing other operations that are more complex. In particular, a so-called “beamforming” technique enables software means to establish directionality that improves the signal-to-noise ratio. However, the performance of that technique is very limited when only two microphones are used.
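By way of illustration only, the averaging of two microphone signals mentioned above can be sketched as follows. This is a minimal simulation under stated assumptions, not the processing of any particular device: the speech is modeled as a pure tone that reaches both microphones in phase, and the noise is modeled as independent white Gaussian noise on each channel. The names `average_two_mics`, `snr_db` and `simulate`, as well as the 8 kHz sampling rate and 200 Hz tone, are hypothetical choices made for the example.

```python
import math
import random


def power(x):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def average_two_mics(mic_a, mic_b):
    """Sample-by-sample average of the two picked-up signals."""
    return [(a + b) / 2.0 for a, b in zip(mic_a, mic_b)]


def snr_db(clean, noisy):
    """Signal-to-noise ratio in dB, given the known clean signal."""
    noise = [n - s for s, n in zip(clean, noisy)]
    return 10.0 * math.log10(power(clean) / power(noise))


def simulate(n=20000, seed=0):
    """Assumed model: identical in-phase speech, independent noise per microphone."""
    rng = random.Random(seed)
    # Hypothetical "speech": a 200 Hz tone sampled at 8 kHz.
    speech = [math.sin(2.0 * math.pi * 200.0 * t / 8000.0) for t in range(n)]
    mic_a = [s + rng.gauss(0.0, 1.0) for s in speech]
    mic_b = [s + rng.gauss(0.0, 1.0) for s in speech]
    return speech, mic_a, mic_b


speech, mic_a, mic_b = simulate()
averaged = average_two_mics(mic_a, mic_b)

# Improvement of the averaged signal over a single microphone, in dB.
gain_db = snr_db(speech, averaged) - snr_db(speech, mic_a)
print(f"SNR gain from averaging: {gain_db:.2f} dB")
```

Under these assumptions the averaged noise power is halved, so the gain is close to 3 dB, which gives an idea of how modest the improvement is with only two microphones. The gain depends on the noise being uncorrelated between the channels: a noise component that is identical on both microphones passes through the average unchanged.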
Furthermore, conventional techniques are adapted above all to filtering noise that is diffuse and steady, coming from around the device and occurring at comparable levels in the signals that are picked up by both of the microphones.
In contrast, noise that is not steady, i.e. noise that varies in an unforeseeable manner as a function of time, is not distinguished from speech and is therefore not attenuated.
Unfortunately, in a motor car environment, such non-steady noise, which is also directional, occurs very frequently: a horn blowing, a scooter going past, a car overtaking, etc.
One of the difficulties in filtering such non-steady noise stems from the fact that it presents characteristics in time and in three-dimensional space that are very close to the characteristics of speech, thus making it difficult firstly to estimate whether speech is present (given that the speaker does not speak all the time), and secondly to extract the useful speech signal from a very noisy environment such as a motor vehicle cabin.