Telephones and conference units are commonly used to provide communication between near end and far end participants. These telephony devices include at least one microphone to capture the voice of the near end participants. An example of such a microphone is the microphone on the handset of a desktop telephone. In noisy environments, the microphone picks up far field noise signals in addition to the voice signal, and both signals are transmitted to the far end. This results in noisy voice signals being reproduced at the far end. To overcome the noise signal, the near end speaker has to speak louder so that the far end can hear his/her voice clearly over the noise signal. Clearly, this is inconvenient for the speaker. Furthermore, if the level of noise varies during the call, the speaker's voice may appear too loud or too quiet at the far end. Thus, the listening experience of far end listeners may be unsatisfactory.
One traditional solution to the problem of noise is to use a microphone array in a beamforming configuration. The microphones in the microphone array are arranged with a fixed distance between them. A signal processor coupled to the microphone array steers the audio beam in the direction of a speaker, providing directional sensitivity. As a result, sound from the speaker is emphasized, while noise from other directions surrounding the user is de-emphasized. Thus, the signal to noise ratio of the audio signal sent to the far end is improved.
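The beamforming idea above can be illustrated with a minimal delay-and-sum sketch. It is not taken from any particular device: the four-microphone array, the one-sample inter-microphone delays, the tone periods, and the helper names (`tone`, `rms`, `delay_and_sum`) are all assumptions chosen for illustration. The noise tone is deliberately picked so that it cancels completely; real noise would only be attenuated. Because the processing is linear, the voice and noise are beamformed separately here so that each contribution can be measured.

```python
import math

def tone(period, length, delay=0):
    """Unit-amplitude sine with the given period (in samples), delayed by `delay` samples."""
    return [math.sin(2.0 * math.pi * (n - delay) / period) for n in range(length)]

def rms(signal):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))

def delay_and_sum(channels, steering_delays):
    """Advance each channel by its steering delay (in samples), then average the channels."""
    length = min(len(ch) - d for ch, d in zip(channels, steering_delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, steering_delays)) / len(channels)
        for n in range(length)
    ]

NUM_MICS = 4   # hypothetical array size
LENGTH = 256   # samples per channel

# Assumed geometry: the talker's wavefront reaches each successive microphone
# one sample later, while the noise arrives from the opposite side and reaches
# each successive microphone one sample earlier.
voice_channels = [tone(16, LENGTH, delay=m) for m in range(NUM_MICS)]
noise_channels = [tone(8, LENGTH, delay=-m) for m in range(NUM_MICS)]

# Steering delays that undo the talker's arrival delays (beam aimed at the talker).
steering = list(range(NUM_MICS))

voice_out = delay_and_sum(voice_channels, steering)
noise_out = delay_and_sum(noise_channels, steering)

print("voice RMS after beamforming:", round(rms(voice_out), 3))
print("noise RMS after beamforming:", round(rms(noise_out), 3))
```

In this sketch the talker's signal adds in phase across the microphones and keeps its level, while the off-axis tone adds out of phase and is suppressed.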
In another solution, a reference microphone is used to capture stationary noise, which is then subtracted from the main microphone signal. Stationary noise is typically sensed over a long period of time (e.g., 1-2 s) by averaging the ambient noise signal generated by the reference microphone. The stationary noise signal is then subtracted from the main microphone signal using digital processing techniques.
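As a rough illustration of the stationary noise subtraction described above, the following sketch tracks a long-term average of the reference microphone's frame power and subtracts it from the main microphone's frame power. It is a single-band simplification under stated assumptions: the 20 ms frame length, the 1.5 s averaging time constant, the signal levels, and the function names are hypothetical, and practical systems typically operate per frequency band rather than on one overall power value.

```python
import math

def frame_power(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)

def update_noise_estimate(previous, reference_power, frame_s=0.02, time_constant_s=1.5):
    """Exponentially average the reference power over roughly `time_constant_s` seconds."""
    smoothing = math.exp(-frame_s / time_constant_s)
    return smoothing * previous + (1.0 - smoothing) * reference_power

def subtract_noise(main_power, noise_power):
    """Remove the estimated noise power from the main power, floored at zero."""
    return max(main_power - noise_power, 0.0)

# Assumed steady hum at the reference microphone (power 0.04 in every frame).
reference_frame = [0.2 if n % 2 == 0 else -0.2 for n in range(320)]

noise_estimate = 0.0
for _ in range(500):  # 500 frames of 20 ms, i.e. 10 s of ambient noise
    noise_estimate = update_noise_estimate(noise_estimate, frame_power(reference_frame))

# Assumed main-microphone frame: voice power 0.25 plus the same hum,
# treating voice and noise as uncorrelated so that their powers add.
main_power = 0.25 + 0.04
cleaned_power = subtract_noise(main_power, noise_estimate)

print("estimated noise power:", round(noise_estimate, 4))
print("main power after subtraction:", round(cleaned_power, 4))
```

The long averaging window is what makes the estimate stable for constant noise; it is also why the estimate cannot follow noise that changes quickly.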
The above-mentioned techniques are used in several telephony applications for suppressing noise. One such application is a wireless headset, which, for example, uses a Bluetooth link to communicate with a communication device to provide hands free operation to the user. The wireless headset typically includes a microphone and a speaker that can be placed in close proximity with the user's mouth and ear respectively. The wireless headset can be affixed on or around an ear of the user so that the speaker is placed near the ear and the microphone extends to be close to the mouth. The wireless headset collects the user's voice with the microphone and wirelessly transmits the voice signal to the communication device, which, in turn, transmits the voice signal to the far end. Furthermore, the communication device receives voice signals from the far end and wirelessly transmits the far end voice signals to the headset, which, in turn, reproduces the voice signal through the speaker.
The wireless headset can include one or more additional microphones to provide noise suppression using the beamforming or stationary noise subtraction techniques described above. The noise suppression can be carried out at the headset itself or at the communication device. The additional microphone is typically permanently affixed to the headset, and is therefore at a fixed distance from the headset microphone.
However, the inventors recognize a few drawbacks with the above techniques. The beamforming technique is less effective when the number of microphones in the microphone array is reduced. Because of cost and space considerations, mobile phones and wireless handsets can include only a small number of microphones, typically only two. As a result, the directionality of beamforming suffers: the beam widens and admits sound sources over a larger angle. Consequently, the microphones pick up not only the speaker's voice signal but also other sound sources located around the speaker, many of them unwanted, and all of these are sent to the far end.
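The loss of directionality with fewer microphones can be illustrated by computing the width of the main lobe of an idealized delay-and-sum array. The sketch below assumes a uniform linear array steered to broadside, far field plane waves, a 4 cm spacing, a 2 kHz tone, and a speed of sound of 343 m/s; these values and the function names are assumptions for illustration only.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def array_gain(num_mics, spacing_m, freq_hz, source_deg):
    """Normalized delay-and-sum response of a uniform linear array steered to 0 degrees."""
    phase_step = (2.0 * math.pi * freq_hz * spacing_m
                  * math.sin(math.radians(source_deg)) / SPEED_OF_SOUND)
    total = sum(cmath.exp(1j * m * phase_step) for m in range(num_mics))
    return abs(total) / num_mics

def half_power_beamwidth_deg(num_mics, spacing_m=0.04, freq_hz=2000.0):
    """Full main-lobe width at the -3 dB points, scanned in 0.1 degree steps.

    Returns 180.0 if the response never falls below -3 dB, meaning the
    array accepts sound from the whole half-plane in front of it.
    """
    threshold = 1.0 / math.sqrt(2.0)
    for tenth in range(0, 901):
        angle = tenth / 10.0
        if array_gain(num_mics, spacing_m, freq_hz, angle) < threshold:
            return 2.0 * angle
    return 180.0

for count in (2, 4, 8):
    print(count, "microphones:", half_power_beamwidth_deg(count), "degrees")
```

Under these assumptions the two-microphone response never drops to the half-power level at any angle, whereas larger arrays produce a progressively narrower beam.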
Stationary noise cancellation techniques capture sound sources that are relatively constant over a long period of time. For example, repetitive sounds made by fans, machines, etc. can be effectively captured and subtracted using stationary noise sensing techniques. However, instantaneous noise, such as random ambient noise, people talking at a distance, background music, keyboard typing noise, etc., cannot be captured by stationary noise cancellation techniques. In some instances, the duration for which the reference microphone captures sound is reduced to allow capturing near instantaneous noise sources. However, even these techniques fail because the reference microphone signals, while including sounds from noise sources, also include the speaker's voice. Thus, when the reference microphone signals are subtracted from the main microphone signal, the subtraction can also remove some of the voice signal. Clearly, removing the signal of interest from the main microphone signal is undesirable.
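The voice leakage problem described above can be shown with a small idealized sketch. It assumes that the noise reaches both microphones identically and that a fixed fraction of the voice (the hypothetical `LEAK` value of 0.4) also reaches the reference microphone; the tone parameters are likewise illustrative.

```python
import math

def rms(signal):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))

LENGTH = 160
LEAK = 0.4  # assumed fraction of the voice picked up by the reference microphone

voice = [math.sin(2.0 * math.pi * n / 16.0) for n in range(LENGTH)]
noise = [0.5 * math.sin(2.0 * math.pi * n / 5.0) for n in range(LENGTH)]

# Main microphone hears the voice plus noise; the reference microphone hears
# the same noise plus a leaked, attenuated copy of the voice.
main_mic = [v + z for v, z in zip(voice, noise)]
reference_mic = [LEAK * v + z for v, z in zip(voice, noise)]

# Short-window subtraction of the reference signal from the main signal.
output = [m - r for m, r in zip(main_mic, reference_mic)]

voice_retained = rms(output) / rms(voice)
print("fraction of voice level retained:", round(voice_retained, 3))
```

In this idealized case the noise is removed entirely, but the voice level is also reduced by the leaked fraction, which is the undesirable side effect noted above.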
The following disclosure addresses these and other drawbacks with noise cancellation and suppression in telephony devices.