Any discussion of the background art throughout the specification should in no way be considered as an admission that such art is widely known or forms part of common general knowledge in the field.
In audio conferencing, the source of an audio signal is generally a microphone, which is sensitive to acoustic stimulus and to other mechanically coupled vibration. When capturing audio from participants in a room, the desired acoustic sources are generally relatively distant, for example more than 500 mm from the microphones. However, a compact device with a user interface, together with the possibility of other noise sources located closer to the device, creates the potential for very loud, unwanted signals to be detected by the microphone. Specifically, mechanical disturbances, such as the physical manipulation of one of the microphones or the operation of any user interface on an audio conference module, can give rise to associated acoustic noise or nuisance audio signals. Although the acoustic noise radiated into the room from such direct mechanical interference may be low and of no concern to those in the room, the interfering noise signals as detected and amplified by the microphone close to that disturbance can significantly affect far end conference participants. Undesired nearby acoustic interference or unwanted mechanical vibration will be picked up by the microphone as noise and may then be preferentially selected over the desired conversation audio.
Techniques are known for suppressing nuisance audio signals in systems of microphones positioned at known locations. However, to the inventors' knowledge, there is no known effective technique for suppressing the presence of nuisance audio on a single microphone module where the audio is not related to the output or echo of a known or well characterized stimulus. In particular, to the knowledge of the inventors, there is no work that attempts to operate adaptively to optimize the suppression and avoid the loss of desired signal when the interfering nuisance sound is caused by an independent acoustical or mechanical disturbance. This is a particularly important use case where a system is designed to work in proximity to people and activity which may be closer to the capturing device than the desired acoustical object. This is often the case, for example, with a conference phone placed on a meeting table, with people speaking from a range of distances and some people operating the device or working on the table near the device.
U.S. Pat. No. 8,867,757 B1 discloses a system including a first housing that houses a plurality of mechanical keys. A first microphone that is configured to detect a dynamic noise is located within the first housing and under the mechanical keys. A second microphone is configured to detect acoustic waves that include speech and to convert the acoustic waves into an electrical audio signal. The dynamic noise is not associated with the detected speech. The system further includes a dynamic audio signal filter that is configured to suppress, in the electrical audio signal, dynamic noise, and the dynamic audio signal filter is activated in response to the first microphone detecting the dynamic noise.
US 2010/145689 A1 discloses that an audio signal is received that might include keyboard noise and speech. The audio signal is digitized and transformed from a time domain to a frequency domain. The transformed audio is analyzed to determine whether there is a likelihood that keystroke noise is present. If it is determined that there is a high likelihood that the audio signal contains keystroke noise, a determination is made as to whether a keyboard event occurred around the time of the likely keystroke noise. If it is determined that a keyboard event occurred around the time of the likely keystroke noise, a determination is made as to whether speech is present in the audio signal around the time of the likely keystroke noise. If no speech is present, the keystroke noise is suppressed in the audio signal. If speech is detected in the audio signal or if the keystroke noise abates, the suppression gain is removed from the audio signal.
US 2004/213419 A1 discloses various embodiments to reduce noise within a particular environment, while isolating and capturing speech in a manner that allows operation within an otherwise noisy environment. In one embodiment, an array of one or more microphones is used to selectively eliminate noise emanating from known, generally fixed locations, and pass signals from a pre-specified region or regions with reduced distortion.
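By way of illustration only, the keystroke-noise decision sequence summarized above for US 2010/145689 A1 may be sketched as follows. The sketch is not taken from that publication: the names `FrameObservation` and `suppression_gain`, the per-frame structure, and the attenuation value of 0.1 are all hypothetical, and the three detectors (keystroke likelihood, keyboard event timing, speech presence) are assumed to have been evaluated elsewhere and are represented here simply as boolean flags.

```python
from dataclasses import dataclass


@dataclass
class FrameObservation:
    """Hypothetical per-frame detector outputs (assumed computed elsewhere)."""
    keystroke_noise_likely: bool  # frequency-domain analysis suggests keystroke noise
    keyboard_event_nearby: bool   # a keyboard event occurred around the same time
    speech_present: bool          # speech detected around the same time


def suppression_gain(obs: FrameObservation, attenuation: float = 0.1) -> float:
    """Return the gain to apply to the frame; 1.0 means no suppression.

    The attenuation value is an illustrative assumption only.
    """
    # No likely keystroke noise (or the noise has abated): no suppression.
    if not obs.keystroke_noise_likely:
        return 1.0
    # Likely noise, but no keyboard event to corroborate it: no suppression.
    if not obs.keyboard_event_nearby:
        return 1.0
    # Speech is present: suppression gain is removed to protect the speech.
    if obs.speech_present:
        return 1.0
    # Likely noise, corroborating keyboard event, no speech: suppress.
    return attenuation
```

In this sketch, suppression is applied only when all three conditions of the described sequence are met, and the gain reverts to unity as soon as speech is detected or the keystroke noise is no longer likely.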