1. Technical Field
The invention is related to finding the direction to a sound source in a prescribed search area using a beamsteering approach with a microphone array, and in particular, to a system and method that provides automatic beamforming design for any microphone array geometry and for any type of microphone.
2. Background Art
Localization of a sound source or direction within a prescribed region is an important element of many systems. For example, a number of conventional audio conferencing applications use microphone arrays with conventional sound source localization (SSL) to enable speech or sound originating from a particular point or direction to be effectively isolated and processed as desired.
Conventional microphone arrays typically include an arrangement of microphones in some predetermined layout. These microphones are generally used to simultaneously capture sound waves from various directions and originating from different points in space. Conventional techniques such as SSL are then used to process these signals for localizing the source of the sound waves and for reducing noise. One type of conventional SSL processing uses beamsteering techniques for finding the direction to a particular sound source. In other words, beamsteering techniques are used to combine the signals from all microphones in such a way as to make the microphone array act as a highly directional microphone, pointing a “listening beam” to the sound source. Sound capture is then attenuated for sounds coming from directions outside that beam. Such techniques allow the microphone array to suppress a portion of ambient noises and reverberated waves (generated by reflections of sound on walls and objects in the room), thereby providing a higher signal-to-noise ratio (SNR) for sound signals originating from within the target beam.
Beamsteering typically allows beams to be steered or targeted to provide sound capture within a desired spatial area or region, thereby improving the signal-to-noise ratio (SNR) of the sounds recorded from that region. Therefore, beamsteering plays an important role in spatial filtering, i.e., pointing a “beam” to the sound source and suppressing any noises coming from other directions. In some cases the direction to the sound source is used for speaker tracking and post-processing of recorded audio signals. In the context of a video conferencing system, speaker tracking is often used for dynamically directing a video camera toward the person speaking.
In general, as is well known to those skilled in the art, beamsteering involves the use of beamforming techniques for forming a set of beams designed to cover particular angular regions within a prescribed area. A beamformer is basically a spatial filter that operates on the output of an array of sensors, such as microphones, in order to enhance the amplitude of a coherent wavefront relative to background noise and directional interference. A set of signal processing operators (usually linear filters) is then applied to the signals from each sensor, and the outputs of those filters are combined to form beams, which are pointed, or steered, to reinforce inputs from particular angular regions and attenuate inputs from other angular regions.
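As a minimal illustrative sketch of the filter-and-combine structure described above, the following Python fragment implements the simplest such beamformer, a delay-and-sum design, at a single frequency. The uniform linear array, the far-field plane-wave model, the 343 m/s speed of sound, and all function names are assumptions made for illustration and are not taken from any particular conventional system.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def steering_vector(mic_x, angle_deg, freq_hz):
    """Relative phase seen at each microphone of a linear array for a
    far-field plane wave arriving from angle_deg (measured from the array axis)."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND  # wavenumber in rad/m
    c = math.cos(math.radians(angle_deg))
    return [cmath.exp(-1j * k * x * c) for x in mic_x]


def delay_and_sum_weights(mic_x, steer_deg, freq_hz):
    """One complex weight per microphone (a one-tap 'linear filter' at this
    frequency), chosen so that signals from steer_deg add in phase."""
    d = steering_vector(mic_x, steer_deg, freq_hz)
    return [di / len(mic_x) for di in d]


def beam_gain(weights, mic_x, angle_deg, freq_hz):
    """Magnitude of the combined output for a unit plane wave from angle_deg."""
    d = steering_vector(mic_x, angle_deg, freq_hz)
    return abs(sum(w.conjugate() * di for w, di in zip(weights, d)))


# Hypothetical four-microphone array with 5 cm spacing, steered to broadside.
mics = [0.00, 0.05, 0.10, 0.15]
w = delay_and_sum_weights(mics, steer_deg=90.0, freq_hz=1000.0)
on_beam = beam_gain(w, mics, 90.0, 1000.0)   # wave arriving from the steered direction
off_beam = beam_gain(w, mics, 0.0, 1000.0)   # wave arriving along the array axis
```

In this sketch the gain toward the steered direction is unity, while a wave arriving from outside the beam is attenuated because its per-microphone phases no longer cancel against the weights.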
The “pointing direction” of the steered beam is often referred to as the maximum or main response angle (MRA), and can be arbitrarily chosen for the beams. In other words, beamforming techniques are used to process the input from multiple sensors to create a set of steerable beams having a narrow angular response area in a desired direction (the MRA). Consequently, when a sound is received from within a given beam, the direction of that sound is known (i.e., SSL), and sounds emanating from other beams may be filtered or otherwise processed, as desired.
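The relationship between steered beams and sound source localization can be illustrated with a short sketch: a set of beams with different MRAs is formed, and the MRA of the beam with the greatest output power is taken as the estimated direction. The following Python fragment is one hypothetical way to do this at a single frequency; the linear array, the simulated noise-free source, the 10 degree beam grid, and the function names are all assumptions made for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed


def plane_wave_phasors(mic_x, angle_deg, freq_hz):
    """Complex amplitude at each microphone of a linear array for a unit
    far-field plane wave arriving from angle_deg."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    c = math.cos(math.radians(angle_deg))
    return [cmath.exp(-1j * k * x * c) for x in mic_x]


def localize_by_beam_scan(mic_signals, mic_x, freq_hz, candidate_mras):
    """Form one delay-and-sum beam per candidate MRA and return the MRA
    whose beam output has the highest power."""
    best_mra, best_power = None, -1.0
    for mra in candidate_mras:
        d = plane_wave_phasors(mic_x, mra, freq_hz)
        # Beam output: phase-align the microphone signals for this MRA and average.
        y = sum(di.conjugate() * xi for di, xi in zip(d, mic_signals)) / len(mic_x)
        power = abs(y) ** 2
        if power > best_power:
            best_mra, best_power = mra, power
    return best_mra


# Hypothetical scenario: a 2 kHz source at 60 degrees, observed without noise.
mics = [0.00, 0.05, 0.10, 0.15]
observed = plane_wave_phasors(mics, 60.0, 2000.0)
estimate = localize_by_beam_scan(observed, mics, 2000.0, range(0, 181, 10))
```

The angular resolution of such a scan is limited by the spacing of the candidate MRAs and by the width of each beam, and a linear array cannot distinguish directions that are mirror images about its axis.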
One class of conventional beamforming algorithms attempts to provide optimal noise suppression by finding parametric solutions for known microphone array geometries. Unfortunately, as a result of the high complexity, and thus large computational overhead, of such approaches, more emphasis has been given to finding near-optimal solutions, rather than optimal solutions. These approaches are often referred to as “fixed-beam formation.”
In general, with fixed-beam formation, the beam shapes do not adapt to changes in the surrounding noises and sound source positions. Further, the near-optimal solutions offered by such approaches tend to provide only near-optimal noise suppression for off-beam sounds or noise. Consequently, there is typically room for improvement in noise or sound suppression offered by such conventional beamforming techniques. Finally, such beamforming algorithms tend to be specifically adapted for use with particular microphone arrays. Consequently, a beamforming technique designed for one particular microphone array may not provide acceptable results when applied to another microphone array of a different geometry.
Other conventional beamforming techniques involve what is known as “adaptive beamforming.” Such techniques are capable of providing noise suppression based on little or no a priori knowledge of the microphone array geometry. Such algorithms adapt to changes in ambient or background noise and to the sound source position by attempting to converge upon an optimal solution as a function of time, thereby providing optimal noise suppression after convergence. Unfortunately, such techniques suffer from significant computational requirements and slow adaptation, which make them less robust across a wide variety of application scenarios.
Consequently, what is needed is a system and method for providing better optimized beamforming solutions for microphone arrays. Further, such a system and method should reduce computational overhead so that real-time beamforming can be realized. Finally, such a system and method should be applicable to microphone arrays of any geometry and including any type of microphone.