1. Technical Field
The invention relates to the calibration of microphone arrays, and more particularly to a system and process for self-calibrating a plurality of audio sensors of a microphone array on a continuous basis, while the array is in operation.
2. Background Art
With the burgeoning development of sound recognition software and real-time collaboration and communication programs, the ability to capture high quality sound is becoming more and more important. Using a close-up microphone, such as one installed on a headset, is not very convenient. In addition, hands-free sound capture with a single microphone is difficult due to interference from reflected sound waves. In some cases frequencies are enhanced, while in others they can be completely suppressed. One emerging technology used to effectively capture high quality sound is the microphone array. A microphone array is made up of a set of microphones positioned closely together, typically in a pattern such as a line or circle. In such an array, the audio signals are captured synchronously and processed together.
Localization of sound sources plays an important role in many audio systems having microphone arrays. For example, finding the direction to a sound source is used for speaker tracking and for post-processing of recorded audio signals. In the context of a videoconferencing system, speaker tracking is often used to direct a video camera toward the person speaking. Different techniques have been developed to perform this sound source localization (SSL). Many of these techniques are based on beamsteering.
The beamsteering approach is founded on a well-known procedure used to capture sound with microphone arrays—namely, beamforming. In general, beamforming is the ability to make the microphone array “listen” in a given direction and to suppress the sounds coming from other directions. Processes for sound source localization with beamsteering form a searching beam and scan the workspace by varying the direction in which the searching beam points. The energy of the signal coming from each direction is calculated. The decision as to the direction in which the sound source resides is based on which direction exhibits the maximal energy. This approach amounts to finding the extremum of a surface in a coordinate system of direction, elevation, and energy.
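By way of illustration, the beamsteering search described above can be sketched as follows for a simplified far-field, linear-array model. The function names and parameters are illustrative only and are not part of any particular implementation:

```python
import numpy as np

def steered_energy(signals, mic_x, angle, fs, c=343.0):
    """Energy of a delay-and-sum searching beam steered toward `angle`.

    signals: (num_mics, num_samples) synchronously captured channels
    mic_x:   microphone positions along a linear array, in meters
    angle:   steering angle in radians (0 = broadside)
    """
    # Far-field steering delays for a linear array: tau_m = x_m*sin(angle)/c
    shifts = np.round(mic_x * np.sin(angle) / c * fs).astype(int)
    # Compensate each channel's delay and average (delay-and-sum)
    beam = np.mean([np.roll(s, -k) for s, k in zip(signals, shifts)], axis=0)
    return float(np.sum(beam ** 2))

def localize(signals, mic_x, fs, angles):
    """Scan the candidate directions; the sound source is declared to lie
    in the direction exhibiting the maximal beam energy."""
    energies = [steered_energy(signals, mic_x, a, fs) for a in angles]
    return angles[int(np.argmax(energies))]
```

Scanning a grid of candidate angles in this way traces out the energy surface whose extremum indicates the source direction.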
However, in many cases microphone arrays used for beamforming or sound source localization do not provide the expected beam shape, noise suppression, or localization precision. One of the reasons for this is the difference in the signal paths caused by differing sensitivity characteristics among the microphones and/or microphone preamplifiers that make up the array. Still further, existing beamsteering and beamforming procedures used for processing signals from microphone arrays assume that the channels match. This is problematic, as even a basic algorithm such as the delay-and-sum procedure is sensitive to mismatches in the receiving channels. More sophisticated beamforming algorithms are even more susceptible and often require very precise matching of the impulse response of the microphone-preamplifier-ADC (analog-to-digital converter) combination across all channels.
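The sensitivity of even the simple delay-and-sum procedure to channel mismatch can be seen in a toy numerical example (the values are hypothetical): consider a two-channel case where, after steering, an interfering signal arrives in opposite phase at the two microphones, so that matched channels cancel it exactly.

```python
import numpy as np

def delay_and_sum(channels, gains=None):
    """Average the (already delay-compensated) channels; the optional
    per-channel `gains` model receiver-chain mismatch."""
    channels = np.asarray(channels, dtype=float)
    if gains is not None:
        channels = channels * np.asarray(gains)[:, None]
    return channels.mean(axis=0)

t = np.arange(1024) / 16000.0
interferer = np.sin(2 * np.pi * 500 * t)
# After steering, the interferer arrives in opposite phase at the two
# microphones, so perfectly matched channels cancel it exactly.
channels = np.stack([interferer, -interferer])

matched = delay_and_sum(channels)                       # ideal channels
mismatched = delay_and_sum(channels, gains=[1.0, 0.8])  # 20% gain error

print(np.max(np.abs(matched)))     # 0.0 — interferer fully suppressed
print(np.max(np.abs(mismatched)))  # ~0.1 — residual leaks through
```

A mere 20% gain error in one channel thus turns a perfect null into a clearly audible residual, which is why channel matching matters even for this basic algorithm.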
The problem is that without careful calibration, a mismatch in the microphone array audio channels is hard to avoid. The reasons for the channel mismatch are mostly attributable to looseness in the manufacturing tolerances associated with microphones—even when they are of the same type. The looseness in the tolerances associated with components used in the microphone array preamplifiers introduces gain and phase errors as well. In addition, microphone and preamplifier parameters depend on external factors such as temperature, atmospheric pressure, the power supply, and so on. Thus, the degree to which the channels of a microphone array match can vary as these external factors change.
The calibration of microphones and microphone arrays is well known and well studied. Generally, current calibration procedures can be an expensive and difficult task, particularly for broadband arrays. Examples of some of the existing approaches to calibrate microphones in a microphone array include the following.
In one group of calibration techniques, calibration is done for each microphone separately by comparing it with an etalon (reference) microphone in a specialized environment: e.g., an acoustic tube, a standing wave tube, an anechoic chamber, and so on [3]. This approach is very expensive, as it requires manual calibration of each microphone, as well as specialized equipment to accomplish this task. As such, this calibration approach is usually reserved for situations calling for microphones used to take precise acoustic measurements.
Another group of existing calibration methods generally employs calibration signals (e.g., speech, sinusoidal, white noise, acoustic pulses, and chirp signals, to name a few) sent from speaker(s) or other sound source(s) having known locations [4]. In reference [7], far-field white noise is used to calibrate a microphone array of two microphones, where the filter parameters are calculated using a normalized least-mean-squares (NLMS) algorithm. Other works suggest using optimization methods to find the microphone array parameters. For example, in reference [5] the minimization criterion is the speech recognition error. Generally, the methods of this group require manual calibration after installation of the microphone array, as well as specialized equipment to generate test sounds. Thus, they too can be time consuming and expensive to accomplish. In addition, as these calibration methods are performed ahead of time, they will not remain valid in the face of changes in the equipment and environmental conditions during operation.
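The NLMS-based matching idea can be reduced to its essentials as follows: a short FIR filter is adapted so that one channel, after filtering, reproduces a reference channel excited by the same white noise. This is an illustrative simplification rather than the exact algorithm of reference [7]; the function name and parameter values are hypothetical.

```python
import numpy as np

def nlms_match(reference, channel, num_taps=8, mu=0.5, eps=1e-8):
    """Adapt an FIR filter w so that filtering `channel` with w
    reproduces `reference` (standard NLMS update rule)."""
    w = np.zeros(num_taps)
    for n in range(num_taps - 1, len(channel)):
        x = channel[n - num_taps + 1:n + 1][::-1]  # newest sample first
        e = reference[n] - w @ x                   # a-priori matching error
        w += mu * e * x / (x @ x + eps)            # normalized step
    return w
```

For example, if one channel is simply attenuated relative to the reference, the adapted filter converges to the compensating gain in its first tap.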
Yet another group of calibration methods involves building algorithms for beamforming and sound source localization that are robust to channel mismatch, thereby avoiding the need for calibration. However, it has been found that in operation the performance of most of these adaptive schemes hinges on an initial high-precision match of the array channels to provide a good starting point for the adaptation process [5]. This demands a careful calibration of the array elements prior to their use.
The last group of methods comprises the self-calibration algorithms. The general approach is described in [1]: i.e., find the direction of arrival (DOA) of a sound source assuming that the microphone array parameters are correct, use the DOA to estimate the microphone array parameters, and iterate until the estimates converge. Different methods attempt to estimate different microphone array parameters, such as the sensor positions, gains, or phase shifts. In addition, different techniques are employed to perform the estimation, ranging from normalized mean square error minimization to complex matrix methods [2] and higher-order statistical parameter estimation methods [6]. In some cases the complexity of the estimation algorithms makes them unsuitable for practical real-time implementation, as they require an excessive amount of CPU power during the normal operation of the microphone array.
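The iterate-until-convergence idea can be sketched for a toy two-microphone case with a single far-field source and an unknown relative channel gain. All names, and the simple cross-correlation DOA and RMS-based gain estimates, are illustrative assumptions and not the method of reference [1]:

```python
import numpy as np

def self_calibrate(ch0, ch1, fs, dx, c=343.0, iters=3):
    """Alternate DOA estimation and channel-gain estimation.

    ch0, ch1: the two channel signals; dx: microphone spacing in meters.
    Returns the estimated DOA (radians) and the gain of ch1 relative to ch0.
    """
    gain, theta, n = 1.0, 0.0, len(ch0)
    for _ in range(iters):
        # Step 1: assume the gain estimate is correct; estimate the DOA
        # from the cross-correlation lag between the corrected channels.
        lag = int(np.argmax(np.correlate(ch0, ch1 / gain, "full"))) - (n - 1)
        theta = np.arcsin(np.clip(-lag * c / (fs * dx), -1.0, 1.0))
        # Step 2: assume the DOA (hence the lag) is correct; align the
        # channels and re-estimate the relative gain from their RMS ratio.
        aligned = np.roll(ch1, lag)
        gain = float(np.sqrt(np.mean(aligned ** 2) / np.mean(ch0 ** 2)))
    return theta, gain
```

Even this toy version shows the structure of the approach: each parameter is estimated while the others are held fixed, and the loop repeats until the estimates stop changing.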
It is noted that in the preceding paragraphs the description refers to various individual publications identified by a numeric designator contained within a pair of brackets. For example, such a reference may be identified by reciting, “reference [1]” or simply “[1]”. A listing of references including the publications corresponding to each designator can be found at the end of the Detailed Description section.