Typical audio conferencing systems capture the audio of a meeting with an array of microphones that are synchronized and fixed in location relative to one another. In this configuration, sound source localization (SSL) techniques can readily be used to determine the location of a person speaking. Once the sound source is localized, beamforming can be used to output higher-quality audio than a single microphone would provide. Additionally, if a camera is associated with the microphone array, the speaker's video can be displayed in conjunction with the captured audio.
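As an illustration of why a fixed, synchronized array makes localization and beamforming straightforward, the following is a minimal sketch and not the method of any particular conferencing system. It assumes two synchronized microphones, integer-sample delays, and a far-field source. The function names (`estimate_delay_samples`, `tdoa_to_angle`, `delay_and_sum`), the sample rate, and the microphone spacing are all hypothetical choices made for the example.

```python
import numpy as np

def estimate_delay_samples(ref, other):
    """Return the lag (in samples) by which `other` trails `ref`,
    taken as the peak of their cross-correlation."""
    corr = np.correlate(other, ref, mode="full")
    lags = np.arange(-(len(ref) - 1), len(other))
    return int(lags[np.argmax(corr)])

def tdoa_to_angle(delay_samples, sample_rate_hz, spacing_m, speed_of_sound=343.0):
    """Far-field arrival angle in degrees for a two-microphone pair.
    Assumes a known spacing, which only a fixed array provides."""
    ratio = speed_of_sound * delay_samples / (sample_rate_hz * spacing_m)
    return float(np.degrees(np.arcsin(np.clip(ratio, -1.0, 1.0))))

def delay_and_sum(signals, delays):
    """Align each channel by its integer delay and average the result.
    np.roll wraps samples around the ends, which is acceptable for a sketch."""
    aligned = [np.roll(s, -d) for s, d in zip(signals, delays)]
    return np.mean(aligned, axis=0)

# Synthetic example: the same source reaches microphone B 7 samples after A.
rng = np.random.default_rng(0)
source = rng.standard_normal(2000)
mic_a = source + 0.5 * rng.standard_normal(2000)
mic_b = np.roll(source, 7) + 0.5 * rng.standard_normal(2000)

delay = estimate_delay_samples(mic_a, mic_b)
angle = tdoa_to_angle(delay, sample_rate_hz=16000, spacing_m=0.2)
beamformed = delay_and_sum([mic_a, mic_b], [0, delay])
```

Averaging the aligned channels keeps the source at full strength while the independent noise in each channel partially cancels, which is why the beamformed output can be cleaner than either microphone alone.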
Often, however, the locations of the microphones in a meeting room are neither fixed nor known. For example, meeting participants bring laptops or other computing devices with built-in microphones to a meeting. These devices are usually wireless-network enabled, so they can form an ad hoc network. Compared to traditional microphone array devices, these ad hoc microphone arrays are spatially distributed, and the microphones are in general closer to the meeting participants. Thus, higher audio quality can be expected when capturing audio from a speaker (e.g., a person talking), assuming that the microphones in the mobile computing devices and those in the fixed array devices are of the same quality. On the other hand, microphones in an ad hoc arrangement present many challenges. For example, the microphones are not synchronized, and the locations of the microphones and their associated computing devices, such as laptop computers, are unknown. Additionally, the microphones have different and unknown gains, and their quality differs (i.e., they have different signal-to-noise ratios). These factors make it difficult to capture a high-quality audio recording of a meeting.
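The effect of these unknowns can be seen in a small simulation. The sketch below is illustrative only: the device model `simulate_device`, its parameters, and all numeric values are assumptions made for the example and do not describe any real device. It shows that when two devices start recording at different, unknown times, the lag measured between their signals mixes the acoustic delay with the start offset, and that the observed level difference mixes distance with the unknown gain.

```python
import numpy as np

def measured_lag(ref, other):
    """Lag (in samples) at the peak of the cross-correlation of two recordings."""
    corr = np.correlate(other, ref, mode="full")
    lags = np.arange(-(len(ref) - 1), len(other))
    return int(lags[np.argmax(corr)])

def simulate_device(source, acoustic_delay, start_offset, gain, noise_std, rng):
    """Hypothetical device model: propagation delay, an unknown recording
    start offset, an unknown gain, and device-specific noise."""
    shifted = np.roll(source, acoustic_delay + start_offset)
    return gain * shifted + noise_std * rng.standard_normal(len(source))

rng = np.random.default_rng(1)
source = rng.standard_normal(4000)

# Laptop A: close to the speaker, good microphone.
laptop_a = simulate_device(source, acoustic_delay=3, start_offset=0,
                           gain=1.0, noise_std=0.2, rng=rng)
# Laptop B: farther away, started recording 120 samples out of step,
# with a lower gain and a noisier microphone.
laptop_b = simulate_device(source, acoustic_delay=9, start_offset=120,
                           gain=0.4, noise_std=0.6, rng=rng)

# The measured lag is the 6-sample acoustic difference plus the 120-sample
# start offset; the two cannot be separated from this measurement alone.
lag = measured_lag(laptop_a, laptop_b)

# The level ratio reflects gain and noise differences, not distance alone.
level_ratio = float(np.std(laptop_b) / np.std(laptop_a))
```

With a fixed, synchronized array the start offset is zero and the gains are calibrated, so the same measurements could be interpreted directly; in the ad hoc case they cannot be without additional information.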