Over the past few years there have been several publications that deal with the use of spherical microphone arrays. Such arrays are seen by some researchers as a means to capture a representation of the sound field in the vicinity of the array, and by others as a means to digitally beamform sound from different directions using the array with a relatively high order beampattern, or for nearby sources. Variations to the usual solid spherical arrays have been suggested, including hemispherical arrays, open arrays, concentric arrays and others.
A particularly exciting use of these arrays is to steer it to various directions and create an intensity map of the acoustic power in various frequency bands via beamforming. The resulting image, since it is linked with direction can be used to identify source location (direction), be related with physical objects in the world and identify sources of sound, and be used in several applications. This brings up the exciting possibility of creating a “sound camera.”
To be useful, two difficulties must be overcome. The first, is that the beamforming requires the weighted sum of the Fourier coefficients of all the microphone signals, and multichannel sound capture, and it has been difficult to achieve frame-rate performance, as would be desirable in applications such as videoconferencing, noise detection, etc. Second, while qualitative identification of sound sources with real-world objects (speaking humans, noisy machines, gunshots) can be done via a human observer who has knowledge of the environment geometry, for precision and automation the sound images must be captured in conjunction with video, and the two must be automatically analyzed to determine correspondence and identification of the sound sources. For this a formulation for the geometrically correct warping of the two images, taken from an array and cameras at different locations is necessary.