Individual microphone elements designed for far field audio use can be characterized, in part, by their pickup pattern. The pickup pattern describes the ability of a microphone to reject noise and indirect reflected sound arriving at the microphone from undesired directions. The most popular microphone pickup pattern for use in audio conferencing applications is the cardioid pattern. Other patterns include supercardioid, hypercardioid, and bidirectional.
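The pickup patterns named above are commonly modeled as first-order patterns of the form gain(θ) = a + (1 − a)·cos θ, where θ is the angle away from the front of the microphone. The following is a minimal sketch under that assumption; the coefficient values are typical textbook figures, and the function and dictionary names are illustrative only.

```python
import math

# Assumed first-order model: gain(theta) = a + (1 - a) * cos(theta).
# The coefficients below are typical textbook values, not taken from this text.
PATTERN_COEFFICIENTS = {
    "cardioid": 0.5,
    "supercardioid": 0.37,
    "hypercardioid": 0.25,
    "bidirectional": 0.0,
}


def pattern_gain(a, theta_deg):
    """Linear (signed) gain of a first-order pattern at theta_deg off axis."""
    return a + (1.0 - a) * math.cos(math.radians(theta_deg))


def pattern_gain_db(a, theta_deg, floor_db=-120.0):
    """Gain in dB relative to on-axis, clamped at floor_db inside a null."""
    magnitude = abs(pattern_gain(a, theta_deg))
    if magnitude < 10.0 ** (floor_db / 20.0):
        return floor_db
    return 20.0 * math.log10(magnitude)


if __name__ == "__main__":
    for name, a in PATTERN_COEFFICIENTS.items():
        row = ", ".join(
            f"{angle:3d} deg: {pattern_gain_db(a, angle):7.1f} dB"
            for angle in (0, 90, 180)
        )
        print(f"{name:14s} {row}")
```

Under this model the cardioid has its null directly behind the microphone (180 degrees), while the bidirectional pattern has its nulls at the sides (90 degrees).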
In a beamforming microphone array designed for far field use, a designer chooses the spacing between microphones to enable spatial sampling of a traveling acoustic wave. Signals from the array of microphones are combined using various algorithms to form a desired pickup pattern. If enough microphones are used in the array, the pickup pattern may yield improved attenuation of undesired signals that propagate from directions other than the “direction of look” of a particular beam in the array.
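One of the simplest combining algorithms is delay-and-sum beamforming, shown here only as an illustration of how microphone signals can be combined to form a pickup pattern; the text does not state which algorithm any particular array uses. The sketch assumes a uniform linear array, a far field plane wave, angles measured from broadside, and a speed of sound of 340 meters per second. All names and parameter values are hypothetical.

```python
import cmath
import math


def delay_and_sum_response(num_mics, spacing_m, freq_hz,
                           look_deg, arrival_deg, c=340.0):
    """Normalized magnitude response of a uniform linear delay-and-sum array.

    Assumptions (not from the source text): far field plane wave, equally
    spaced microphones on a line, angles measured from broadside.
    """
    wavenumber = 2.0 * math.pi * freq_hz / c
    sin_arrival = math.sin(math.radians(arrival_deg))
    sin_look = math.sin(math.radians(look_deg))
    total = 0j
    for n in range(num_mics):
        position_m = n * spacing_m
        # Phase left over after the steering delay for the look direction
        # has been removed from the phase of the arriving wave.
        residual_phase = wavenumber * position_m * (sin_arrival - sin_look)
        total += cmath.exp(1j * residual_phase)
    return abs(total) / num_mics


if __name__ == "__main__":
    # Hypothetical array: 8 microphones, 4 cm apart, evaluated at 2 kHz.
    for arrival in (0, 15, 30, 60, 90):
        gain = delay_and_sum_response(8, 0.04, 2000.0, 0.0, arrival)
        print(f"arrival {arrival:2d} deg: relative gain {gain:.3f}")
```

A wave arriving from the direction of look adds coherently and yields a relative gain of 1, while waves from other directions add with mismatched phases and are attenuated.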
For use cases in which a beamformer is used for room audio conferencing, audio streaming, audio recording, and audio used with video conferencing products, it is desirable for the beamforming microphone array to capture audio containing frequency information that spans the full range of human hearing. This is generally accepted to be 20 Hz to 20 kHz.
Some beamforming microphone arrays are designed for “close talking” applications, like a mobile phone handset. In these applications, the microphone elements in the beamforming array are positioned within a few centimeters, to less than one meter, of the talker's mouth during active use. The main design objective of close talking microphone arrays is to maximize the quality of the speech signal picked up from the direction of the talker's mouth while attenuating sounds arriving from all other directions. Close talking microphone arrays are generally designed so that their pickup pattern is optimized for a single fixed direction.
Problems with the Prior Art
It is well known by those of ordinary skill in the art that the closest spacing between microphones restricts the highest frequency that can be resolved by the array, and the largest spacing between microphones restricts the lowest frequency that can be resolved. At a given temperature and pressure in air, the relationship between the speed of sound, its frequency, and its wavelength is c = λν, where c is the speed of sound, λ is the wavelength of the sound, and ν is the frequency of the sound.
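The relationship between speed, frequency, and wavelength given above can be turned into a short calculation. The sketch below also applies the common half-wavelength rule of thumb (closest spacing no greater than half the shortest wavelength) to estimate the highest frequency a given spacing can resolve without spatial aliasing; that rule is an assumption introduced here for illustration and is not stated in the text. The function names and the rounded 340 meters per second default are likewise illustrative.

```python
SPEED_OF_SOUND_M_S = 340.0  # rounded value, assumed for illustration


def wavelength_m(freq_hz, c=SPEED_OF_SOUND_M_S):
    """Wavelength in meters: speed of sound divided by frequency."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return c / freq_hz


def max_unaliased_freq_hz(min_spacing_m, c=SPEED_OF_SOUND_M_S):
    """Highest frequency for which min_spacing_m is at most half a wavelength.

    Assumption: the half-wavelength spatial sampling rule of thumb.
    """
    if min_spacing_m <= 0:
        raise ValueError("spacing must be positive")
    return c / (2.0 * min_spacing_m)


if __name__ == "__main__":
    for freq in (20.0, 200.0, 2000.0, 20000.0):
        print(f"{freq:8.0f} Hz -> wavelength {wavelength_m(freq):8.4f} m")
    for spacing in (0.0085, 0.02, 0.05):
        limit = max_unaliased_freq_hz(spacing)
        print(f"spacing {spacing * 100:5.2f} cm -> about {limit:8.0f} Hz")
```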
For professionally installed conferencing applications, it is desirable for a microphone array to have the ability to capture and transmit audio throughout the full range of human hearing, which is generally accepted to be 20 Hz to 20 kHz. The low frequency design requirement presents problems due to the physical relationship between the frequency of sound and its wavelength given by the simple equation in the previous paragraph. For example, at 20 degrees Celsius (68 degrees Fahrenheit) at sea level, the speed of sound in dry air is approximately 340 meters per second. In order to perform beamforming down to 20 Hz, the elements of a beamforming microphone array would need to be 340/20 = 17 meters (55.8 feet) apart. A beamforming microphone array this long would be difficult to manufacture, transport, install, and service. It would also not be practical in most conference rooms used in normal day-to-day business meetings in corporations around the globe.
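The arithmetic in the preceding paragraph can be reproduced as follows. The sketch also includes a widely used approximation for the speed of sound in dry air as a function of temperature, which gives roughly 343 meters per second at 20 degrees Celsius; the 340 figure used in the text is a rounded value. The function names are illustrative.

```python
import math

METERS_PER_FOOT = 0.3048


def speed_of_sound_dry_air_m_s(temp_celsius):
    """Common approximation for dry air: 331.3 * sqrt(1 + T / 273.15)."""
    return 331.3 * math.sqrt(1.0 + temp_celsius / 273.15)


def spacing_for_frequency_m(freq_hz, c):
    """Element spacing equal to one wavelength at freq_hz, as in the text."""
    return c / freq_hz


def meters_to_feet(meters):
    return meters / METERS_PER_FOOT


if __name__ == "__main__":
    rounded = spacing_for_frequency_m(20.0, 340.0)
    print(f"340 m/s at 20 Hz: {rounded:.1f} m ({meters_to_feet(rounded):.1f} ft)")

    c_20 = speed_of_sound_dry_air_m_s(20.0)
    refined = spacing_for_frequency_m(20.0, c_20)
    print(f"{c_20:.1f} m/s at 20 Hz: {refined:.2f} m ({meters_to_feet(refined):.1f} ft)")
```

Either value of the speed of sound leads to the same conclusion: a spacing on the order of 17 meters is far larger than a typical conference room installation can accommodate.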
The high frequency requirement for professionally installed applications also presents a problem. Performing beamforming for full bandwidth audio may require significant computing resources including memory and CPU cycles, translating directly into greater cost.
It is also generally known to those of ordinary skill in the art that in most conference rooms, low frequency sound reverberates more than high frequency sound. One well-known acoustic property of a room is the time it takes the power of a sound impulse to be attenuated by 60 decibels (dB) due to absorption of the sound pressure wave by materials and objects in the room. This property is called RT60 and is measured as an average across all frequencies. Rather than measuring the time it takes an impulsive sound to be attenuated, the attenuation time at individual frequencies can be measured. When this is done, it is observed that in most conference rooms, lower frequencies (up to around 4 kHz) require a longer time to be attenuated by 60 dB than higher frequencies (between around 4 kHz and 20 kHz).
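As an illustration of how an attenuation time of this kind might be estimated for one frequency band, the sketch below fits a straight line to a measured decay curve (level in dB against time) and extrapolates to a 60 dB drop. This is one simple approach and is not described in the text; the function name and the synthetic decay data are hypothetical and do not represent measurements of any room.

```python
def rt60_from_decay(times_s, levels_db):
    """Estimate the 60 dB decay time from a least-squares line fit.

    times_s:   sample times in seconds
    levels_db: sound level in dB at each time, for one frequency band
    """
    if len(times_s) != len(levels_db) or len(times_s) < 2:
        raise ValueError("need at least two matching samples")
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_l = sum(levels_db) / n
    numerator = sum((t - mean_t) * (l - mean_l)
                    for t, l in zip(times_s, levels_db))
    denominator = sum((t - mean_t) ** 2 for t in times_s)
    if denominator == 0:
        raise ValueError("sample times must not all be equal")
    slope_db_per_s = numerator / denominator
    if slope_db_per_s >= 0:
        raise ValueError("levels do not decay over time")
    return -60.0 / slope_db_per_s


if __name__ == "__main__":
    # Synthetic, hypothetical decay curves for two bands (not measured data).
    times = [i * 0.01 for i in range(50)]
    low_band = [-60.0 * t / 0.8 for t in times]   # decays 60 dB in 0.8 s
    high_band = [-60.0 * t / 0.4 for t in times]  # decays 60 dB in 0.4 s
    print(f"low band:  {rt60_from_decay(times, low_band):.2f} s")
    print(f"high band: {rt60_from_decay(times, high_band):.2f} s")
```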