Speech enhancement using microphone arrays is a known in the art technique, in which the microphones are typically arranged in a line for synchronizing the delays thereof according to distance of each microphone from the speaker, such as shown in FIGS. 1-2. In these techniques the output of the microphones is delayed in a controllable manner to allow synchronizing the speaker's speech and eliminating other noise related signals. These techniques require the microphones to be substantially separated from one another i.e. forming a large distance from one another or the delaying is insignificant and cannot be used for speech enhancement.
The formula for a homogenous linear array beam pattern is:
      D    ⁡          (      θ      )        =            sin      ⁡              [                  N          ⁢                                          ⁢          π          ⁢                                          ⁢                                    cos              ⁡                              (                θ                )                                      ·                          λ                              -                1                                              ⁢          d                ]                    sin      ⁡              [                  π          ⁢                                          ⁢                                    cos              ⁡                              (                θ                )                                      ·                          λ                              -                1                                              ⁢          d                ]            and the response function (attenuation in dB) is given in the graph shown in FIG. 2.
Affes et al. (1997) teaches a signal subspace tracking algorithm for microphone array speech processing for enhancing speech in adverse acoustic environments. This algorithm proposes a method of adaptive microphone array beamforming using matched filters with signal subspace tracking for enhancement of near field speech signals by the reduction of multipath and reverberations. This method is mainly targeted at reducing the reflections and reverberations of sound sources that do not propagate along direct paths such as in cases of microphones of hand held mobile devices. The setup that was used in this work by Affes et al. (1997) is discussed at Sec. II.A. Twelve microphones were positioned on the screen of a computer workstation, with spacing of 7 cm between each pair.
Jan et al (1996) teaches microphone arrays and signal processing for high quality sound capture in noisy reverberant enclosures that incorporates matched filtering of individual sensors and parallel processing for providing spatial volume selectivity that mitigates noise interference and multipath distortion. This technique uses randomly distributed transducers.
Capon (1969) teaches a high-resolution frequency-wavenumber spectrum analysis, which is referred to as the minimum variance distortionless response (MVDR) beamformer. This well-known algorithm is used to minimize the noise received by a sensor array, while preserving the desired source without distortion.
U.S. Pat. No. 7,809,145 teaches methods and apparatus for signal processing. A discrete time domain input signal xm(t) is produced from an array of microphones M0 . . . MM. A listening direction may be determined for the microphone array. The listening direction is used in a semi-blind source separation to select the finite impulse response filter coefficients b0, b1 . . . , bN to separate out different sound sources from input signal xm(t). One or more fractional delays may optionally be applied to selected input signals xm(t) other than an input signal x0(t) from a reference microphone M0.
U.S. Pat. No. 8,204,247 teaches an audio system generates position-independent auditory scenes using harmonic expansions based on the audio signals generated by a microphone array. Audio sensors are mounted on the surface of a sphere. The number and location of the audio sensors on the sphere are designed to enable the audio signals generated by those sensors to be decomposed into a set of eigenbeam outputs. Compensation data corresponding to at least one of the estimated distance and the estimated orientation of the sound source relative to the array are generated from eigenbeam outputs and used to generate an auditory scene. Compensation based on estimated orientation involves steering a beam formed from the eigenbeam outputs in the estimated direction of the sound source to increase direction independence, while compensation based on estimated distance involves frequency compensation of the steered beam to increase distance independence.
U.S. Pat. No. 8,005,237 teaches beamforming post-processor technique with enhanced noise suppression capability. The beam forming post-processor technique is a non-linear post-processing technique for sensor arrays (e.g., microphone arrays) which improves the directivity and signal separation capabilities. The technique works in so-called instantaneous direction of arrival space, estimates the probability for sound coming from a given incident angle or look-up direction and applies a time-varying, gain based, spatio-temporal filter for suppressing sounds coming from directions other than the sound source direction resulting in minimal artifacts and musical noise.