Beam-forming technologies, as employed in the audio field, for example, define how the signals of a transducer array are to be processed: in the case of a microphone array, how the individual microphone signals are evaluated, and in the case of a loudspeaker array, how the signals of the individual loudspeakers are reproduced. In both cases, each signal is subjected to individual filtering by a respective time-discrete filter. For broadband applications such as music, the coefficients of said time-discrete filters are determined from the specification of the optimum frequency responses.
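The individual filtering described above may be illustrated, for the microphone case, by a minimal filter-and-sum sketch. The function name `filter_and_sum`, the two-channel test signals and the FIR coefficients are hypothetical and serve only as an illustration.

```python
import numpy as np

def filter_and_sum(signals, fir_coeffs):
    """Filter each channel with its own FIR filter and sum the results.

    signals:    array of shape (num_channels, num_samples)
    fir_coeffs: array of shape (num_channels, num_taps)
    """
    signals = np.asarray(signals, dtype=float)
    fir_coeffs = np.asarray(fir_coeffs, dtype=float)
    num_samples = signals.shape[1]
    num_taps = fir_coeffs.shape[1]
    out = np.zeros(num_samples + num_taps - 1)
    for x, h in zip(signals, fir_coeffs):
        out += np.convolve(x, h)  # full linear convolution per channel
    return out

# Hypothetical example: the same impulse arrives one sample later on channel 1.
signals = np.array([[1.0, 0.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0, 0.0]])
# Channel 0 is delayed by one sample, channel 1 is passed undelayed,
# so that both contributions line up before summation.
fir_coeffs = np.array([[0.0, 0.5],
                       [0.5, 0.0]])
y = filter_and_sum(signals, fir_coeffs)
```

For a loudspeaker array, the structure is mirrored: a single input signal would be filtered by each FIR filter in turn to obtain the individual loudspeaker signals.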
Literature on beam-forming and signal driving almost exclusively deals with the design of the driving weights within the frequency domain. In this context, one implicitly assumes that the FIR filters within the time domain are determined by an inverse discrete Fourier transform (IDFT), typically computed by means of the fast Fourier transform (FFT). This approach may be interpreted as frequency sampling design [Smi11, Lyo11], a very simple filter design method having various disadvantages. The frequency response of the filters has to be specified, within an equidistant raster, over the entire time-discrete frequency axis up to the sampling frequency. If no sensible specification can be provided for individual frequency ranges (e.g., very low frequencies at which no satisfactory directional efficiency is possible, or high frequencies at which the emission cannot be influenced in a targeted manner due to spatial aliasing), there is a risk that the resulting FIR filters cannot be used (e.g., because of excessive gain values at specific frequencies caused by heavy fluctuations between the frequency sampling points).
The resulting FIR filters accurately map the defined frequency response at the points of the frequency raster given by the DFT; however, the frequency response may take arbitrary values between the raster points. This frequently leads to impracticable designs exhibiting intense oscillations of the resulting frequency response.
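The behavior between the raster points may be illustrated by a minimal sketch. It assumes a hypothetical zero-phase low-pass specification on a raster of 16 points; the raster size, the pass band and the oversampling factor are chosen for illustration only.

```python
import numpy as np

N = 16                        # number of raster points (also the FIR length)
H_spec = np.zeros(N)
H_spec[:4] = 1.0              # pass band: bins 0..3
H_spec[-3:] = 1.0             # mirrored bins 13..15 so that the filter is real

# Frequency sampling design: FIR coefficients via the inverse DFT.
h = np.fft.ifft(H_spec).real

# Evaluate the resulting frequency response on a raster that is M times finer.
M = 8
H_dense = np.fft.fft(h, n=N * M)

# On the original raster the specification is met up to rounding errors.
on_raster_error = np.max(np.abs(H_dense[::M] - H_spec))

# Magnitude halfway between the pass-band raster points (bins 0.5, 1.5, 2.5).
passband_between = np.abs(H_dense[M // 2 : 3 * M : M])

print("max error on raster:", on_raster_error)
print("pass-band magnitude between raster points:", passband_between)
```

In this sketch the specification is reproduced on the raster, while the magnitude halfway between two pass-band points deviates clearly from the specified value of 1.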
In addition, in the frequency sampling design, the length of the FIR filter automatically results from the resolution of the defined frequency response (and vice versa).
Filters created by means of frequency sampling design are prone to time-domain aliasing, i.e., to periodic convolution of the impulse responses (e.g., [Smi11]). To counteract this, additional techniques such as zero-padding of the DFTs or windowing of the generated FIR filters may be used.
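Time-domain aliasing may be illustrated by a minimal sketch. It assumes a hypothetical specification whose underlying impulse response, a decaying exponential with factor `a`, is longer than the DFT length, so that the tails fold back into the designed filter.

```python
import numpy as np

a = 0.9  # hypothetical decay factor; true impulse response is a**n for n >= 0

def sampled_design(num_points):
    """Sample the frequency response 1 / (1 - a e^{-jw}) and apply the inverse DFT."""
    w = 2.0 * np.pi * np.arange(num_points) / num_points
    H = 1.0 / (1.0 - a * np.exp(-1j * w))
    return np.fft.ifft(H).real

N = 8
h_fs = sampled_design(N)               # result of the frequency sampling design
h_true = a ** np.arange(N)             # first N samples of the true response
h_wrapped = h_true / (1.0 - a ** N)    # sum of all tails folded back by multiples of N

# A finer raster (longer DFT) reduces the folded-back contribution.
err_short = np.max(np.abs(sampled_design(8) - a ** np.arange(8)))
err_long = np.max(np.abs(sampled_design(64)[:8] - a ** np.arange(8)))
print(err_short, err_long)
```

The designed coefficients `h_fs` coincide with the wrapped response `h_wrapped` rather than with `h_true`, which is the aliasing effect; the error shrinks when the raster is refined.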
An alternative approach consists in determining the FIR coefficients directly within the time domain in a one-stage process [MSK11]. In this context, the emission behavior of the array for a defined raster of frequencies is represented directly as a function of the FIR coefficients of all transducers (e.g., loudspeakers/microphones) and is formulated as a single optimization problem, the solution of which simultaneously determines the optimum filter coefficients for all beam-forming filters. What is problematic here is the size of the optimization problem, both with regard to the number of variables to be optimized (filter length multiplied by the number of beam-forming filters) and with regard to the dimension of the defining equations and, possibly, secondary conditions. The latter dimension is typically proportional both to the number of frequency raster points and to the spatial resolution at which the desired beamformer response is established. As a result of this rapidly increasing complexity, this method is limited to arrays having a small number of elements and to very small filter orders. For example, in [MSK11], microphone arrays comprising six elements and having a filter length of 8 are used.
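How the size of such a one-stage problem arises may be sketched as follows. The sketch is not the method of the cited work: it assumes a hypothetical uniform linear array, a plain unconstrained least-squares criterion, and arbitrarily chosen frequency and angle rasters. Only the element count of six and the filter length of 8 are taken from the text.

```python
import numpy as np

def design_matrix(num_elems, num_taps, omegas, angles, spacing_samples):
    """Array response as a linear function of all FIR coefficients.

    Rows correspond to (frequency, direction) pairs,
    columns to (element, tap) pairs.
    """
    n = np.arange(num_elems)
    k = np.arange(num_taps)
    rows = []
    for w in omegas:
        for theta in angles:
            # Propagation delay per element in samples (assumed linear array).
            tau = n * spacing_samples * np.cos(theta)
            rows.append(np.exp(-1j * w * (k[None, :] + tau[:, None])).ravel())
    return np.array(rows)

num_elems, num_taps = 6, 8
omegas = np.linspace(0.2 * np.pi, 0.8 * np.pi, 16)  # hypothetical frequency raster
angles = np.linspace(0.0, np.pi, 19)                # hypothetical angle raster
A = design_matrix(num_elems, num_taps, omegas, angles, spacing_samples=1.0)

# Hypothetical target: delayed unit response at broadside, zero elsewhere.
look = np.pi / 2
desired = np.array([np.exp(-1j * w * (num_taps - 1) / 2) if np.isclose(t, look) else 0.0
                    for w in omegas for t in angles])

# Real-valued coefficients: stack real and imaginary parts of the equations.
A_real = np.vstack([A.real, A.imag])
d_real = np.concatenate([desired.real, desired.imag])
x, *_ = np.linalg.lstsq(A_real, d_real, rcond=None)
h = x.reshape(num_elems, num_taps)  # one FIR filter per element

print("equations:", A.shape[0], "variables:", A.shape[1])
```

Even for this small configuration, the problem comprises 48 variables and 304 complex equations; both numbers grow with the filter length, the number of transducers and the resolution of the rasters.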