Spherical microphone arrays offer the ability to capture a three-dimensional sound field. One way to store and process the sound field is the Ambisonics representation. Ambisonics uses orthonormal spherical functions for describing the sound field in the area around the point of origin, also known as the sweet spot. The accuracy of that description is determined by the Ambisonics order N, where a finite number of Ambisonics coefficients describes the sound field. The maximal Ambisonics order of a spherical array is limited by the number of microphone capsules, which must be equal to or greater than the number $O=(N+1)^2$ of Ambisonics coefficients.
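As a small illustration of the capsule-count constraint, the following sketch computes the number of Ambisonics coefficients for a given order and the largest order that a given number of capsules can support. The function names are hypothetical, and the bound considers the capsule count only; it says nothing about capsule placement.

```python
import math


def num_coefficients(order: int) -> int:
    """Number of Ambisonics coefficients O = (N + 1)^2 for order N."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


def max_order(num_capsules: int) -> int:
    """Largest order N with (N + 1)^2 <= num_capsules."""
    if num_capsules < 1:
        raise ValueError("at least one capsule is required")
    return math.isqrt(num_capsules) - 1


# Example capsule counts (chosen for illustration only).
for capsules in (4, 32):
    n = max_order(capsules)
    print(f"{capsules} capsules -> order {n}, {num_coefficients(n)} coefficients")
```

With four capsules the bound gives order one, which matches the tetrahedral B-format case.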
One advantage of the Ambisonics representation is that the reproduction of the sound field can be adapted individually to any given loudspeaker arrangement. Furthermore, this representation enables the simulation of different microphone characteristics using beamforming techniques in post-production.
The B-format is one known example of Ambisonics. A B-format microphone requires four capsules on a tetrahedron to capture the sound field with an Ambisonics order of one.
Ambisonics of an order greater than one is called Higher Order Ambisonics (HOA), and HOA microphones are typically spherical microphone arrays on a rigid sphere, for example the Eigenmike of mh acoustics. For the Ambisonics processing, the pressure distribution on the surface of the sphere is sampled by the capsules of the array. The sampled pressure is then converted to the Ambisonics representation. Such an Ambisonics representation describes the sound field, but it includes the impact of the microphone array. The impact of the microphones on the captured sound field is removed using the inverse of the microphone array response. The array response transforms the sound field of a plane wave to the pressure measured at the microphone capsules; it simulates the directivity of the capsules and the interference of the microphone array with the sound field.
The equalization of the transfer function of the microphone array is a big problem for HOA recordings. If the Ambisonics representation of the array response is known, the impact can be removed by the multiplication of the Ambisonics representation with the inverse array response. However, using the reciprocal of the transfer function can cause high gains for small values and zeros in the transfer function. Therefore, the microphone array should be designed in view of a robust inverse transfer function. For example, a B-format microphone uses cardioid capsules to overcome the zeros in the transfer function of omni-directional capsules.
The present principles are related to spherical microphone arrays on a rigid sphere. The shading effect of the rigid sphere enables a good directivity for frequencies with a small wavelength with respect to the diameter of the array. On the other hand, the filter responses of these microphone arrays have very small values for low frequencies and high Ambisonics orders (i.e. greater than one). The Ambisonics representation of the captured pressure therefore has small higher order coefficients, which represent the small pressure differences at the capsules for wavelengths that are long compared to the size of the array. The pressure differences, and therefore also the higher order coefficients, are affected by the transducer noise. Thus, for low frequencies the inverse filter response amplifies mainly the noise instead of the higher order Ambisonics coefficients. A known technique for overcoming this problem is to fade out (or high pass filter) the high orders for low frequencies (i.e. to limit the filter gain there), which on one hand decreases the spatial resolution for low frequencies but on the other hand removes (highly distorted) HOA coefficients that would otherwise corrupt the complete Ambisonics representation. A corresponding compensation filter design that tries to solve this problem using Tikhonov regularization filters is described in Sébastien Moreau, Jérôme Daniel, Stéphanie Bertet, “3D Sound field Recording with Higher Order Ambisonics—Objective Measurements and Validation of a 4th Order Spherical Microphone”, Audio Engineering Society convention paper, 120th Convention 20-23 May 2006, Paris, France, in section 4. A Tikhonov regularization filter minimizes the squared error resulting from the limitation of the Ambisonics order. However, the Tikhonov filter requires a regularization parameter that has to be adapted manually to the characteristics of the recorded signal by ‘trial and error’, and there is no analytic expression defining this parameter.
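The low-frequency behaviour described above can be made concrete with a short numerical sketch. It assumes the commonly used rigid-sphere response b_n(kR) = 4π i^(n+1) / ((kR)^2 h_n'(kR)), where h_n is the spherical Hankel function of the first kind, and a commonly used Tikhonov form conj(b_n) / (|b_n|^2 + λ^2). Neither formula is spelled out in this section, so both are assumptions here, as are the function names and the hand-picked values of kR and λ.

```python
import cmath
import math


def spherical_hankel1(n: int, x: float) -> complex:
    """Spherical Hankel function of the first kind, via upward recurrence (x > 0)."""
    h_prev = -1j * cmath.exp(1j * x) / x               # h_0(x)
    if n == 0:
        return h_prev
    h_curr = -cmath.exp(1j * x) * (x + 1j) / x ** 2    # h_1(x)
    for m in range(1, n):
        # h_{m+1}(x) = (2m + 1) / x * h_m(x) - h_{m-1}(x)
        h_prev, h_curr = h_curr, (2 * m + 1) / x * h_curr - h_prev
    return h_curr


def spherical_hankel1_derivative(n: int, x: float) -> complex:
    """Derivative h_n'(x) = h_{n-1}(x) - (n + 1) / x * h_n(x); h_0' = -h_1."""
    if n == 0:
        return -spherical_hankel1(1, x)
    return spherical_hankel1(n - 1, x) - (n + 1) / x * spherical_hankel1(n, x)


def mode_strength(n: int, kr: float) -> complex:
    """Assumed rigid-sphere array response b_n(kR) for order n."""
    return 4 * math.pi * 1j ** (n + 1) / (kr ** 2 * spherical_hankel1_derivative(n, kr))


def tikhonov_inverse(b: complex, lam: float) -> complex:
    """Assumed Tikhonov-regularized inverse; lam is the manually chosen parameter."""
    return b.conjugate() / (abs(b) ** 2 + lam ** 2)


def gain_db(value: complex) -> float:
    return 20 * math.log10(abs(value))


kr = 0.1     # low frequency relative to the array size (illustrative value)
lam = 0.01   # hypothetical regularization parameter
for order in range(5):
    b = mode_strength(order, kr)
    print(f"n={order}: plain inverse {gain_db(1 / b):7.1f} dB, "
          f"Tikhonov {gain_db(tikhonov_inverse(b, lam)):7.1f} dB")
```

The plain inverse gain grows rapidly with the order at small kR, while the regularized gain stays bounded by 1/(2λ). The result depends entirely on the hand-picked λ, which is the manual tuning step the text describes.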
Based on the analysis of spherical microphone arrays of Boaz Rafaely, “Analysis and Design of Spherical Microphone Arrays”, IEEE Transactions on Speech and Audio Processing, vol.13, no.1, pages 135-143, 2005, the present principles show how to obtain automatically the regularization parameter from the signal statistics of the microphone signals.
A problem to be solved by the present principles is to minimize noise, in particular low frequency noise, in an Ambisonics representation of the signals of a spherical microphone array arranged on a rigid sphere. This problem is solved by the method disclosed in claim 1. An apparatus that utilizes this method is disclosed in claim 2.
The processing computes the Tikhonov regularization parameter as a function of the signal-to-noise ratio between the average sound field power and the noise power of the microphone capsules, i.e. that optimization parameter is computed from the signal-to-noise ratio of the recorded microphone array signals. The computation of the optimization or regularization parameter includes the following steps:
- Converting the microphone capsule signals $P(\Omega_c,t)$ representing the pressure on the surface of said microphone array to a spherical harmonics (or the equivalent Ambisonics) representation $A_n^m(t)$;
- Computing per wave number k an estimation of the time-variant signal-to-noise ratio $SNR(k)$ of the microphone capsule signals $P(\Omega_c,t)$ using the average source power $|P_0(k)|^2$ of the plane wave recorded from the microphone array and the corresponding noise power $|P_{noise}(k)|^2$ representing the spatially uncorrelated noise produced by analog processing in the microphone array, i.e. including computing the average spatial power by computing separately a reference signal and a noise signal, wherein the reference signal is the representation of the sound field that can be created with the used microphone array, and the noise signal is the spatially uncorrelated noise produced by the analog processing of the microphone array;
- By using a time-variant Wiener filter for each order n designed at discrete finite wave numbers k from the signal-to-noise ratio estimation $SNR(k)$, multiplying the transfer function of the Wiener filter by the inverse transfer function $\frac{1}{b_n(kR)}$ of the microphone array in order to get an adapted transfer function $F_{n,array}(k)$;
- Applying that adapted transfer function $F_{n,array}(k)$ to the spherical harmonics representation $A_n^m(t)$ using a linear filter processing, resulting in adapted directional coefficients $d_n^m(t)$.
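The third step above, a Wiener filter multiplied by the inverse array response, can be sketched as follows. This section does not state the Wiener filter's transfer function, so the sketch assumes the textbook Wiener gain S/(S+N), with the per-order signal power modeled as SNR(k)·|b_n(kR)|^2 and the noise power as a constant `noise_scale`. The function names, the signal model and `noise_scale` are hypothetical and are not taken from the source.

```python
def wiener_gain(b_n: complex, snr: float, noise_scale: float = 1.0) -> float:
    """Textbook Wiener gain S / (S + N) under the assumed per-order signal model."""
    if snr < 0 or noise_scale <= 0:
        raise ValueError("snr must be >= 0 and noise_scale must be > 0")
    signal = snr * abs(b_n) ** 2
    return signal / (signal + noise_scale)


def adapted_transfer(b_n: complex, snr: float, noise_scale: float = 1.0) -> complex:
    """Wiener gain times the inverse array response 1 / b_n.

    Written as snr * conj(b_n) / (snr * |b_n|^2 + noise_scale), which is
    algebraically equal to wiener_gain(...) / b_n but stays finite at b_n = 0.
    """
    if snr < 0 or noise_scale <= 0:
        raise ValueError("snr must be >= 0 and noise_scale must be > 0")
    return snr * b_n.conjugate() / (snr * abs(b_n) ** 2 + noise_scale)


# Illustrative values: a strong and a very weak array response at the same SNR.
for b in (2.0 + 0j, 1e-4 + 0j):
    print(abs(1 / b), abs(adapted_transfer(b, snr=100.0)))
```

Under this assumed model, the adapted filter approaches the plain inverse where the array response is strong relative to the noise. It rolls off towards zero where the response is weak, so the regularization follows from the SNR instead of a hand-tuned parameter.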
The filter design requires an estimation of the average power of the sound field in order to obtain the SNR of the recording. The estimation is derived from the simulation of the average signal power at the capsules of the array in the spherical harmonics representation. This estimation includes the computation of the spatial coherence of the capsule signal in the spherical harmonics representation. It is known to compute the spatial coherence from the continuous representation of a plane wave, but according to the present principles, the spatial coherence is computed for a spherical array on a rigid sphere, because the sound field of a plane wave on the rigid sphere cannot be computed in the continuous representation. I.e., according to the present principles the SNR is estimated from the capsule signals.
The present principles include the following advantages:
- The order of the Ambisonics representation is optimally adapted to the SNR of the recording for each frequency sub-band. This reduces the audible noise at the reproduction of the Ambisonics representation.
- The estimation of the SNR is required for the filter design. It can be implemented with a low computational complexity by using look-up tables. This facilitates a time-variant adaptive filter design with manageable computational effort.
- By the noise reduction, the directional information is partly restored for low frequencies.
In principle, the method is suited for processing microphone capsule signals of a spherical microphone array on a rigid sphere, said method including the steps:
- converting said microphone capsule signals $P(\Omega_c,t)$ representing the pressure on the surface of said microphone array to a spherical harmonics or Ambisonics representation $A_n^m(t)$;
- computing per wave number k an estimation of the time-variant signal-to-noise ratio $SNR(k)$ of said microphone capsule signals $P(\Omega_c,t)$, using the average source power $|P_0(k)|^2$ of the plane wave recorded from said microphone array and the corresponding noise power $|P_{noise}(k)|^2$ representing the spatially uncorrelated noise produced by analog processing in said microphone array;
- by using a time-variant Wiener filter for each order n designed at discrete finite wave numbers k from said signal-to-noise ratio estimation $SNR(k)$, multiplying the transfer function of said Wiener filter by the inverse transfer function of said microphone array in order to get an adapted transfer function $F_{n,array}(k)$;
- applying said adapted transfer function $F_{n,array}(k)$ to said spherical harmonics representation $A_n^m(t)$ using a linear filter processing, resulting in adapted directional coefficients $d_n^m(t)$.
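The final step, applying one adapted transfer function per order n to all coefficients of that order, might look as follows for a single wave number k. The sketch assumes frequency-domain processing (one complex multiplication per coefficient and bin) and a flat coefficient list in ACN ordering (index n^2 + n + m). The source only says "linear filter processing", so both choices, and the function name, are assumptions.

```python
from typing import List, Sequence


def apply_order_filters(a_nm: Sequence[complex], f_n: Sequence[complex]) -> List[complex]:
    """Scale every coefficient of order n by f_n[n] for one frequency bin.

    a_nm: (N + 1)^2 spherical harmonics coefficients, ACN order (assumed).
    f_n:  N + 1 adapted transfer function values, one per order.
    """
    order = len(f_n) - 1
    if order < 0 or len(a_nm) != (order + 1) ** 2:
        raise ValueError("expected (N + 1)^2 coefficients for N + 1 filter values")
    d_nm = []
    for n in range(order + 1):
        for m in range(-n, n + 1):
            # All 2n + 1 coefficients of order n share the same filter value.
            d_nm.append(f_n[n] * a_nm[n * n + n + m])
    return d_nm


# First-order example with made-up coefficient and filter values.
print(apply_order_filters([1, 2, 3, 4], [2, 0.5]))
```

A time-domain implementation would instead convolve each coefficient signal with the impulse response of the corresponding order's filter. The per-bin product above is the frequency-domain counterpart of that convolution.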
In principle the apparatus is suited for processing microphone capsule signals of a spherical microphone array on a rigid sphere, said apparatus including:
- means for converting said microphone capsule signals $P(\Omega_c,t)$ representing the pressure on the surface of said microphone array to a spherical harmonics or Ambisonics representation $A_n^m(t)$;
- means for computing per wave number k an estimation of the time-variant signal-to-noise ratio $SNR(k)$ of said microphone capsule signals $P(\Omega_c,t)$, using the average source power $|P_0(k)|^2$ of the plane wave recorded from said microphone array and the corresponding noise power $|P_{noise}(k)|^2$ representing the spatially uncorrelated noise produced by analog processing in said microphone array;
- means for multiplying, by using a time-variant Wiener filter for each order n designed at discrete finite wave numbers k from said signal-to-noise ratio estimation $SNR(k)$, the transfer function of said Wiener filter by the inverse transfer function of said microphone array in order to get an adapted transfer function $F_{n,array}(k)$;
- means for applying said adapted transfer function $F_{n,array}(k)$ to said spherical harmonics representation $A_n^m(t)$ using a linear filter processing, resulting in adapted directional coefficients $d_n^m(t)$.
Advantageous additional embodiments of the present principles are disclosed in the respective dependent claims.