Spherical microphone arrays offer the ability to capture a three-dimensional sound field. One way to store and process the sound field is the Ambisonics representation. Ambisonics uses orthonormal spherical functions for describing the sound field in the area around the point of origin, also known as the sweet spot. The accuracy of that description is determined by the Ambisonics order N, where a finite number of Ambisonics coefficients describes the sound field. The maximal Ambisonics order of a spherical array is limited by the number of microphone capsules, which must be equal to or greater than the number O = (N+1)^2 of Ambisonics coefficients.
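As a non-limiting illustration of the relation O = (N+1)^2, the following minimal Python sketch computes the number of Ambisonics coefficients for a given order and the largest order that a given number of capsules can support. The function names and the capsule counts 4 and 32 are assumptions chosen for the example only.

```python
import math


def num_coefficients(order: int) -> int:
    """Number of Ambisonics coefficients O = (N + 1)^2 for order N."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


def max_order(num_capsules: int) -> int:
    """Largest order N that satisfies (N + 1)^2 <= num_capsules."""
    if num_capsules < 1:
        raise ValueError("at least one capsule is required")
    return math.isqrt(num_capsules) - 1


# Hypothetical capsule counts, used only to exercise the two functions.
for capsules in (4, 32):
    order = max_order(capsules)
    print(capsules, order, num_coefficients(order))
```

The capsule count is only an upper bound on the order here; the sketch does not account for capsule placement or noise.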
One advantage of the Ambisonics representation is that the reproduction of the sound field can be adapted individually to any given loudspeaker arrangement. Furthermore, this representation enables the simulation of different microphone characteristics using beam forming techniques during post-production.
The B-format is one known example of Ambisonics. A B-format microphone requires four capsules on a tetrahedron to capture the sound field with an Ambisonics order of one.
Ambisonics of an order greater than one is called Higher Order Ambisonics (HOA), and HOA microphones are typically spherical microphone arrays on a rigid sphere, for example the Eigenmike of mh acoustics. For the Ambisonics processing, the pressure distribution on the surface of the sphere is sampled by the capsules of the array. The sampled pressure is then converted to the Ambisonics representation. This representation describes the sound field, but still includes the impact of the microphone array. The microphone array response transforms the sound field of a plane wave to the pressure measured at the microphone capsules; it simulates the directivity of the capsules and the interference of the microphone array with the sound field. The impact of the microphones on the captured sound field is removed by applying the inverse of this response.
The distorted spectral power of a reconstructed Ambisonics signal captured by a spherical microphone array should be equalized. On one hand, that distortion is caused by the spatial aliasing signal power. On the other hand, due to the noise reduction for spherical microphone arrays on a rigid sphere, higher order coefficients are missing in the spherical harmonics representation, and these missing coefficients unbalance the power spectrum of the reconstructed signal, especially for beam forming applications.
A problem to be solved by the present principles is to reduce the distortion of the spectral power of a reconstructed Ambisonics signal captured by a spherical microphone array, and to equalize the spectral power. This problem is solved by the method disclosed in claim 1. An apparatus that utilizes this method is disclosed in claim 2.
The inventive processing serves for determining a filter that balances the frequency spectrum of the reconstructed Ambisonics signal. The signal power of the filtered and reconstructed Ambisonics signal is analysed, whereby the impact of the average spatial aliasing power and the missing higher order Ambisonics coefficients is described for Ambisonics decoding and beam forming applications. From these results an easy-to-use equalization filter is derived that balances the average frequency spectrum of the reconstructed Ambisonics signal: dependent on the used decoding coefficients and the signal-to-noise ratio SNR of the recording, the average power at the point of origin is estimated.
The equalization filter is obtained from:
Estimation of the signal-to-noise ratio between the average sound field power and the noise power from the microphone array capsules.
Computation per wave number k of the average spatial signal power at the point of origin for a diffuse sound field. That simulation comprises all signal power components (reference, aliasing and noise).
The frequency response of the equalization filter is formed from the square root of the fraction of a given reference power and the computed average spatial signal power at the point of origin.
Multiplication (per wave number k) of the frequency response of the equalization filter by the transfer function (for each order n at discrete finite wave numbers k) of a noise minimizing filter derived from the signal-to-noise ratio estimation and by the inverse transfer function of the microphone array, in order to get an adapted transfer function F_{n,array}(k).
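The last two steps above can be sketched as follows, purely by way of example. The sketch assumes that the reference, aliasing and noise power components simply add to the average spatial signal power, and it takes the noise minimizing filter and the array transfer function as given inputs, since their formulas are not stated in this section. All function names, parameter names and numeric values are hypothetical.

```python
import math


def equalization_response(p_ref, p_alias, p_noise, reference_power=1.0):
    """Return F_EQ(k) = sqrt(reference_power / P_total(k)) for each bin k."""
    response = []
    for ref, alias, noise in zip(p_ref, p_alias, p_noise):
        # Assumption: the three power components add to the total power.
        total = ref + alias + noise
        response.append(math.sqrt(reference_power / total))
    return response


def adapted_transfer_function(f_eq, f_noise_min, b_array):
    """Return F_{n,array}(k) = F_EQ(k) * F_n(k) / b_n(k), one row per order n.

    f_eq:        list over wave number bins k
    f_noise_min: list over orders n of lists over k (noise minimizing filter)
    b_array:     list over orders n of lists over k (array transfer function)
    """
    return [
        [eq * fn / bn for eq, fn, bn in zip(f_eq, f_row, b_row)]
        for f_row, b_row in zip(f_noise_min, b_array)
    ]


# Made-up numbers for three wave number bins and orders n = 0, 1.
f_eq = equalization_response([0.5, 1.0, 1.0], [0.0, 0.0, 2.5], [0.5, 0.0, 0.5])
f_noise_min = [[1.0, 1.0, 1.0], [0.5, 1.0, 1.0]]
b_array = [[1.0, 2.0, 2.0], [0.5, 1.0, 4.0]]
f_n_array = adapted_transfer_function(f_eq, f_noise_min, b_array)
print(f_eq)
print(f_n_array)
```

Dividing by b_n(k) stands in for the multiplication by the inverse transfer function of the microphone array. The same code accepts complex values for b_n(k), because only multiplication and division are applied to them.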
The resulting filter is applied to the spherical harmonics representation of the recorded sound field, or to the reconstructed signals. The design of such a filter is computationally highly complex. Advantageously, the computationally complex processing can be reduced by computing constant filter design parameters in advance. These parameters are constant for a given microphone array and can be stored in a look-up table. This facilitates a time-variant adaptive filter design with a manageable computational complexity. Advantageously, the filter removes the raised average signal power at high frequencies. Furthermore, the filter balances the frequency response of a beam forming decoder in the spherical harmonics representation at low frequencies. Without the inventive filter, the reconstructed sound from a spherical microphone array recording sounds unbalanced because the power of the recorded sound field is not reconstructed correctly in all frequency sub-bands.
In principle, the inventive method is suited for processing microphone capsule signals of a spherical microphone array on a rigid sphere, said method including the steps:
converting said microphone capsule signals representing the pressure on the surface of said microphone array to a spherical harmonics or Ambisonics representation A_n^m(t);
computing per wave number k an estimation of the time-variant signal-to-noise ratio SNR(k) of said microphone capsule signals, using the average source power |P_0(k)|^2 of the plane wave recorded from said microphone array and the corresponding noise power |P_noise(k)|^2 representing the spatially uncorrelated noise produced by analog processing in said microphone array;
computing per wave number k the average spatial signal power at the point of origin for a diffuse sound field, using reference, aliasing and noise signal power components,
and forming the frequency response of an equalization filter from the square root of the fraction of a given reference power and said average spatial signal power at the point of origin,
and multiplying per wave number k said frequency response of said equalization filter by the transfer function, for each order n at discrete finite wave numbers k, of a noise minimizing filter derived from said signal-to-noise ratio estimation SNR(k), and by the inverse transfer function of said microphone array, in order to get an adapted transfer function F_{n,array}(k);
applying said adapted transfer function F_{n,array}(k) to said spherical harmonics representation A_n^m(t) using a linear filter processing, resulting in adapted directional coefficients d_n^m(t).
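By way of example only, the signal-to-noise ratio estimation and the final applying step may be sketched as below. The sketch assumes that SNR(k) is the plain ratio of the two stated powers, and that the coefficients have already been transformed to the frequency domain (for instance block-wise), so that the linear filtering reduces to a multiplication per wave number bin. The data layout, the function names and all values are hypothetical.

```python
def estimate_snr(source_power, noise_power):
    """Assumed form SNR(k) = |P_0(k)|^2 / |P_noise(k)|^2 for each bin k."""
    return [ps / pn for ps, pn in zip(source_power, noise_power)]


def apply_adapted_filter(coeffs, f_array):
    """Multiply each frequency-domain coefficient by F_{n,array}(k).

    coeffs:  dict mapping (n, m) to a list of complex values, one per bin k
    f_array: list indexed by order n of lists of filter values per bin k
    The filter depends on the order n only, so every index m of one order
    shares the same filter row.
    """
    return {
        (n, m): [f * a for f, a in zip(f_array[n], spectrum)]
        for (n, m), spectrum in coeffs.items()
    }


# Made-up first-order example with two wave number bins.
snr = estimate_snr([4.0, 1.0], [0.5, 0.5])
coeffs = {
    (0, 0): [1 + 0j, 2 + 0j],
    (1, -1): [1j, 1 + 0j],
    (1, 0): [0.5 + 0j, 0j],
    (1, 1): [-1 + 0j, 1j],
}
f_array = [[1.0, 0.5], [2.0, 0.0]]
d = apply_adapted_filter(coeffs, f_array)
print(snr)
print(d[(1, -1)])
```

A real-time implementation would also have to handle block boundaries, for example with overlap-add processing, which is outside the scope of this sketch.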
In principle, the inventive apparatus is suited for processing microphone capsule signals of a spherical microphone array on a rigid sphere, said apparatus including:
means for converting said microphone capsule signals representing the pressure on the surface of said microphone array to a spherical harmonics or Ambisonics representation A_n^m(t);
means for computing per wave number k an estimation of the time-variant signal-to-noise ratio SNR(k) of said microphone capsule signals, using the average source power |P_0(k)|^2 of the plane wave recorded from said microphone array and the corresponding noise power |P_noise(k)|^2 representing the spatially uncorrelated noise produced by analog processing in said microphone array;
means for computing per wave number k the average spatial signal power at the point of origin for a diffuse sound field, using reference, aliasing and noise signal power components,
and for forming the frequency response of an equalization filter from the square root of the fraction of a given reference power and said average spatial signal power at the point of origin,
and for multiplying per wave number k said frequency response of said equalization filter by the transfer function, for each order n at discrete finite wave numbers k, of a noise minimizing filter derived from said signal-to-noise ratio estimation SNR(k), and by the inverse transfer function of said microphone array, in order to get an adapted transfer function F_{n,array}(k);
means for applying said adapted transfer function F_{n,array}(k) to said spherical harmonics representation A_n^m(t) using a linear filter processing, resulting in adapted directional coefficients d_n^m(t).
Advantageous additional embodiments of the present principles are disclosed in the respective dependent claims.