In multichannel listening, the listener is surrounded by multiple loudspeakers. A variety of known methods exist to capture audio for such setups. Let us first consider loudspeaker systems and the spatial impression that can be created with them. Without special techniques, common two-channel stereophonic setups can only create auditory events on the line connecting the loudspeakers. Sound emanating from other directions cannot be reproduced. Logically, by using more loudspeakers around the listener, more directions can be covered and a more natural spatial impression can be created. The best-known multichannel loudspeaker system and layout is the 5.1 standard (“ITU-R BS.775-1”), which consists of five loudspeakers at azimuthal angles of 0°, ±30° and ±110° with respect to the listening position. Other systems with a varying number of loudspeakers located at different directions are also known.
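The two-channel limitation noted above can be illustrated with a minimal sketch. It assumes the tangent law, a commonly used amplitude-panning model that is not taken from this document, and a stereo pair at ±30°. The function name `tangent_law_angle`, the parameter `base_deg`, and the sign convention (positive azimuth to the left) are hypothetical choices made only for this illustration.

```python
import math


def tangent_law_angle(g_left, g_right, base_deg=30.0):
    """Azimuth (degrees) predicted by the tangent law for a stereo pair.

    Assumption: loudspeakers sit at +base_deg (left) and -base_deg (right),
    and both gains are non-negative (plain amplitude panning).
    """
    if g_left < 0 or g_right < 0 or (g_left + g_right) == 0:
        raise ValueError("gains must be non-negative and not both zero")
    # Tangent law: tan(phi) / tan(phi0) = (gL - gR) / (gL + gR)
    ratio = (g_left - g_right) / (g_left + g_right)
    return math.degrees(math.atan(ratio * math.tan(math.radians(base_deg))))


if __name__ == "__main__":
    # Sweep the pan position from fully right to fully left.
    for step in range(5):
        g_l = step / 4.0
        g_r = 1.0 - g_l
        print(f"gL={g_l:.2f} gR={g_r:.2f} -> {tangent_law_angle(g_l, g_r):6.1f} deg")
```

With non-negative gains the ratio stays within [-1, 1], so under this model the predicted direction never leaves the arc between the two loudspeakers, which is consistent with the limitation described above.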
In the art, several different recording methods have been designed for the previously mentioned loudspeaker systems, in order to reproduce the spatial impression in the listening situation as it would be perceived in the recording environment. The ideal way to record spatial sound for a chosen multichannel loudspeaker system would be to use the same number of microphones as there are loudspeakers. In such a case, the directivity patterns of the microphones should also correspond to the loudspeaker layout, such that sound from any single direction would only be recorded with one, two, or three microphones. The more loudspeakers are used, the narrower the directivity patterns need to be. However, microphones with such narrow directivity patterns are relatively expensive and typically have a non-flat frequency response, which is undesirable. Furthermore, using several microphones with too broad directivity patterns as input to multichannel reproduction results in a colored and blurred auditory perception, because sound emanating from a single direction is usually reproduced with more loudspeakers than is useful. Hence, current microphones are best suited for two-channel recording and reproduction without the goal of a surrounding spatial impression.
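The relationship between directivity width and the number of microphones that pick up a single source can be sketched as follows. This is an illustration under stated assumptions, not a model of any particular microphone: one microphone is aimed at each 5.1 loudspeaker direction, the pattern is a hypothetical cardioid family (0.5 + 0.5 cos θ) raised to the power `order` (higher orders are narrower), and a microphone counts as "active" if its level is within an arbitrarily chosen -12 dB of the loudest one. The names `cardioid_gain` and `active_microphones` are invented for this example.

```python
import math

# Azimuths (degrees) of the five main loudspeakers in the 5.1 layout.
LAYOUT_DEG = [0.0, 30.0, -30.0, 110.0, -110.0]


def cardioid_gain(offset_deg, order=1):
    """Gain of a hypothetical cardioid-family pattern at an angular offset.

    order=1 is a plain cardioid; larger orders give a narrower main lobe.
    """
    base = 0.5 + 0.5 * math.cos(math.radians(offset_deg))
    return base ** order


def active_microphones(source_deg, order=1, threshold_db=-12.0):
    """Return the aim directions of microphones that pick up the source
    within threshold_db of the loudest microphone (threshold is an assumption)."""
    gains = [cardioid_gain(source_deg - az, order) for az in LAYOUT_DEG]
    peak = max(gains)
    active = []
    for az, gain in zip(LAYOUT_DEG, gains):
        # Guard against log10(0) for a source exactly in the pattern null.
        level_db = 20.0 * math.log10(max(gain, 1e-12) / peak)
        if level_db >= threshold_db:
            active.append(az)
    return active


if __name__ == "__main__":
    for order in (1, 8):
        mics = active_microphones(0.0, order=order)
        print(f"order={order}: {len(mics)} active microphones -> {sorted(mics)}")
```

For a frontal source, the plain cardioid (order 1) leaves all five microphones active under this threshold, whereas the much narrower order-8 pattern leaves only the three front microphones active. This matches the point made above: broad patterns spread a single direction over more channels than is useful, while sufficiently narrow ones confine it to a few.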
Another known approach to spatial sound recording is to record with a large number of microphones distributed over a wide spatial area. For example, when recording an orchestra on a stage, the individual instruments can be picked up by so-called spot microphones, which are positioned close to the sound sources. The spatial distribution of the frontal sound stage can, for example, be captured by conventional stereo microphones. The sound field components corresponding to the late reverberation can be captured by several microphones placed relatively far from the stage. A sound engineer can then mix the desired multichannel output by using a combination of all available microphone channels. However, this recording technique requires a very large recording setup and manual mixing of the recorded channels, which is not always feasible in practice.
Conventional systems for the recording and reproduction of spatial audio based on directional audio coding (DirAC), as described in T. Lokki, J. Merimaa, V. Pulkki: Method for Reproducing Natural or Modified Spatial Impression in Multichannel Listening, U.S. Pat. No. 7,787,638 B2, Aug. 31, 2010 and V. Pulkki: Spatial Sound Reproduction with Directional Audio Coding. J. Audio Eng. Soc., Vol. 55, No. 6, pp. 503-516, 2007, rely on a simple global model for the sound field. Therefore, they suffer from some systematic drawbacks, which limit the achievable sound quality and experience in practice.
A general problem of known solutions is that they are relatively complex and typically associated with a degradation of the spatial sound quality.