The aim of multi-zone sound reproduction is to provide personalized spatial sound to multiple listeners at the same time. The literature describes different approaches to multi-zone sound reproduction, which can be divided into two main classes. One class is based on the fact that arbitrary sound fields can be expressed by means of spatial basis functions, e.g., plane waves or cylindrical/spherical harmonics. Other specialized basis functions are also possible; these, however, must in turn be approximated by fundamental solutions of the acoustic wave equation to allow for their physical reproduction via loudspeakers. A prominent example of sound reproduction on the basis of cylindrical/spherical harmonics is (higher-order) ambisonics. In other applications, the terms modal processing or wave-domain processing are used, which essentially exploit the same idea of describing sound fields by means of basis functions. A fundamental drawback of these techniques is that they typically require regular transducer geometries, such as uniformly spaced circular arrays. Furthermore, infinitely long line sources are often used for the analytic description of real 3D wave fields, which requires an additional correction when a physical setup is implemented with real loudspeakers arranged on a 2D plane only.
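As an illustration of such a basis expansion, the Jacobi-Anger identity expresses a unit-amplitude plane wave in terms of cylindrical harmonics. The symbols used here (wavenumber k, polar coordinates (r, \varphi), propagation direction \varphi_0, Bessel function J_n of the first kind, truncation order N) are notation assumed for this sketch and are not taken from the text above; the sign of the exponent depends on the chosen time convention.

```latex
e^{\,\mathrm{j} k r \cos(\varphi - \varphi_0)}
  = \sum_{n=-\infty}^{\infty} \mathrm{j}^{\,n} J_n(kr)\, e^{\,\mathrm{j} n (\varphi - \varphi_0)}
  \approx \sum_{n=-N}^{N} \mathrm{j}^{\,n} J_n(kr)\, e^{\,\mathrm{j} n (\varphi - \varphi_0)}
```

In practice the series has to be truncated at a finite order N, which leaves 2N + 1 coefficients. Since J_n(kr) decays rapidly once |n| exceeds kr, the order needed for a given accuracy grows with both the frequency and the radius of the region to be covered.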
A second class consists of multi-point approaches, in which the sound field is optimized at a multitude of so-called control points within a listening area, typically in the least-squares sense. In most cases, the sound field is then expressed in terms of impulse responses or transfer functions between the loudspeakers and the control points of interest. This provides increased flexibility with respect to the transducer setup, and the use of measured room impulse responses (RIRs) allows the acoustic characteristics of both the real loudspeakers and the reproduction environment to be incorporated in a straightforward manner. Some of these concepts aim merely to maximize the sound energy in a zone or the energy difference between two zones (acoustic contrast). A drawback of such energy-based approaches is that the orientation of the sound intensity cannot be controlled. This problem can be avoided by pressure matching, where the acoustic pressure itself is optimized rather than its squared magnitude (energy). A combination of pressure matching and energy optimization has also been suggested, in which a constraint is imposed on the sound energy in order to obtain a desired acoustic contrast between the individual listening areas. All of these approaches have in common that the control points are distributed over the entire interior of the local listening areas. This seems impractical for real setups, where the free-field assumption does not hold and physical microphones have to serve as control points. Analytical approaches for synthesizing quiet zones have also been presented, but the problem of multi-zone sound generation has not yet been solved satisfactorily.
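The multi-point idea can be made concrete with a small pressure-matching sketch at a single frequency. Everything specific in it is an assumption made for illustration and is not taken from any of the approaches referred to above: the free-field point-source model standing in for measured transfer functions, the hypothetical 8-element line array, the 2 x 2 control points per zone, the plane-wave target, the Tikhonov regularization term, and all function and variable names.

```python
import cmath
import math

C = 343.0  # assumed speed of sound in m/s


def transfer(src, pt, k):
    """Free-field point-source response; an assumed stand-in for a measured transfer function."""
    r = math.dist(src, pt)
    return cmath.exp(-1j * k * r) / (4 * math.pi * r)


def solve(a, b):
    """Solve the complex system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0j] * n
    for r in reversed(range(n)):  # back substitution
        s = sum(m[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def pressure_matching(h, d, reg):
    """Weights w minimizing ||h w - d||^2 + reg * ||w||^2 via the regularized normal equations."""
    rows, cols = len(h), len(h[0])
    a = [[sum(h[i][p].conjugate() * h[i][q] for i in range(rows)) + (reg if p == q else 0.0)
          for q in range(cols)] for p in range(cols)]
    rhs = [sum(h[i][p].conjugate() * d[i] for i in range(rows)) for p in range(cols)]
    return solve(a, rhs)


def pressures(h, w):
    """Sound pressure at each control point for loudspeaker weights w."""
    return [sum(hij * wj for hij, wj in zip(row, w)) for row in h]


def energy(p):
    """Mean squared pressure magnitude over a set of control points."""
    return sum(abs(v) ** 2 for v in p) / len(p)


k = 2 * math.pi * 500.0 / C  # wavenumber at 500 Hz
speakers = [(-0.7 + 0.2 * i, 0.0) for i in range(8)]  # hypothetical line array on the x axis
bright = [(-0.55 + 0.1 * i, 1.5 + 0.1 * j) for i in range(2) for j in range(2)]
dark = [(0.45 + 0.1 * i, 1.5 + 0.1 * j) for i in range(2) for j in range(2)]

# Transfer matrix: one row per control point, one column per loudspeaker.
h = [[transfer(s, pt, k) for s in speakers] for pt in bright + dark]
# Target: unit-amplitude plane wave travelling along +y in the bright zone, silence in the dark zone.
d = [cmath.exp(-1j * k * y) for _, y in bright] + [0j] * len(dark)

w = pressure_matching(h, d, reg=1e-6)
p = pressures(h, w)
contrast_db = 10 * math.log10(energy(p[:len(bright)]) / energy(p[len(bright):]))
print(f"acoustic contrast: {contrast_db:.1f} dB")
```

Because the complex pressure is matched rather than its energy, the target phase (here, a plane wave travelling along +y) fixes the propagation direction in the bright zone, which a purely energy-based criterion would leave open. The acoustic contrast is only evaluated afterwards and is not part of the cost function in this sketch.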