A system for generating or reproducing sound (including amplifiers, cables, loudspeakers and room acoustics) will always affect the spectral, transient and spatial properties of the reproduced sound, often in unwanted ways. In particular, the acoustic reverberation of the room where the equipment is placed has a considerable and often detrimental effect on the perceived audio quality of the system. The effect of reverberation is often described differently depending on which frequency region is considered. At low frequencies, reverberation is often described in terms of resonances, standing waves, or so-called room modes, which affect the reproduced sound by introducing strong peaks and deep nulls at distinct frequencies in the low end of the spectrum. At higher frequencies, reverberation is generally thought of as reflections arriving at the listener's ears some time after the direct sound from the loudspeaker itself.
Sound reproduction with very high quality can generally be obtained by using matched sets of high-quality cables, amplifiers and loudspeakers, and by modifying the acoustic properties of the room using for example acoustic diffusers, Helmholtz resonators and acoustically absorbing materials. However, such passive means for improving sound quality are cumbersome, expensive, and sometimes not even feasible.
Other means for improving the quality of sound reproduction systems include active solutions based on digital filtering, often referred to as precompensation, equalization, or dereverberation.
A precompensation filter,  in FIG. 1, is then placed between the original audio signal source and the audio equipment. The dynamic properties of the sound generating system can be measured and modeled by recording the system's response to known test signals at one or several positions in the room. The filter  is then calculated and implemented to compensate for the measured properties of the system, symbolized by  in FIG. 1. In particular it is desirable that the phase and amplitude response of the compensated system is close to a pre-specified ideal response, symbolized by  in FIG. 1, in all measurement positions. In other words, it is required that the compensated sound reproduction y(t) matches the ideal yref(t) to some given degree of accuracy. The pre-distortion generated by the precompensator  is intended to counteract the distortion due to the system , such that the resulting sound reproduction has the sound characteristic of .
In order to obtain a precompensator that is robust and practically useful, it is important to realize that the model  may not be a perfect description of the real system, and the recordings of the system responses may contain disturbances due to e.g., background noise. Such measurement and modeling errors can for example be represented by adding a noise signal, e(t) in FIG. 1, to the system, yielding the measured system output ym(t). As will be described in the following, modeling errors and uncertainties about the system can also be included in the model , which is then partly parameterized by random variables with specified probability distributions.
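As an illustration of the general idea only, the following minimal sketch designs an FIR precompensation filter for a single measurement position by ordinary least squares, so that the filter convolved with a measured impulse response approximates a delayed unit pulse. It is not the design method of this disclosure. The impulse response `h`, the filter length, the delay, the choice of a pure delay as ideal response, and the function names are all hypothetical values chosen for the example, and measurement noise and model uncertainty are ignored.

```python
import numpy as np

def convolution_matrix(h, n_filter):
    """Return the matrix C such that C @ r equals np.convolve(h, r)."""
    n_out = len(h) + n_filter - 1
    C = np.zeros((n_out, n_filter))
    for k in range(n_filter):
        C[k:k + len(h), k] = h
    return C

def design_precompensator(h, n_filter, delay):
    """Least-squares FIR filter r such that h * r approximates a delayed unit pulse."""
    C = convolution_matrix(h, n_filter)
    d = np.zeros(C.shape[0])
    d[delay] = 1.0  # assumed ideal response: a pure delay
    r, *_ = np.linalg.lstsq(C, d, rcond=None)
    return r, d

# Hypothetical measured impulse response: direct sound plus two reflections.
h = np.zeros(32)
h[0], h[7], h[19] = 1.0, 0.5, -0.2

delay = 16
r, d = design_precompensator(h, n_filter=128, delay=delay)
compensated = np.convolve(h, r)

# Uncompensated reference: h shifted by the same delay, for a fair comparison.
uncompensated = np.zeros_like(d)
uncompensated[delay:delay + len(h)] = h

err_before = float(np.sum((uncompensated - d) ** 2))
err_after = float(np.sum((compensated - d) ** 2))
print(f"error energy before: {err_before:.4f}, after: {err_after:.4f}")
```

The allowed delay gives the filter room to act before the target pulse. In this toy case the response is well behaved, so the least-squares fit reduces the error energy substantially.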
Up to the physical limits of the system, it is thus, at least in theory, possible to attain an improved sound reproduction quality without the high cost of using extreme high-end audio equipment. The aim of the design could, for example, be to cancel acoustic resonances and diffraction effects caused by imperfectly built loudspeaker cabinets. Another application could be to minimize the effect of room modes (i.e., low-frequency resonance peaks and nulls) in different places of the listening room. Yet another aim could be to obtain a pleasant tonal balance and a detailed perceived stereo image.
So far, the established methods for digital precompensation of audio systems that exist on the commercial market and in the scientific literature are mainly single-channel methods, see e.g., [17]. Single-channel precompensation refers to the principle that the input signal to a loudspeaker is processed by a single filter. When single-channel precompensation is applied to a sound system containing more than one loudspeaker channel (for example a 5.1 home cinema system having five wide-band channels and a subwoofer), it means that the filters for different loudspeaker channels are determined individually and independently of each other. The extent to which each compensated loudspeaker actually attains its specified ideal target response in all measurement positions depends mainly on the following two factors:
1. If the impulse response of the loudspeaker and the room is not entirely of minimum phase character, then the compensating filter must be of so-called mixed phase type, in order to correct for the distortion components that are not minimum phase. As nearly all loudspeaker-room impulse responses contain non-minimum phase components [23], a minimum phase filter will be insufficient for compensating the system so that it fully reaches the target response. As the design of mixed-phase filters for audio use is considerably less straightforward than the design of minimum phase filters, most existing products for digital precompensation make use of filters that are restricted to be of minimum phase type.
2. If the impulse response of a loudspeaker varies between different measurement positions, as is normally the case in a room, then a single filter will not be able to fully correct the response of the loudspeaker at all measurement positions due to conflicting requirements at different positions. In an average sense the response of the compensated system may be closer to the target, but due to the spatial variability of the system, there will always be remaining errors at each measurement position. Moreover, if a mixed-phase compensator is used, then errors may occur in the form of so-called "pre-ringings" unless the compensator is designed with great caution [5]. Pre-ringing errors are known to be perceptually much more objectionable than post-ringings. In [5, 6] it is shown how to design a mixed-phase compensator that alleviates the problem of pre-ringing errors, by correcting only for the non-minimum phase distortion that is common to all measurement positions.
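The minimum phase property referred to in the first factor can be checked numerically for a finite impulse response by locating the zeros of its transfer function. The sketch below is an illustration under simplifying assumptions: the impulse response is a short FIR sequence with a nonzero first sample, and the two example responses are hypothetical. It also shows that a minimum phase response and its time-reversed, non-minimum phase counterpart share the same magnitude response, which is why a correction restricted to minimum phase cannot distinguish between them.

```python
import numpy as np

def is_minimum_phase(h, tol=1e-9):
    """True if all zeros of H(z) = sum_n h[n] z^-n lie strictly inside the unit circle.

    Assumes h[0] != 0, so that np.roots sees the full polynomial
    h[0] z^N + h[1] z^(N-1) + ... + h[N], whose roots are the zeros of H(z).
    """
    zeros = np.roots(h)
    return bool(np.all(np.abs(zeros) < 1.0 - tol))

# Hypothetical two-tap responses: the second is the first reversed in time.
h_min = np.array([1.0, 0.5])    # zero at z = -0.5, inside the unit circle
h_mixed = np.array([0.5, 1.0])  # zero at z = -2.0, outside the unit circle

print(is_minimum_phase(h_min), is_minimum_phase(h_mixed))

# Both responses have identical magnitude spectra; only the phase differs.
mag_min = np.abs(np.fft.fft(h_min, 64))
mag_mixed = np.abs(np.fft.fft(h_mixed, 64))
same_magnitude = bool(np.allclose(mag_min, mag_mixed))
```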
Thus, the method of single-channel compensation has a potential limitation in that it can only correct the impulse and frequency responses in an average sense when multiple measurement positions are considered. In an acoustic environment where the original response of a loudspeaker varies a lot between measurement positions, this variability will remain also in the responses of the compensated loudspeaker, although the compensated system's performance on average is closer to the target performance. Moreover, designing a compensator with respect to only one measurement position is not a realistic option because it is well known that single-point designs yield filters that are extremely non-robust and degrade the system's performance at all other positions in the room [13, 14].
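The averaging behavior described above can be illustrated with a small numerical experiment. In the sketch below, which is an assumption-laden toy example rather than any of the cited methods, four hypothetical impulse responses share a common early part and differ in their later samples. A single FIR filter is fitted by least squares to all four positions at once. The room responses, the noise level, the filter length and the delay are invented for the example.

```python
import numpy as np

def convolution_matrix(h, n_filter):
    """Return the matrix C such that C @ r equals np.convolve(h, r)."""
    n_out = len(h) + n_filter - 1
    C = np.zeros((n_out, n_filter))
    for k in range(n_filter):
        C[k:k + len(h), k] = h
    return C

rng = np.random.default_rng(0)
n_h, n_filter, delay, n_pos = 32, 64, 16, 4

# Hypothetical responses: a part common to all positions (direct sound and one
# early reflection) plus position-dependent late samples.
common = np.zeros(n_h)
common[0], common[5] = 1.0, 0.5
late = (np.arange(n_h) > 8).astype(float)
responses = [common + 0.1 * late * rng.standard_normal(n_h) for _ in range(n_pos)]

# Assumed target at every position: a pure delay.
d = np.zeros(n_h + n_filter - 1)
d[delay] = 1.0

# One filter, all positions: stack the per-position equations and solve jointly.
A = np.vstack([convolution_matrix(h, n_filter) for h in responses])
b = np.tile(d, n_pos)
r, *_ = np.linalg.lstsq(A, b, rcond=None)

def error_energy(h, filt):
    """Squared error between the filtered response and the target."""
    return float(np.sum((np.convolve(h, filt) - d) ** 2))

pure_delay = np.zeros(n_filter)
pure_delay[delay] = 1.0  # "no compensation" reference with the same delay
err_before = [error_energy(h, pure_delay) for h in responses]
err_after = [error_energy(h, r) for h in responses]

for i, (eb, ea) in enumerate(zip(err_before, err_after)):
    print(f"position {i}: before {eb:.4f}, after {ea:.4f}")
```

In this toy setting the mean error over the positions decreases, but a residual error remains at every position because one filter cannot satisfy the conflicting requirements of all responses simultaneously.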
It can thus be concluded that single-channel precompensation methods are most effective for correcting degradations that are systematic over the spatial region of interest, i.e., distortion components that are common, or at least nearly common, to all measurement positions. Typically, such systematic degradations are caused by the loudspeaker itself, or by reflecting surfaces very close to the loudspeaker, or by the room acoustics at low frequencies, where the wavelength is large compared to the region of interest. If a sound reproduction system, including its acoustic environment, is such that its spatially varying distortion dominates over its spatially common distortion, then the sound quality improvement offered by single-channel methods is unfortunately rather small.
Considering the above, one may ask whether a precompensation strategy of higher performance can be obtained, for example by using loudspeakers and filter structures in a more flexible way than what is suggested by the established single-channel methods. In the acoustics-related research literature, a few different strategies that go beyond traditional single-channel filtering have been identified [2, 7, 9, 10, 11, 12, 18, 21, 22, 24, 25, 29, 33, 34]. In summary, the known methods can be grouped into the following categories.
1. The methods in the first category are based on physical insight about room acoustics and particularly the acoustic coupling between loudspeakers and the low-frequency resonance modes of the room. It is well known that a carefully selected physical placement of loudspeakers and the use of several subwoofers are helpful to reduce the effect of room modes [34].
2. Another principle is the source-sink method [7, 8, 33], where the room modes are reduced by positioning a number of subwoofers symmetrically in the room, whereafter delay, gain and phase adjustments are applied to the different subwoofer channels. According to this method, the subwoofers at the front wall of the room act as sources of sound, whereas the delay-, gain- and phase-adjusted subwoofers at the rear wall act as sinks, i.e., absorbers of sound, which cancel the low-frequency reflections from the rear wall. The method is, however, restricted to work only on the lowest part of the spectrum (below 150 Hz), and the type of adjustments made to the subwoofer signals are very primitive.
3. A third important method is modal equalization [16, 21], in which the modal resonances and their decay times are equalized by digital prefilters. This method involves an explicit identification of the center frequencies and decay times of single room modes, and it is limited to work at very low frequencies (typically only below 200 Hz) where the room resonances are assumed to be distinct and well separated on the frequency axis. Reference [16] discusses two possible approaches: Type I, which is a single-channel equalizer, and Type II, which uses two or more channels for canceling the room modes. It is acknowledged in [16] that the filter design for Type II modal equalization is not straightforward when more than two channels are used, and an explicit solution to the multichannel design case is not presented. Altogether, the approach is unsatisfactory since it relies on assumptions that are in general not fulfilled in a typical room, for example that all modes subject to equalization are well separated and estimable with high precision.
4. A fourth category of methods is based on multichannel filter design under various objectives. One objective is active noise control, where the sound from one or several loudspeakers is used to cancel unwanted acoustic disturbances, see e.g., [11]. A second objective is to obtain an exact reproduction of specific sound pressures in a small number of spatial positions, typically the positions of the ears of a human listener. This approach is often referred to as crosstalk cancellation, virtual acoustic imaging, or transaural stereo [2, 22, 24, 25]. A drawback of this approach is that its performance is extremely sensitive to small movements of the listener, and it is particularly non-robust in normal reverberant rooms. A third common objective relates to "holophonic" audio rendering techniques such as Wave Field Synthesis (WFS) and High Order Ambisonics (HOA) [10, 28, 30], which aim at reproducing arbitrary sound fields over large regions in two or three dimensions, using massive loudspeaker arrays of 50 or more loudspeakers. A number of multichannel filter designs have been proposed in order to improve the performance of WFS, HOA and related techniques, see e.g., [9, 12, 18, 29]. A fourth objective concerns the minimization of destructive phase interaction in the cross-over frequency region, between subwoofer and satellite loudspeakers in sound systems employing so-called bass management [3]. The multichannel filter designs mentioned above are not suitable as solutions to the general loudspeaker precompensation problem. First, they are significantly different in their objectives compared to the single-channel precompensation methods. Second, the proposed computational methods yield filters with unsatisfactory properties. For example, most methods design filters in the frequency domain without regard to broadband filter behavior such as causality, the maximum allowed delay through the system, and the level and duration of pre-ringing errors.
None of the multichannel filter design methods in the prior art are useful for the purpose of robust wide-band loudspeaker/room compensation of an existing loudspeaker set-up for stereo or multichannel audio reproduction.