The placement of a number of loudspeakers relative to the listening position has a large impact on the listening experience and the perceived spaciousness of sound. Often, however, the loudspeakers are not placed in the optimal positions, either because other interior design considerations take higher priority or because the desired listening position moves. This can to some extent be compensated for by preprocessing the loudspeaker signals. In order to apply the correct preprocessing, however, the location of the loudspeakers relative to the listening position must be known.
Existing approaches to solving this loudspeaker localization problem can roughly be divided into two groups. In the first group, synthetic test signals such as sinusoidal sweeps or maximum length sequences (MLS) are used as calibration signals. This has the advantage of high estimation accuracy, but it also requires the user to actively start the calibration sequence every time that, e.g., the listening position or the loudspeaker locations change. The second group of methods addresses this by adding a calibration signal to the desired audio signal. The calibration signal is shaped psycho-acoustically and hidden inside the audio signal so that it is inaudible to the listener. Consequently, the energy of the calibration signal is low compared to the energy of the audio signal. This is a problem since the audio signal is considered to be “noise” in the source localization algorithm, which degrades the estimation accuracy.
It is also known to use the audio signal itself for source localization. However, audio signals are much more difficult to work with since they are heavily correlated both in time and between the loudspeaker channels, and they have an unknown frequency content. Consequently, it is hard to estimate impulse responses, and the simple cross-correlation methods for loudspeaker localization fail. Synthetic calibration signals, on the other hand, can be designed to be uncorrelated and to have a desirable frequency content. Thus, simple cross-correlation methods and impulse response peak picking can be used to compute the distances and/or directions of arrival (DOAs) between the loudspeakers and/or to the listening position.
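As an illustration of the simple cross-correlation approach mentioned above, the following minimal Python sketch estimates a loudspeaker-to-microphone distance from the peak of the cross-correlation between a played synthetic calibration signal and its recording. It is a sketch under stated assumptions and not a method taken from any of the cited documents: the function name `estimate_distance`, the white-noise calibration signal, the sampling rate, the speed of sound of 343 m/s, and the simulated single-path delay without reverberation are all hypothetical choices made for the example.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def estimate_distance(played, recorded, sample_rate):
    """Estimate a distance in metres from the cross-correlation peak.

    Hypothetical helper: assumes a single dominant direct path and
    sample-synchronous playback and recording.
    """
    # Full cross-correlation of the recording against the played signal.
    xcorr = np.correlate(recorded, played, mode="full")
    # Lag (in samples) associated with each element of xcorr.
    lags = np.arange(-(len(played) - 1), len(recorded))
    # Sound cannot arrive before it is played, so keep non-negative lags only.
    valid = lags >= 0
    delay_samples = lags[valid][np.argmax(np.abs(xcorr[valid]))]
    return delay_samples / sample_rate * SPEED_OF_SOUND


# Simulated example: white-noise calibration signal delayed by 280 samples.
rng = np.random.default_rng(0)
fs = 48_000
calibration = rng.standard_normal(4096)
true_delay = 280
recorded = np.zeros(8192)
recorded[true_delay:true_delay + len(calibration)] = 0.5 * calibration
recorded += 0.05 * rng.standard_normal(len(recorded))  # measurement noise

distance = estimate_distance(calibration, recorded, fs)
print(f"estimated distance: {distance:.2f} m")
```

Because the white-noise signal is uncorrelated in time, the cross-correlation has a single sharp peak at the true delay. With a strongly correlated music signal in place of `calibration`, the peak would typically be broader and less reliable, which is the difficulty described above.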
Document US 2006/0062398 discloses estimation of a distance from a loudspeaker to a microphone using a downsampled adaptive filter to find the impulse response. The microphone is not located in the same place as another loudspeaker.
Document U.S. Pat. No. 8,279,709 discloses localization using only the desired audio signals. Specifically, it considers the case of estimating the distance between two loudspeakers playing back a stereo music signal. Distances between all the loudspeaker pairs in a set of loudspeakers can be used to form a Euclidean distance matrix to which the positions of the loudspeakers can be fitted using, e.g., the multidimensional scaling (MDS) algorithm or the algorithm by Crocco known from prior art.
In U.S. Pat. No. 8,279,709 it is assumed that a microphone is mounted on every loudspeaker, so that the two are approximately co-located; each such loudspeaker-microphone unit is referred to as a transceiver. This assumption is used in the proposed estimator of the distance to take into account that both transceivers in a transceiver pair should measure the same distance. This increases the robustness of the estimator.
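The fitting of loudspeaker positions to a matrix of pairwise distances, mentioned above in connection with MDS, can be illustrated by the following minimal Python sketch of classical (Torgerson) MDS. It is an assumption-laden example rather than the procedure of the cited document: the function name `classical_mds`, the two-dimensional layout, and the noise-free example positions are hypothetical, and the input is taken to hold plain (not squared) distances. The recovered positions are determined only up to rotation, reflection, and translation.

```python
import numpy as np


def classical_mds(distances, dim=2):
    """Fit point positions to pairwise distances using classical MDS.

    Hypothetical helper: `distances` is an (n, n) symmetric matrix of
    plain Euclidean distances; returns an (n, dim) array of positions.
    """
    d2 = np.asarray(distances, dtype=float) ** 2
    n = d2.shape[0]
    # Double centering turns squared distances into a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering
    # The largest eigenpairs of the Gram matrix give the coordinates.
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dim]
    # Clip small negative eigenvalues that can arise from noisy distances.
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale


# Example: four loudspeakers at assumed positions (metres).
true_positions = np.array([[0.0, 0.0], [3.0, 0.0], [3.0, 2.0], [0.5, 2.5]])
diff = true_positions[:, None, :] - true_positions[None, :, :]
edm = np.linalg.norm(diff, axis=-1)  # pairwise distance matrix

estimate = classical_mds(edm)
# Compare geometry through pairwise distances, which are invariant to
# the rotation/reflection/translation ambiguity of the solution.
recovered = np.linalg.norm(
    estimate[:, None, :] - estimate[None, :, :], axis=-1
)
print(np.allclose(recovered, edm))
```

With measured distances the matrix is only approximately Euclidean, in which case the clipped eigenvalues yield a least-squares type approximation of the layout rather than an exact reconstruction.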