Binaural audio with loudspeakers (BAL), also known as transauralization, aims to reproduce, at the entrance of each of the listener's ear canals, the sound pressure signals recorded on only the ipsilateral channel of a stereo signal. That is, only the sound signal of the left stereo channel is reproduced at the left ear and only the sound signal of the right stereo channel is reproduced at the right ear. For example, if the source signal was encoded with a head-related transfer function (HRTF) of the listener, or includes the proper interaural time difference (ITD) and interaural level difference (ILD) cues, then delivering the signal on each channel of the stereo signal to the ipsilateral ear, and only to that ear, would ideally guarantee that the ear-brain system receives the cues it needs to hear an accurate 3-dimensional (3-D) reproduction of a recorded soundfield.
However, an unintended consequence of binaural audio playback through loudspeakers is crosstalk. Crosstalk occurs when the left ear (right ear) hears sounds from the right (left) audio channel, originating from the right speaker (left speaker). In other words, crosstalk occurs when the sound on one of the stereo channels is heard by the contralateral ear of the listener.
Crosstalk corrupts HRTF information and ITD or ILD cues so that a listener may not properly or completely comprehend the soundfield's binaural cues that are embedded in the recording. Therefore, approaching the goal of BAL requires an effective cancellation of this unintended crosstalk, i.e. crosstalk cancellation or XTC for short.
While there are various techniques for effecting some level of crosstalk cancellation (XTC) for a two-loudspeaker system, they all have one or more of the following drawbacks:
D1: Severe spectral coloration of the sound heard by the listener, even if that listener is sitting in the intended sweet spot.
D2: Useful XTC levels are reached only over limited frequency ranges of the audio band.
D3: Severe dynamic range loss when the sound is processed through the XTC filter or processor (while avoiding distortion and/or clipping).
The above drawbacks can be seen by analyzing XTC using the most fundamental formulation of the XTC problem, that is, by examining the inverse of the system transfer matrix (as will be shown and discussed below) that describes sound propagation from the loudspeakers to the ears of the listener.
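As a rough illustration of why inverting the system transfer matrix leads to these drawbacks, the following is a minimal sketch under an assumed model, not the formulation developed in this text. It assumes an idealized free-field geometry with two point-source loudspeakers and no head shadowing, so that the normalized transfer matrix has unit ipsilateral terms and contralateral terms g*exp(-i*omega*tau). The names G_RATIO and DELAY_TAU_S and their values are illustrative assumptions.

```python
import cmath
import math

G_RATIO = 0.985      # assumed ratio of ipsilateral to contralateral path length
DELAY_TAU_S = 68e-6  # assumed extra contralateral travel time, in seconds


def transfer_matrix(freq_hz, g=G_RATIO, tau=DELAY_TAU_S):
    """Normalized loudspeaker-to-ear matrix for the assumed free-field model."""
    x = g * cmath.exp(-2j * math.pi * freq_hz * tau)  # crosstalk path
    return [[1 + 0j, x], [x, 1 + 0j]]


def invert_2x2(m):
    """Plain inverse of a 2x2 complex matrix (no regularization)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def max_gain(h):
    """Largest singular value of a matrix of the form [[p, q], [q, p]]."""
    p, q = h[0][0], h[0][1]
    return max(abs(p + q), abs(p - q))


# Worst-case gain the inverse filter demands from the loudspeakers, 0 to 20 kHz.
freqs_hz = range(0, 20001, 10)
gains = [max_gain(invert_2x2(transfer_matrix(f))) for f in freqs_hz]
peak_gain_db = 20 * math.log10(max(gains))

print(f"peak filter gain over the band: {peak_gain_db:.1f} dB")
```

Under these assumptions the worst-case gain is bounded by 1/(1 - g), which is about 36.5 dB for g = 0.985. A filter with peaks of that size must be scaled down to avoid clipping, which is one way to picture the dynamic range loss of Drawback D3.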
Constant-parameter (non-frequency-dependent) regularization is commonly used in XTC filter design to make the inversion of the system transfer matrix better behaved. While it may alleviate some of Drawback D3, it inherently introduces spectral artifacts of its own: in exchange for reducing the amplitude of the spectral peaks in the inverted transfer matrix, it produces undesirable narrow-band artifacts at higher frequencies and a rolloff at lower frequencies at the loudspeakers. It also does little to alleviate the other two drawbacks (D1 and D2).
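The trade-off described above can be sketched numerically. The example below is only an illustration under stated assumptions: it uses an idealized free-field two-point-source transfer matrix (unit ipsilateral terms, contralateral terms g*exp(-i*omega*tau)) and one common Tikhonov-style form of the regularized inverse, (C^H C + beta*I)^-1 C^H. The values of G_RATIO, DELAY_TAU_S and BETA are hypothetical and are not taken from this text.

```python
import cmath
import math

G_RATIO = 0.985      # assumed path-length ratio (illustrative)
DELAY_TAU_S = 68e-6  # assumed contralateral delay in seconds (illustrative)
BETA = 0.05          # assumed constant regularization parameter (illustrative)


def transfer_matrix(freq_hz, g=G_RATIO, tau=DELAY_TAU_S):
    """Normalized loudspeaker-to-ear matrix for the assumed free-field model."""
    x = g * cmath.exp(-2j * math.pi * freq_hz * tau)
    return [[1 + 0j, x], [x, 1 + 0j]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def conj_transpose(m):
    """Conjugate (Hermitian) transpose of a 2x2 matrix."""
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]


def invert_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def regularized_inverse(c, beta):
    """Tikhonov-style inverse: (C^H C + beta*I)^-1 C^H."""
    ch = conj_transpose(c)
    gram = matmul(ch, c)
    gram[0][0] += beta
    gram[1][1] += beta
    return matmul(invert_2x2(gram), ch)


def max_gain(h):
    """Largest singular value of a matrix of the form [[p, q], [q, p]]."""
    p, q = h[0][0], h[0][1]
    return max(abs(p + q), abs(p - q))


freqs_hz = range(0, 20001, 5)
plain = [max_gain(invert_2x2(transfer_matrix(f))) for f in freqs_hz]
regularized = [max_gain(regularized_inverse(transfer_matrix(f), BETA))
               for f in freqs_hz]

peak_plain = max(plain)
peak_regularized = max(regularized)
dc_gain_regularized = regularized[0]  # filter gain at 0 Hz

print(f"peak gain, plain inverse: {20 * math.log10(peak_plain):.1f} dB")
print(f"peak gain, regularized:   {20 * math.log10(peak_regularized):.1f} dB")
print(f"regularized gain at 0 Hz: {20 * math.log10(dc_gain_regularized):.1f} dB")
```

In this sketch the regularized filter gain cannot exceed 1/(2*sqrt(beta)), so the tall peaks of the plain inverse are capped. The gain at the lowest frequencies, however, falls below that cap, which is consistent with the low-frequency rolloff noted above.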
Prior-art frequency-dependent regularization, even when coupled with an effective optimization scheme, is not enough to do away with Drawbacks D1, D2 and D3.
Previous XTC filter design methods based on system transfer matrix inversion (with or without regularization) strive to maintain a flat amplitude vs. frequency response at the ears of the listener by imposing a non-flat amplitude vs. frequency response at the loudspeakers (as explained below). This causes a loss in the dynamic range of the processed sound and, for reasons that will be explained below, leads to a spectral coloration of the sound as heard by the listener, even if the listener is sitting in the intended sweet spot.
Therefore, while previous methods are useful for designing XTC filters that can inherently correct for non-idealities in the amplitude vs. frequency response of the playback hardware and loudspeakers, they do not address all of Drawbacks D1, D2 and D3.