This invention pertains to audio signal processing, and specifically to a system and method for crosstalk cancellation.
There are a number of settings in which separate audio signals are prepared for the left and right ears of a listener. Such signals are referred to as binaural signals, and are distinct from stereo signals in that the left and right binaural channels are intended to be heard only by the respective left and right ears of the listener.
Binaural signals are typically used to convey spatial information about the sounds presented. It turns out that a sense of sound source location is created by subtle features imposed on the signals arriving at the left and right ears of the listener [5, 6, 7]. By separately processing left-ear and right-ear signals, as illustrated in FIG. 1, a sound source can be made to appear at any desired location in a listener's perceptual space.
Such synthetic spatial audio—commonly referred to as 3D audio—has application to video games, teleconferencing, and virtual environments, wherein each sound may be processed so as to appear to originate from its generating object. Another 3D audio application is placing “virtual” speakers about a listener, for instance in a standard home theater surround sound configuration as shown in FIG. 2. Here, each of five surround signals 30, 40, 50, 60, 70 is processed according to its location 34, 44, 54, 64, 74 to form left-ear and right-ear signals 32, 42, 52, 62, 72 and 33, 43, 53, 63, 73, which are summed to form the left-ear and right-ear channels 35 and 36 of a binaural signal. Presenting the binaural signal to a listener over headphones gives the impression of a five-speaker surround system, though only the two binaural channels are used.
In all of these applications, headphones or similar transducers are often used to ensure that the left and right binaural channels are delivered, respectively, to the left and right ears of the listener [5, pp. 217-220]. If the binaural signal were played through stereo speakers configured as shown in FIG. 4, each listener ear would hear both binaural channels. This mixing of the left and right binaural channels, called crosstalk, can significantly degrade the spatial cues in the binaural signal, diminishing the listening experience.
There are, however, situations, such as an arcade game, in which headphones or earphones are impractical and stereo speakers must be used to present binaural material. In [1], Atal and Schroeder presented a system, called a crosstalk canceler, that processes a binaural signal to develop a pair of speaker signals that deliver the original binaural signal to a properly positioned listener.
The system relies on differences among the transfer functions between the two speakers and the two ears. The basic idea is to cancel the crosstalk appearing in the right ear from the left speaker by sending a negative filtered version of the left speaker signal out the right speaker. The filtering is such that the crosstalk from the left speaker and the canceling signal from the right speaker arrive at the right ear simultaneously as negative replicas of each other, and sum to zero. Left ear crosstalk from the right speaker is similarly eliminated.
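The cancellation idea can be illustrated at a single frequency. The following is a minimal sketch, not the formulation of [1]: it assumes a left-right symmetric arrangement in which each speaker reaches the same-side ear through a complex gain `h_ipsi` and the opposite ear through `h_contra`. Both names and their numeric values are hypothetical. Under that assumption, the speaker signals are obtained by inverting the 2x2 speaker-to-ear transfer matrix.

```python
import cmath


def speaker_signals(b_left, b_right, h_ipsi, h_contra):
    """Return (s_left, s_right) such that the ears receive (b_left, b_right).

    Inverts the symmetric matrix [[h_ipsi, h_contra], [h_contra, h_ipsi]].
    """
    det = h_ipsi * h_ipsi - h_contra * h_contra
    if abs(det) < 1e-12:
        raise ValueError("transfer matrix is singular at this frequency")
    # Each speaker carries its own channel minus a filtered copy of the other.
    s_left = (h_ipsi * b_left - h_contra * b_right) / det
    s_right = (h_ipsi * b_right - h_contra * b_left) / det
    return s_left, s_right


def ear_signals(s_left, s_right, h_ipsi, h_contra):
    """Acoustic mixing: each ear hears both speakers."""
    e_left = h_ipsi * s_left + h_contra * s_right
    e_right = h_contra * s_left + h_ipsi * s_right
    return e_left, e_right


# Hypothetical single-frequency gains: the contralateral path is weaker
# and delayed (phase-lagged) relative to the ipsilateral path.
h_ipsi = 1.0 + 0.0j
h_contra = 0.7 * cmath.exp(-1j * 0.9)

# A binaural signal present only in the left channel.
b_left, b_right = 1.0 + 0.0j, 0.0 + 0.0j

s_left, s_right = speaker_signals(b_left, b_right, h_ipsi, h_contra)
e_left, e_right = ear_signals(s_left, s_right, h_ipsi, h_contra)
```

With these assumed gains, the right speaker emits a nonzero signal even though the right binaural channel is silent. That signal is the negative filtered copy that nulls the left speaker's crosstalk at the right ear.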
The crosstalk canceler proposed in [1] can be very effective, but has several drawbacks which limit its usefulness. First, so that the cancellation signal exactly cancels the crosstalk signal, the listener must be carefully positioned at the so-called sweet spot. In addition, the transition between effective cancellation in the sweet spot and no cancellation out of the sweet spot is very abrupt, making it difficult for listeners to find the sweet spot. Consider a 5 kHz signal, whose wavelength in air is between two and three inches. The listener need only move his head about an inch closer to one speaker than the other, shifting the relative arrival of the two signals by roughly half a wavelength, to turn the perfect cancellation between the crosstalk and canceling signals into perfect reinforcement between the two.
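The sensitivity to position can be checked with a short calculation. This is a sketch under stated assumptions: a sound speed of 343 m/s (air near room temperature), and unit-amplitude crosstalk and canceling tones that differ only by a path-length mismatch. The function names are illustrative.

```python
import cmath
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air near 20 degrees C
METERS_PER_INCH = 0.0254


def wavelength_m(frequency_hz):
    """Wavelength in meters of a tone at the given frequency."""
    return SPEED_OF_SOUND_M_S / frequency_hz


def residual_magnitude(path_mismatch_m, frequency_hz):
    """Magnitude left at the ear when a unit crosstalk tone meets its
    negated copy arriving over a path longer by path_mismatch_m."""
    phase = 2.0 * math.pi * path_mismatch_m / wavelength_m(frequency_hz)
    return abs(1.0 - cmath.exp(-1j * phase))


lam = wavelength_m(5000.0)
wavelength_in = lam / METERS_PER_INCH                 # about 2.7 inches
in_sweet_spot = residual_magnitude(0.0, 5000.0)       # full cancellation
half_wave_off = residual_magnitude(lam / 2.0, 5000.0) # full reinforcement
```

In this model a mismatch of half a wavelength, a little over an inch at 5 kHz, doubles the crosstalk instead of removing it.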
In addition to restricting listener movement, the canceler [1] is sensitive to the shape of the listener's head and ears. To get effective cancellation, particularly at high frequencies, the canceling signal filter should be tailored to the listener.
The second drawback has to do with the timbre or equalization of the canceled signal as compared to that of the original binaural signal. Listeners in the sweet spot sometimes sense that the canceler output is lacking in low-frequency energy compared to the original binaural signal. Listeners away from the sweet spot complain of phase artifacts and a position-sensitive equalization. (Note that the apparent equalization away from the sweet spot is important in some applications. For example, consider a television equipped with stereo speakers and virtual surround sound processing as shown in FIG. 3. While the crosstalk canceler can deliver the virtual surround binaural signal to listener 80 in the sweet spot, the crosstalk canceler should not compromise the listening experience of those away from the sweet spot.)
To address the restrictions on listener movement, Cooper and Bauck in [2] proposed a crosstalk canceler which cancels only the low frequencies; the high-frequency portion of the binaural input is sent to the output unchanged. Many audio signals have their energy concentrated below a few kilohertz, so that canceling only those frequencies should not significantly diminish the cancellation effect. Because the wavelengths for the canceled portion of the binaural signal are relatively large, the listener has greater freedom of movement before perceiving a change in cancellation effectiveness. Essentially, the canceler trades a less effective cancellation in the sweet spot for a broader sweet spot.
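The link between wavelength and freedom of movement can be made concrete. The sketch below is not taken from [2]. It assumes a sound speed of 343 m/s and models the crosstalk and canceling signals as unit-amplitude tones, so that a path mismatch d leaves a residual of 2|sin(pi d / wavelength)|. The function name and the 10 dB threshold are hypothetical choices.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air near 20 degrees C


def movement_tolerance_m(frequency_hz, min_cancellation_db=10.0):
    """Largest path mismatch (meters) that still keeps the residual
    crosstalk at least min_cancellation_db below the uncanceled level.

    Assumes a positive min_cancellation_db, so the asin argument is < 1.
    """
    residual = 10.0 ** (-min_cancellation_db / 20.0)
    wavelength = SPEED_OF_SOUND_M_S / frequency_hz
    # Solve 2 * sin(pi * d / wavelength) = residual for the smallest d.
    return wavelength / math.pi * math.asin(residual / 2.0)


# Tolerance in centimeters at a few illustrative frequencies.
tolerances_cm = {
    f: 100.0 * movement_tolerance_m(f)
    for f in (500.0, 1000.0, 2000.0, 5000.0)
}
```

In this model the tolerance is inversely proportional to frequency, so restricting cancellation to the band below a few kilohertz widens the region over which cancellation remains effective.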
In [3, 4], Cooper and Bauck present a canceler equalization based on the observation that each canceler has a set of so-called “null canceler” frequencies at which the canceling signal filter is orthogonal to (that is, ±90° out of phase from) the direct signal filter. The proposed equalization inverts the sum of the power in the direct and canceling filters at the null canceler frequencies. This equalization is an improvement over the one implied in [1] in that listeners away from the sweet spot hear few artifacts, and those in the sweet spot experience less of a timbre change. However, for certain kinds of source material, a timbre change is still noticeable for listeners in and out of the sweet spot.
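The two quantities named in this description, the orthogonality condition and the inverse of the summed filter power, can be sketched at a single frequency. This is one possible reading of the description and is not the equalizer of [3, 4] itself. The complex gains `direct` and `canceling`, the function names, and the choice of an amplitude gain equal to the reciprocal square root of the summed power are all assumptions.

```python
import math


def is_null_canceler(direct, canceling, tol=1e-9):
    """True when the two complex gains are +/-90 degrees apart,
    i.e. the real part of direct * conj(canceling) vanishes."""
    cross = (direct * canceling.conjugate()).real
    return abs(cross) <= tol * abs(direct) * abs(canceling)


def power_sum_gain(direct, canceling):
    """Amplitude gain that normalizes the summed power of both filters."""
    return 1.0 / math.sqrt(abs(direct) ** 2 + abs(canceling) ** 2)


# Hypothetical gains at one frequency: canceling path 90 degrees ahead.
direct = 1.0 + 0.0j
canceling = 0.0 + 0.5j

orthogonal = is_null_canceler(direct, canceling)
gain = power_sum_gain(direct, canceling)
equalized_power = abs(gain * direct) ** 2 + abs(gain * canceling) ** 2
```

Applied to both filters, this gain brings their summed power to unity at the frequency in question.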
Therefore it is an object of the present invention to provide a crosstalk canceler allowing greater listener movement while maintaining effective cancellation, and having an equalization which leaves the input binaural signal uncolored. Another object is to develop a canceler which is insensitive to listener head and ear acoustic properties. It is also an object of the present invention to broaden the transition between effective cancellation in the sweet spot and no cancellation outside the sweet spot to help listeners find the sweet spot. Another object of the present invention is to develop a canceler which is relatively free of artifacts away from the sweet spot. Finally, it is an object of the present invention to adapt the equalization to the input signal so as to minimize timbre changes imposed by the canceler.