This invention pertains to audio signal processing, and specifically to a system and method for crosstalk cancellation.
There are a number of settings in which separate audio signals are prepared for the left and right ears of a listener. Such signals are referred to as binaural signals, and are distinct from stereo signals in that the left and right binaural channels are intended to be heard only by the respective left and right ears of the listener.
Binaural signals are typically used to convey spatial information about the sounds presented. It turns out that a sense of sound source location is created by subtle features imposed on the signals arriving at the left and right ears of the listener [5, 6, 7]. By separately processing left-ear and right-ear signals, as illustrated in FIG. 1, a sound source can be made to appear at any desired location in a listener's perceptual space.
Such synthetic spatial audio, commonly referred to as 3D audio, has application to video games, teleconferencing, and virtual environments, wherein each sound may be processed so as to appear to originate from its generating object. Another 3D audio application is placing "virtual" speakers about a listener, for instance in a standard home theater surround sound configuration as shown in FIG. 2. Here, each of five surround signals 30, 40, 50, 60, 70 is processed according to its location 34, 44, 54, 64, 74 to form left-ear and right-ear signals 32, 42, 52, 62, 72 and 33, 43, 53, 63, 73, which are summed to form the left-ear and right-ear channels 35 and 36 of a binaural signal. Presenting the binaural signal to a listener over headphones gives the impression of a five-speaker surround system, though only the two binaural channels are used.
In all of these applications, headphones or similar transducers are often used to ensure that the left and right binaural channels are delivered, respectively, to the left and right ears of the listener [5, pp. 217-220]. If the binaural signal were played through stereo speakers configured as shown in FIG. 4, each listener ear would hear both binaural channels. This mixing of the left and right binaural channels, called crosstalk, can significantly degrade the spatial cues in the binaural signal, diminishing the listening experience.
There are, however, situations such as in the case of an arcade game where the use of headphones or earphones is impractical, and it is desired to use stereo speakers to present binaural material. In [1], Atal and Schroeder presented a system called a crosstalk canceler for processing a binaural signal to develop a pair of speaker signals that would deliver the original binaural signal to a properly positioned listener.
The system relies on differences among the transfer functions between the two speakers and the two ears. The basic idea is to cancel the crosstalk appearing in the right ear from the left speaker by sending a negative filtered version of the left speaker signal out the right speaker. The filtering is such that the crosstalk from the left speaker and the canceling signal from the right speaker arrive at the right ear simultaneously as negative replicas of each other, and sum to zero. Left ear crosstalk from the right speaker is similarly eliminated.
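The cancellation idea described above can be illustrated at a single frequency with a minimal sketch. It assumes a symmetric, idealized acoustic model that is not taken from the cited work: each speaker reaches its near ear with unit gain and the far ear with gain g after an extra delay tau. The function names and the default values of g and tau are hypothetical.

```python
import cmath
import math

def far_ear_path(freq_hz, g=0.7, tau=0.0002):
    """Far-ear transfer relative to the near-ear path (assumed model)."""
    return g * cmath.exp(-2j * math.pi * freq_hz * tau)

def canceler_gains(freq_hz, g=0.7, tau=0.0002):
    """Return (direct, cross) speaker-filter gains at one frequency.

    These are the entries of the inverse of the assumed symmetric
    acoustic matrix [[1, x], [x, 1]].
    """
    x = far_ear_path(freq_hz, g, tau)
    det = 1 - x * x
    return 1 / det, -x / det

def ear_signals(freq_hz, left_in, right_in, g=0.7, tau=0.0002):
    """Pass a binaural pair through the canceler, then the acoustic paths."""
    x = far_ear_path(freq_hz, g, tau)
    direct, cross = canceler_gains(freq_hz, g, tau)
    # Each speaker carries its own channel plus a negative, filtered
    # version of the opposite channel.
    left_spk = direct * left_in + cross * right_in
    right_spk = cross * left_in + direct * right_in
    # Each ear hears its near speaker plus crosstalk from the far speaker.
    left_ear = left_spk + x * right_spk
    right_ear = x * left_spk + right_spk
    return left_ear, right_ear
```

Under this model a left-only input arrives at the left ear unchanged and vanishes at the right ear, because the crosstalk and the canceling signal are negative replicas of each other at that ear.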
The crosstalk canceler proposed in [1] can be very effective, but has several drawbacks which limit its usefulness. First, so that the cancellation signal exactly cancels the crosstalk signal, the listener must be carefully positioned at the so-called sweet spot. In addition, the transition between effective cancellation in the sweet spot and no cancellation out of the sweet spot is very abrupt, making it difficult for listeners to find the sweet spot. Consider a 5 kHz signal having a wavelength of about two inches. The listener only need move his head an inch closer to one speaker than the other to turn the perfect cancellation between the crosstalk and canceling signals into perfect reinforcement between the two.
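The sensitivity to head position can be checked with a short calculation. The sketch below assumes a speed of sound of 343 m/s and models the crosstalk and the canceling signal as equal-amplitude tones whose arrival differs by a path-length error; both the model and the function names are illustrative assumptions. With that speed of sound the 5 kHz wavelength comes out between two and three inches, so the figure quoted above is a rounded one.

```python
import cmath
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at about 20 degrees C
METERS_PER_INCH = 0.0254

def wavelength_inches(freq_hz):
    """Acoustic wavelength in inches at the given frequency."""
    return SPEED_OF_SOUND_M_S / freq_hz / METERS_PER_INCH

def residual_magnitude(freq_hz, path_error_inches):
    """Magnitude of crosstalk plus canceling signal for a path-length error.

    0 means perfect cancellation; 2 means perfect reinforcement.
    """
    phase = 2 * math.pi * path_error_inches / wavelength_inches(freq_hz)
    return abs(1 - cmath.exp(-1j * phase))
```

A path error of half a wavelength turns cancellation into reinforcement. The same one-inch error that is damaging at 5 kHz leaves most of the cancellation intact at 500 Hz, where the wavelength is ten times longer.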
In addition to restricting listener movement, the canceler of [1] is sensitive to the shape of the listener's head and ears. To get effective cancellation, particularly at high frequencies, the canceling signal filter should be tailored to the listener.
The second drawback has to do with the timbre or equalization of the canceled signal as compared to that of the original binaural signal. Listeners in the sweet spot sometimes sense that the canceler output is lacking in low-frequency energy compared to the original binaural signal. Listeners away from the sweet spot complain of phase artifacts and a position-sensitive equalization. (Note that the apparent equalization away from the sweet spot is important in some applications. For example, consider a television equipped with stereo speakers and virtual surround sound processing as shown in FIG. 3. While the crosstalk canceler can deliver the virtual surround binaural signal to listener 80 in the sweet spot, the crosstalk canceler should not compromise the listening experience of those away from the sweet spot.)
To address the restrictions on listener movement, Cooper and Bauck in [2] proposed a crosstalk canceler which cancels only the low frequencies; the high-frequency portion of the binaural input is sent to the output unchanged. Many audio signals have their energy concentrated below a few kilohertz, so that canceling only those frequencies should not significantly diminish the cancellation effect. Because the wavelengths for the canceled portion of the binaural signal are relatively large, the listener has greater freedom of movement before perceiving a change in cancellation effectiveness. Essentially, the canceler trades a less effective cancellation in the sweet spot for a broader sweet spot.
In [3, 4] Cooper and Bauck present a canceler equalization based on the observation that each canceler has a set of so-called "null canceler" frequencies at which the canceling signal filter is orthogonal to, that is, ±90° out of phase from, the direct signal filter. The proposed equalization inverts the sum of the power in the direct and canceling filters at the null canceler frequencies. This equalization is an improvement over the one implied in [1] in that listeners away from the sweet spot hear few artifacts, and those in the sweet spot experience less of a timbre change. However, for certain kinds of source material, a timbre change is still noticeable for listeners in and out of the sweet spot.
An embodiment of the present invention provides a crosstalk canceler allowing greater listener movement while maintaining effective cancellation, and having an equalization which leaves the input binaural signal uncolored. An embodiment of the present invention provides a canceler that is insensitive to listener head and ear acoustic properties. An embodiment of the present invention broadens the transition between effective cancellation in the sweet spot and no cancellation outside the sweet spot to help listeners find the sweet spot. An embodiment of the present invention develops a canceler that is relatively free of artifacts away from the sweet spot. An embodiment of the present invention adapts the equalization to the input signal so as to minimize timbre changes imposed by the canceler.
To provide greater listener freedom of movement, the basic idea is to cancel different frequency bands at different locations, rather than to cancel all frequency bands at the same location as is currently practiced. In this way, changes in listener position do not eliminate cancellation, but shift the part of the signal canceled. In addition, this widening of the sweet spot creates a smooth transition between regions of effective cancellation and no cancellation.
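One way to picture band-dependent cancellation locations is the following sketch. It is purely illustrative and is not the filter network of the invention: listener position is represented by the extra far-ear delay it produces, the band edges and design delays in BANDS are hypothetical, and the acoustic model (unit-gain near-ear path, far-ear gain g) is an assumption.

```python
import cmath
import math

# Hypothetical band plan: (low_hz, high_hz, design_delay_s). Each band is
# designed to cancel perfectly at a different far-ear delay, standing in
# for a different listener position.
BANDS = [
    (0.0, 1000.0, 0.00020),
    (1000.0, 3000.0, 0.00022),
    (3000.0, 20000.0, 0.00024),
]

def design_delay(freq_hz):
    """Far-ear delay at which the band containing freq_hz cancels perfectly."""
    for low, high, tau in BANDS:
        if low <= freq_hz < high:
            return tau
    raise ValueError("frequency outside the band plan")

def crosstalk_residual(freq_hz, listener_delay_s, g=0.7):
    """Crosstalk magnitude at the far ear for a listener whose actual
    far-ear delay is listener_delay_s (assumed symmetric model)."""
    w = 2 * math.pi * freq_hz
    x_design = g * cmath.exp(-1j * w * design_delay(freq_hz))
    x_actual = g * cmath.exp(-1j * w * listener_delay_s)
    # Canceler built for x_design, heard through x_actual.
    return abs(x_actual - x_design) / abs(1 - x_design * x_design)
```

In this model a listener at the 0.20 ms position gets perfect cancellation in the lowest band and partial cancellation elsewhere; moving to the 0.22 ms position shifts the perfectly canceled band rather than removing cancellation altogether.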
The expectation in canceling different frequency bands at different locations is that while the set of listener positions where some cancellation occurs is broader, the cancellation is everywhere less effective than at the sweet spot of a traditional canceler. That the sweet spot of the new canceler is larger than that of traditional cancelers was verified in listening tests using virtual surround sound, speaker spreader, and one-channel signals as the binaural input. Surprisingly, the inventive canceler was perceived to have nearly as effective cancellation in the sweet spot as the traditional canceler.
In analyzing the signal arriving at a listener's ears from a traditional canceler, it was discovered that unless the listener is precisely positioned, the signal arrives with a timbre change compared to the original binaural signal, irrespective of the cancellation effectiveness. A similar timbre change appears when the acoustic characteristics of the listener's head and ears are not those used in designing the crosstalk canceler, regardless of listener position.
The inventive canceler has an equalization which takes into account the signal arriving at the ears of a variety of listeners positioned in a range of locations. The inventive equalization is the one minimizing the timbre change over an expected range of listener positions and listener acoustic characteristics. Whereas the power spectrum of the traditional crosstalk canceler equalization has a number of peaks and valleys, that of the inventive equalization is by comparison smooth.
The timbre of output from cancelers using the inventive equalization, in fact, is less sensitive to listener position or acoustic properties than is that from the traditional canceler [1]. In addition, the inventive equalization has the unexpected benefit of reducing artifacts for listeners outside the sweet spot.
Finally, it was noted that binaural signals having a large monophonic component seemed to require an equalization with more bass emphasis than did binaural signals with a small monophonic component. Based on this observation, a canceler equalization was developed which depends on the percentage of monophonic signal energy in the input binaural signal. In this way, the canceler equalization may be adapted to the binaural input.
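The percentage of monophonic energy can be estimated in several ways. The sketch below shows one simple possibility, assuming a sum/difference decomposition of a block of time-domain samples; the function name and the decomposition are illustrative assumptions, not the measure specified by the invention.

```python
def mono_energy_fraction(left, right):
    """Fraction of binaural energy carried by the monophonic (sum) component.

    left and right are equal-length sequences of samples. Returns a value
    in [0, 1]; a silent input returns 0.0.
    """
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    # Split each sample pair into a common (mid) part and a difference (side) part.
    mid_energy = sum(((l + r) / 2) ** 2 for l, r in zip(left, right))
    side_energy = sum(((l - r) / 2) ** 2 for l, r in zip(left, right))
    total = mid_energy + side_energy
    return mid_energy / total if total > 0 else 0.0
```

Identical channels give a fraction of 1, opposite-polarity channels give 0, and uncorrelated channels of equal power give about one half. A value of this kind could then select or interpolate between equalizations with more or less bass emphasis.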
One embodiment of the invention is a crosstalk canceler providing greater listener freedom of movement comprising an input audio signal, two output channels, and a network of filters designed to eliminate crosstalk at the ear of a listener at different listener positions for different frequency bands of the input audio signal.
Another embodiment of the invention is a crosstalk canceler equalization which is less sensitive to listener acoustic characteristics and listener position, said equalization being a spectrally smooth version of an input equalization, the details of which may be optionally determined by anticipated ranges of listener acoustic characteristics and listener positions.
An additional embodiment of the invention is a crosstalk canceler having an equalization designed to leave unchanged at the output the power spectrum of a Gaussian binaural input with a specified crosscoherence. Another aspect of this embodiment is a canceler in which the crosscoherence of the input binaural signal is sensed and used to adapt the characteristics of the canceler.