The present invention relates generally to the field of audio signal processing, and more particularly to a method and apparatus for crosstalk cancellation in an audio system.
There are a number of applications in which the left and right channels of a stereo audio signal are developed independently for the left ear and the right ear of a listener. For example, systems have been developed to simulate a virtual sound source in an arbitrary perceptual location relative to a listener. These so-called virtual acoustic displays apply separate left-ear and right-ear filters to a source signal in order to mimic the acoustic effects of the human head, torso, and pinnae on source signals arriving from a particular point in space. These filters are referred to as head related transfer functions (HRTF's). HRTF's are functions of position and frequency, and they differ from one individual to another. When a sound signal is passed through a filter that implements the HRTF for a given position, the sound appears to the listener to have originated from that position.
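The left-ear and right-ear filtering described above can be illustrated with a minimal sketch, offered only as an aid to understanding. It assumes each HRTF is available as a short time-domain impulse response and applies it by direct convolution. The names `convolve`, `render_binaural`, `HRIR_LEFT`, and `HRIR_RIGHT` are hypothetical, and the impulse response values are toy numbers rather than measured data.

```python
def convolve(signal, impulse_response):
    """Direct-form FIR convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(source, hrir_left, hrir_right):
    """Filter one source signal separately for the left and right ears."""
    return convolve(source, hrir_left), convolve(source, hrir_right)


# Toy impulse responses (assumed values, not measurements) for a source on
# the listener's left: the left ear hears the sound directly, while the
# right ear hears it two samples later and at half amplitude.
HRIR_LEFT = [1.0, 0.0, 0.0]
HRIR_RIGHT = [0.0, 0.0, 0.5]

source = [1.0, -1.0]
left_ear, right_ear = render_binaural(source, HRIR_LEFT, HRIR_RIGHT)
```

In this toy case the interaural delay and level difference are the only cues modeled; a measured HRTF would also carry the spectral shaping contributed by the head, torso, and pinnae.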
HRTF's used in virtual audio displays are often tabulated based on measurements made by recording the response at a listener's ears to test signals generated by speakers placed at various locations around the listener. In systems where the measurement of each individual user is impractical, a generic HRTF known to work well for a wide population of listeners is used. Methods have been developed to efficiently implement virtual audio displays using HRTF's. U.S. Pat. No. 5,404,406 issued to Fuchigami et al., which is herein incorporated by reference, describes one implementation of a virtual audio display using HRTF's. U.S. patent application Ser. Nos. 08/303,705 and 08/241,867 of Abel, which are each herein incorporated by reference, also teach efficient implementations of virtual audio displays.
Since sound signals can be created which appear to emanate from arbitrarily positioned virtual sound sources, it is possible to create left and right ear signals which appear to a listener to have originated from a set of virtual speakers. FIG. 1 illustrates a system which creates five virtual speakers for a listener 101 who is wearing headphones which include right speaker 120 and left speaker 122. The five virtual speakers include a left virtual speaker 110, a right virtual speaker 112, a center virtual speaker 114, a right rear virtual speaker 116, and a left rear virtual speaker 118. Such a system is useful, for example, for listening to Dolby ProLogic encoded source material over headphones. Once the HRTF is determined from each of the five speakers to each of the listener's ears, a sound signal from each of the virtual speakers is generated for the listener's left and right ears by passing the signal associated with each virtual speaker through a filter which implements the HRTF corresponding to that speaker's position with respect to the left or right ear of listener 101. The resulting signals are then played through headphone speaker 120 and headphone speaker 122 at the listener's right and left ears, respectively.
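The combination of several virtual speakers into one pair of ear signals can be sketched as follows. This is a hedged illustration only: it assumes that each virtual speaker's contribution is filtered by its own left-ear and right-ear impulse response and that the contributions are then summed per ear. The function `mix_virtual_speakers`, the table `TOY_HRIRS`, the speaker names, and all numeric values are hypothetical.

```python
def convolve(signal, impulse_response):
    """Direct-form FIR convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def mix_virtual_speakers(channels, hrirs):
    """Sum every virtual speaker's filtered signal into left/right ear feeds."""
    length = max(
        len(channels[name]) + max(len(hrirs[name][0]), len(hrirs[name][1])) - 1
        for name in channels
    )
    left = [0.0] * length
    right = [0.0] * length
    for name, signal in channels.items():
        h_left, h_right = hrirs[name]
        for k, value in enumerate(convolve(signal, h_left)):
            left[k] += value
        for k, value in enumerate(convolve(signal, h_right)):
            right[k] += value
    return left, right


# Assumed toy (left-ear, right-ear) impulse responses per virtual speaker.
TOY_HRIRS = {
    "left": ([1.0, 0.0], [0.0, 0.5]),
    "right": ([0.0, 0.5], [1.0, 0.0]),
    "center": ([0.7, 0.0], [0.7, 0.0]),
    "left_rear": ([0.8, 0.0], [0.0, 0.3]),
    "right_rear": ([0.0, 0.3], [0.8, 0.0]),
}

# One-sample test signals: only the left and right rear speakers are active.
channels = {name: [0.0] for name in TOY_HRIRS}
channels["left"] = [1.0]
channels["right_rear"] = [1.0]

left_feed, right_feed = mix_virtual_speakers(channels, TOY_HRIRS)
```

Each ear feed is a single signal regardless of how many virtual speakers are simulated, which is why two headphone speakers suffice for the five-speaker arrangement.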
Stereo audio streams in which the left and right channels are developed independently for the left and right ears of a listener are referred to as binaural signals. Headphones are typically used to send binaural signals directly to a listener's left and right ears. The main reason for using headphones is that the sound signal from the speaker on one side of the listener's head generally does not travel around the listener's head to reach the ear on the opposite side. Therefore, the signal applied by one headphone speaker to one of the listener's ears does not interfere, through an external path, with the signal applied to the listener's other ear by the other headphone speaker. Headphones are thus an effective way of transmitting a binaural signal to a listener. However, it is not always convenient to wear headphones or earphones (for example, in the case of an arcade game where maintenance and hygiene concerns arise), and a solution using a pair of speakers which are not worn as headphones is desired.
Complications arise in systems which do not deliver the audio signal directly to the listener's ear. If a binaural signal is used to drive free standing speakers directly, then the listener will hear contributions from each speaker at each ear. The receipt of the signal intended for the right ear at the left ear and vice versa is referred to as "crosstalk." It is necessary in such systems to compensate for or otherwise cancel the crosstalk so that the desired binaural signal is effectively applied to each of the listener's ears.
Various systems for canceling crosstalk have been developed. B. S. Atal and M. R. Schroeder developed a system which implements crosstalk cancellation over the entire audio spectrum (i.e., 20 Hz to 20 kHz). The system is described in "Computer Models for Concert Hall Acoustics," The American Journal of Physics, vol. 41, pp. 461-471 (April 1973), which is herein incorporated by reference; "Models of Hearing," IEEE Proceedings, vol. 63, pp. 1332-1350 (Sept. 1975), which is herein incorporated by reference; and U.S. Pat. No. 3,236,949 issued to Atal et al., which is herein incorporated by reference. The Atal and Schroeder system reproduces arbitrarily located sound images with two loudspeakers using a crosstalk cancellation system which includes an equalization filter. When the Atal and Schroeder crosstalk canceler and equalization filter are used, the crosstalk signals are exactly canceled and the input binaural signal appears intact at the listener's ears, provided that the system is designed using the listener's HRTF and the listener is in the exact designed position relative to the speakers. The Atal and Schroeder system works reasonably well for a listener whose HRTF reasonably approximates the HRTF for which the system is designed, and whose head is positioned and oriented correctly in a so-called "sweet spot." However, if the listener's head is turned or positioned away from the sweet spot, or if the listener's HRTF is not a close approximation of the HRTF for which the system is designed, then the crosstalk cancellation is not effective and accurate localization of sounds by the listener is no longer realized.
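The principle of exact cancellation at the designed position, and its loss away from the sweet spot, can be illustrated at a single frequency with the sketch below. It is a generic matrix-inversion illustration under stated assumptions and is not presented as the filter structure of Atal and Schroeder or of any cited patent. It assumes a symmetric listening arrangement described by one same-side transfer value and one cross-side transfer value; the names `H_SAME`, `H_CROSS`, `invert_2x2`, `apply_2x2`, and the `moved` matrix are hypothetical, and all numbers are invented for illustration.

```python
def invert_2x2(m):
    """Invert a 2x2 matrix of complex transfer values."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("acoustic matrix is singular at this frequency")
    return ((d / det, -b / det), (-c / det, a / det))


def apply_2x2(m, v):
    """Multiply a 2x2 matrix by a 2-element vector."""
    return (
        m[0][0] * v[0] + m[0][1] * v[1],
        m[1][0] * v[0] + m[1][1] * v[1],
    )


# Assumed transfer values at one frequency: speaker to same-side ear and
# speaker to opposite-side ear (the crosstalk path).
H_SAME = 1.0 + 0.0j
H_CROSS = 0.4 - 0.3j
acoustic = ((H_SAME, H_CROSS), (H_CROSS, H_SAME))

# Binaural input intended for the left ear only.
binaural = (1.0 + 0.0j, 0.0 + 0.0j)

# Without cancellation the right ear receives the crosstalk term.
uncorrected = apply_2x2(acoustic, binaural)

# With the inverse applied first, the designed listener hears the input intact.
canceler = invert_2x2(acoustic)
speaker_feeds = apply_2x2(canceler, binaural)
at_ears = apply_2x2(acoustic, speaker_feeds)

# A listener away from the designed position has a different acoustic matrix
# (assumed values), so the same speaker feeds leave residual crosstalk.
moved = ((1.0 + 0.0j, 0.5 - 0.1j), (0.3 - 0.4j, 1.0 + 0.0j))
residual = apply_2x2(moved, speaker_feeds)
```

In this sketch the right-ear component of `at_ears` vanishes to within rounding error, while the right-ear component of `residual` does not, which mirrors the sensitivity to head position and HRTF mismatch described above.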
U.S. Pat. Nos. 4,910,779, 4,975,954, 4,893,342, 5,034,983, and 5,136,651 issued to Cooper et al., each of which is herein incorporated by reference, describe a system which limits the response of the crosstalk canceling filter to frequencies substantially below 10 kHz and which also implements an equalization filter different from the one described by Atal and Schroeder.
What is needed is an apparatus and method for canceling the crosstalk between signals from speakers which effectively cancels the crosstalk when the HRTF of the listener is close to a standard HRTF and the listener's head is in a standard location, and which is also robust, so that the system performs reasonably well and undesirable sound effects are not heard by a listener whose HRTF varies from the designed-for HRTF or whose head is not positioned and oriented correctly in the standard location. Such a system could be used to effectively simulate an array of five virtual speakers using only two loudspeakers or to present sounds to a listener which appear to come from arbitrarily placed sources.