This invention relates to a system and method of creating 3D audio filters for head-externalized 3D audio through headphones (which for purposes of this application shall be deemed to include headphones, earphones, ear speakers or any transducers in close proximity to a listener's ears), and more particularly to filter designs for providing high quality head-externalized 3D audio through headphones.
The invention has wide utility in virtually all applications where audio is delivered to a listener through headphones, including music listening, entertainment systems, pro audio, movies, communications, teleconferencing, gaming, virtual reality systems, computer audio, and military and medical audio applications.
Prior art systems and processes used for the head-externalization of audio through headphones rely on one, or a combination, of the following two methods. The first of these prior art methods (PA Method 1) uses binaural audio, i.e. audio that is acoustically recorded with dummy head microphones, or audio that is mixed binaurally on a computer using the numerical HRIR (head-related impulse response) of a dummy head or a human head. The problem with this method is that it can lead to good head externalization of sound for only a small percentage of listeners. This well-documented failure to head-externalize binaural sound through regular headphones for virtually any listener is due to many factors (see, for instance, Rozenn Nicol, Binaural Technology, AES Monographs series, Audio Engineering Society, April 2010). One such factor is the mismatch between the HRIR of the head used to record the sound and the HRIR of the actual listener. Another important factor is the lack of robustness to head movements: the perceived audio image moves with the head as the listener rotates his head, and this artifact degrades the realism of the perception. With PA Method 1 it is impossible to use existing head tracking techniques to fix the perceived audio image in space because the locations of sound sources are generally unknown in an already recorded sound field.
The second prior art method (PA Method 2) filters the audio through digital (or analog) filters that represent or emulate the binaural impulse response of loudspeakers in a listening room (such filters are referred to as SRbIR filters, where “SRbIR” stands for “Speakers+Room binaural Impulse Response”). An advantage of this method over PA Method 1 is that existing head tracking techniques can readily be used to fix the perceived audio image in space, thereby greatly increasing the robustness to head movements and therefore enhancing the realism of the perceived sound field. This is possible because the location of the speakers is effectively known: the input audio is convolved with the SRbIR measured or calculated at various head positions (three positions covering the range of expected head rotation are usually sufficient to extrapolate the SRbIR at other head rotation angles), and the SRbIR used in the convolution can be changed as a function of the head location using head tracking, so that the listener perceives the sound as coming from loudspeakers that are fixed in space. However, while PA Method 2 can lead to good head externalization of sound, it emulates the sound of regular loudspeakers, whereby the sound is not truly three-dimensional (i.e. it does not extend significantly in 3D space beyond the region where the loudspeakers are perceived to be located).
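By way of illustration only, the head-tracked SRbIR convolution of PA Method 2 may be sketched as follows. The sketch is not taken from any prior art implementation. It assumes that the SRbIR is stored as four impulse responses per measured head yaw angle (keys 'LL', 'LR', 'RL' and 'RR', where the first letter denotes the loudspeaker and the second the ear), and that the SRbIR at an intermediate angle is approximated by simple linear interpolation between the two nearest measurements. The function names, the key names, the interpolation scheme and the toy two-tap responses are all hypothetical.

```python
# Illustrative sketch only; every name and value below is hypothetical.

def convolve(x, h):
    """Direct-form linear convolution of two sample sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def interpolate_srbir(measured, yaw_deg):
    """Approximate the SRbIR at yaw_deg from measurements at a few yaw angles.

    measured maps a yaw angle in degrees to a dict of equal-length impulse
    responses keyed 'LL', 'LR', 'RL', 'RR' (loudspeaker -> ear).
    Linear interpolation is an assumption made for this sketch.
    """
    angles = sorted(measured)
    yaw = min(max(yaw_deg, angles[0]), angles[-1])  # clamp to measured range
    for lo, hi in zip(angles, angles[1:]):
        if lo <= yaw <= hi:
            w = (yaw - lo) / (hi - lo)
            return {
                key: [(1 - w) * a + w * b
                      for a, b in zip(measured[lo][key], measured[hi][key])]
                for key in measured[lo]
            }
    return measured[angles[0]]  # only one measured angle available


def render(left_in, right_in, srbir):
    """Headphone feeds: each ear receives the direct path plus the crosstalk path."""
    left_ear = [a + b for a, b in zip(convolve(left_in, srbir['LL']),
                                      convolve(right_in, srbir['RL']))]
    right_ear = [a + b for a, b in zip(convolve(left_in, srbir['LR']),
                                       convolve(right_in, srbir['RR']))]
    return left_ear, right_ear


# Toy SRbIR measured at three head yaw angles (degrees).
measured = {
    -30: {'LL': [1.0, 0.0], 'LR': [0.0, 0.2], 'RL': [0.0, 0.6], 'RR': [0.8, 0.0]},
    0:   {'LL': [1.0, 0.0], 'LR': [0.0, 0.4], 'RL': [0.0, 0.4], 'RR': [1.0, 0.0]},
    30:  {'LL': [0.8, 0.0], 'LR': [0.0, 0.6], 'RL': [0.0, 0.2], 'RR': [1.0, 0.0]},
}

# The head tracker reports a yaw of 15 degrees; an impulse is fed to the left channel.
srbir_now = interpolate_srbir(measured, 15.0)
left_ear, right_ear = render([1.0], [0.0], srbir_now)
```

In this sketch the filter set is recomputed whenever the tracked yaw changes, which is what keeps the emulated loudspeakers fixed in space. A practical system would typically process the audio in blocks and crossfade between successive filter sets, which is omitted here.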
Combining these two prior art methods can lead to good head externalization of sound and the ability to use head tracking, but the benefits of the binaural audio are largely lost. The sound of binaural audio through regular loudspeakers is not truly 3D, since the transmission of the inter-aural time difference (ITD), inter-aural level difference (ILD) and spectral cues in the binaural recording through loudspeakers is severely degraded by the crosstalk (the sound from each loudspeaker reaching the unintended ear).
Although not reported in the literature or in any known prior art, it would seem possible to make the second method described above (PA Method 2) yield high quality 3D sound (while still head-externalizing the sound) by using, in addition to the SRbIR filter, a crosstalk cancellation (XTC) filter with the goal of emulating the sound of crosstalk-cancelled loudspeaker playback. Such a process, however, does not yield the desired sound quality because a regular XTC filter will remove or significantly degrade the crosstalk that is inherently represented in the SRbIR filter and which is critical for head externalization of sound through headphones.
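The effect described above may be illustrated, under strongly idealized assumptions, at a single frequency. The sketch below assumes that the SRbIR is represented at that frequency by a 2x2 complex transfer matrix H (rows are ears, columns are loudspeakers, off-diagonal entries are the crosstalk paths) and that the XTC filter is the exact, unregularized inverse of that same matrix. The numerical values are made up, and practical XTC filters are frequency dependent and typically regularized, so the cancellation shown here is a limiting case rather than a description of any particular filter.

```python
# Illustrative single-frequency sketch; the transfer values are hypothetical.

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse2(m):
    """Exact inverse of a 2x2 matrix (assumes a non-zero determinant)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


# Assumed loudspeakers-to-ears transfer matrix represented by the SRbIR.
H = [[1.0 + 0.0j, 0.5 - 0.3j],
     [0.5 - 0.3j, 1.0 + 0.0j]]

xtc = inverse2(H)          # idealized XTC filter designed for the same setup
cascade = matmul2(H, xtc)  # XTC filter followed by the SRbIR filter

crosstalk_before = abs(H[0][1])       # crosstalk path present in the SRbIR alone
crosstalk_after = abs(cascade[0][1])  # crosstalk path left after adding the XTC filter
```

Under these assumptions the cascade reduces to the identity matrix, so the crosstalk represented in the SRbIR is cancelled and the headphone feeds revert to the unprocessed input signal.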
It is therefore a principal object of the present invention to provide a system and process for providing more effective head-externalization of 3D audio through headphones.