Binaural rendering is a signal processing technique for creating stereo audio signals which, when delivered through headphones, are perceived by a user to originate from a real-world sound source at a specific spatial location. This technique can be applied to create very realistic auditory virtual realities for entertainment, gaming, and e-tourism purposes, as well as for more serious applications like education and training, remote telepresence, and spatial situation awareness displays. The fundamental technology for generating this virtual audio illusion is known as the head-related transfer function (“HRTF”), a set of user-specific filters which capture all perceptually relevant sound localization cues. When a user-specific HRTF cannot be used, general performance of the binaural rendering is degraded for the majority of users, an increase in large localization errors is observed, users exhibit poor sound source externalization, and there is a perception of a decreased sense of presence in the auditory virtual environment.
The measurement of an individualized HRTF is time- and cost-prohibitive for the average potential user of binaural rendering technologies. Present technologies for measuring an HRTF for a given user require complex equipment, hard-to-find acoustically treated anechoic environments, or both, making widespread use of true individualized HRTFs impractical for most commercial applications. Instead, many researchers have proposed techniques for HRTF personalization in which existing non-individualized HRTFs are either selected or customized based on a user's physical dimensions, subjective evaluation, or objective performance on an auditory task. While such techniques have exhibited certain performance benefits over one-size-fits-all generic HRTFs, no current technique based on personalization provides objective localization performance on par with an individually measured HRTF.
Thus, it would be advantageous to provide an improved methodology for selecting an HRTF for binaural rendering of audio signals that are perceived by a user to originate from a real-world spatial location.