1. Field of the Invention
This invention relates generally to three-dimensional (3D) sound. More particularly, it relates to an improved regularizing model for head-related transfer functions (HRTFs) for use with 3D digital sound applications.
2. Description of the Related Art
Many high-end consumer devices provide the option for three-dimensional (3D) sound, allowing a more realistic experience when listening to sound. In some applications, 3D sound allows a listener to perceive motion of an object from the sound played back on a 3D audio system.
Atal and Schroeder established cross-talk canceler technology as early as 1962, as described in U.S. Pat. No. 3,236,949, which is explicitly incorporated herein by reference. The Atal-Schroeder 3D sound cross-talk canceler was an analog implementation using specialized analog amplifiers and analog filters. To gain better sound positioning performance using two loudspeakers, Atal and Schroeder included empirically determined frequency-dependent filters. These sophisticated analog devices are not directly applicable for use with today's digital audio technology.
Interaural time difference (ITD), i.e., the difference between the times at which a sound wave reaches each of the two ears, is an important and dominant parameter used in 3D sound design. The interaural time difference is responsible for introducing binaural disparities in 3D audio or acoustical displays. In particular, when a sound object moves in a horizontal plane, a continuous interaural time delay occurs between the instant that sound from the object impinges upon one of the ears and the instant that the same sound impinges upon the other ear. This ITD is used to create aural images of sound moving in any desired direction with respect to the listener.
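By way of illustration only, the use of an ITD to displace a sound image can be sketched as follows. This is a minimal sketch and not a method taken from the art described above: the function name `apply_itd`, the sign convention (a positive ITD delays the left ear), and the rounding of the delay to whole samples are all assumptions made for the example.

```python
def apply_itd(mono, itd_seconds, sample_rate):
    """Return (left, right) channels with the far ear delayed by the ITD.

    Assumed convention: a positive itd_seconds delays the left ear
    (source toward the listener's right); a negative value delays the
    right ear. The delay is rounded to a whole number of samples.
    """
    delay = round(abs(itd_seconds) * sample_rate)
    # The delayed channel is padded at the front, the direct channel at
    # the back, so both channels have the same length.
    delayed = [0.0] * delay + list(mono)
    direct = list(mono) + [0.0] * delay
    if itd_seconds >= 0:
        return delayed, direct
    return direct, delayed


# Example: a 0.5 ms ITD at an 8 kHz sampling rate is a 4-sample delay.
left, right = apply_itd([1.0, 0.5], 0.0005, 8000)
```

A practical system would typically use fractional-sample delays and vary the ITD smoothly as the sound object moves; the whole-sample delay here is only meant to show the principle.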
The ears of a listener can be “tricked” into believing sound is emanating from a phantom location with respect to the listener by appropriately delaying the sound wave with respect to at least one ear. This typically requires appropriate cancellation of the original sound wave with respect to the other ear, and appropriate cancellation of the synthesized sound wave to the first ear.
A second parameter in the creation of 3D sound is adaptation of the 3D sound to the particular environment using the external ear's free-field-to-eardrum transfer functions, or what are called head-related transfer functions (HRTFs). HRTFs relate to the modeling of the particular environment of the user, including the size and orientation of the listener's head and body, as they affect reception of the 3D sound. For instance, the size of a listener's head, their torso, what they wear, etc., act as a form of filtering which can change the effect of the 3D sound on the particular user. An appropriate HRTF adjusts for the particular environment to allow the best 3D sound imaging possible.
The HRTFs are different for each location of the source of the sound. Thus, the magnitude and phase spectra of measured HRTFs vary as a function of sound source location. Hence, it is commonly acknowledged that the HRTF introduces important cues in spatial hearing.
Advances in computer and digital signal processing technology have enabled researchers to synthesize directional stimuli using HRTFs. The HRTFs can be measured empirically at thousands of locations in a sphere surrounding the 3D sound environment, but doing so requires an excessive amount of processing. Moreover, the number of measurements can be very large if the entire auditory space is to be represented on a fine grid. Even so, measured HRTFs represent only discrete locations in a continuous auditory space.
One conventional solution to the adaptation of a discretely measured HRTF within a continuous auditory space is to “interpolate” the measured HRTFs by linearly weighting the neighboring impulse responses. This can provide a small step size for incremental changes in the HRTF from location to location. However, interpolation is conceptually incorrect because it does not account for environmental changes between measured points, and thus may not provide a suitable 3D sound rendering.
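The conventional linear weighting of neighboring impulse responses can be sketched as follows, for the simplest case of two measured neighbors. The function name `interpolate_hrir` and the single weight parameter are assumptions made for this illustration; a practical implementation may weight three or four surrounding measurement points.

```python
def interpolate_hrir(hrir_a, hrir_b, weight_b):
    """Linearly blend two measured head-related impulse responses.

    weight_b = 0.0 returns hrir_a, weight_b = 1.0 returns hrir_b, and
    values in between weight the two neighbors tap by tap.
    """
    if len(hrir_a) != len(hrir_b):
        raise ValueError("impulse responses must have the same length")
    if not 0.0 <= weight_b <= 1.0:
        raise ValueError("weight_b must lie between 0 and 1")
    return [(1.0 - weight_b) * a + weight_b * b
            for a, b in zip(hrir_a, hrir_b)]


# A location one quarter of the way from measurement A to measurement B.
blended = interpolate_hrir([1.0, 0.0], [0.0, 1.0], 0.25)
```

Note that the blend operates only on the measured taps; nothing in it models what happens acoustically between the two measurement points, which is the shortcoming identified above.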
Other attempted solutions include using one HRTF for a large area of the three-dimensional space to reduce the frequency of discontinuities which may cause a clicking sound. However, again, such solutions compromise the overall quality of the 3D sound rendering.
Another solution wherein spatial characteristic functions are combined directly with Eigen functions to provide a set of HRTFs is shown in FIG. 3.
In particular, a set of N Eigen filters 422-426 are combined with corresponding sets of spatial characteristic function (SCF) samples 412-416 and summed in a summer 440 to provide an HRTF (or HRIR) filter 450 which acts on a sound source 460. The desired location of a sound image is controlled by varying the sound source elevation and/or azimuth in the sets of SCF samples 412-416. Unfortunately, this technique is susceptible to discontinuities in the continuous auditory space as well.
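The combination shown in FIG. 3 can be read as a weighted sum of the Eigen filters, with the SCF samples for the desired elevation and azimuth serving as the weights, followed by filtering of the sound source. The sketch below illustrates that reading under stated assumptions: the function names `synthesize_hrir` and `apply_filter`, the representation of each Eigen filter as a list of FIR taps, and the use of one scalar SCF sample per Eigen filter are hypothetical choices made for the example and are not taken from FIG. 3.

```python
def synthesize_hrir(eigen_filters, scf_samples):
    """Sum N eigen filters, each scaled by its SCF sample for one direction."""
    if len(eigen_filters) != len(scf_samples):
        raise ValueError("need exactly one SCF sample per eigen filter")
    length = len(eigen_filters[0])
    if any(len(f) != length for f in eigen_filters):
        raise ValueError("all eigen filters must have the same length")
    hrir = [0.0] * length
    for taps, weight in zip(eigen_filters, scf_samples):
        for k, tap in enumerate(taps):
            hrir[k] += weight * tap  # the role of the summer
    return hrir


def apply_filter(hrir, source):
    """Filter the sound source with the HRIR by direct FIR convolution."""
    out = [0.0] * (len(source) + len(hrir) - 1)
    for n, x in enumerate(source):
        for k, h in enumerate(hrir):
            out[n + k] += x * h
    return out


# Two toy eigen filters and the SCF samples for one source direction.
hrir = synthesize_hrir([[1.0, 0.0], [0.0, 1.0]], [0.5, 2.0])
output = apply_filter(hrir, [1.0, 1.0])
```

Because the SCF samples exist only at sampled elevations and azimuths, moving the sound image means switching from one set of weights to another, which is where the discontinuities noted above can arise.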
There is thus a need for a more accurate HRTF model which provides a suitable HRTF for source locations in a continuous auditory space, without annoying discontinuities.