To find the sound pressure that an arbitrary source x(t) produces at the ear drum, all that is required is the impulse response h(t) from the source to the ear drum. This is called the Head-Related Impulse Response (HRIR), and its Fourier transform H(f) is called the Head-Related Transfer Function (HRTF). The HRTF models the sound filtering characteristics of the human pinna (the projecting portion of the external ear) and torso (the human trunk) and captures all of the physical cues to source localization. Once the HRTFs for the left ear and the right ear are known, accurate binaural signals can be synthesized from a monaural source. Most HRTF measurements essentially reduce the HRTF to a function of a sound's azimuth, elevation and frequency.
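As an illustration of the synthesis described above, the following minimal sketch convolves a monaural signal with a left-ear and a right-ear HRIR to obtain a binaural pair. The function names and the three-tap impulse responses are hypothetical placeholders chosen for illustration, not measured data.

```python
def convolve(x, h):
    """Direct-form linear convolution: y[n] = sum over k of h[k] * x[n - k]."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for k, hk in enumerate(h):
            y[n + k] += xn * hk
    return y


def binaural_synthesis(x, hrir_left, hrir_right):
    """Filter one monaural signal with each ear's HRIR."""
    return convolve(x, hrir_left), convolve(x, hrir_right)


# Hypothetical values: a short mono signal and toy 3-tap HRIRs.
x = [1.0, 0.5, 0.0, -0.5]
hrir_left = [0.9, 0.3, 0.1]
hrir_right = [0.0, 0.5, 0.2]

left, right = binaural_synthesis(x, hrir_left, hrir_right)
print(left)
print(right)
```

In this toy case the right-ear response starts one sample later and is weaker than the left-ear response, so the output pair carries both a time difference and an intensity difference.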
FIG. 1A is a conceptual illustration of 3-D sound filtering using HRTF. Implementing 3D sound positioning requires filtering a monophonic, non-directional input sound 10 with left and right ear HRTFs 18a and 18b that are associated with a particular radial angle 12 from a listener's position 16. In some sound processing environments, this radial angle 12 is azimuthal. Typically, a software program inputs the sound 10 to a sound processor and specifies the angle 12 at which the input sound 10 should be filtered to be perceived as if it originated from that position. When the left ear HRTF 18a and right ear HRTF 18b associated with the specified angle 12 are applied to the input sound source 10, an Interaural Intensity Difference (IID) and an Interaural Time Difference (ITD) are established between the sounds that arrive at the listener's ears. The IID represents the difference in the intensity of the sound reaching the two ears, while the ITD represents the difference between the times at which the sound reaches the left and right ears. Each HRTF includes a magnitude response and a phase response, where the magnitude response of the HRTF includes the IID, which is frequency dependent, and the phase response of the HRTF includes the ITD, which is also frequency dependent.
In some sound processor architectures, minimum phase versions of the HRTF filters are used that no longer have the ITD inherent in the phase response of the filters. Instead, an ITD delay 22, representing the average group delay of each HRTF, is used to artificially insert the ITD by delaying the contralateral (far) ear's input sound sequence to the appropriate HRTF 18 by a number of samples. When designing a 3-D sound system, a designer may choose a particular library of HRTF measurements from different sources on the basis of user preference or behavioral data.
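The delay-then-filter arrangement described above might be sketched as follows. This is a minimal illustration under stated assumptions: the names `fir_filter` and `position_with_itd` are hypothetical, the two-tap filters are toy values rather than real minimum phase HRTFs, and the ITD is assumed to be a whole number of samples.

```python
def fir_filter(x, h):
    """Apply FIR coefficients h to the sequence x (linear convolution)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for k, hk in enumerate(h):
            y[n + k] += xn * hk
    return y


def position_with_itd(x, hrtf_near, hrtf_far, itd_samples):
    """Filter the near ear directly; delay the far ear's input, then filter it."""
    delayed = [0.0] * itd_samples + list(x)   # ITD inserted ahead of the filter
    near = fir_filter(x, hrtf_near)
    far = fir_filter(delayed, hrtf_far)
    length = max(len(near), len(far))         # zero-pad to a common length
    near += [0.0] * (length - len(near))
    far += [0.0] * (length - len(far))
    return near, far


# Hypothetical values: a unit impulse, toy 2-tap filters, 3-sample ITD.
near, far = position_with_itd([1.0], [0.8, 0.1], [0.5, 0.25], itd_samples=3)
print(near)  # [0.8, 0.1, 0.0, 0.0, 0.0]
print(far)   # [0.0, 0.0, 0.0, 0.5, 0.25]
```

Which physical ear is the far ear depends on the side of the head on which the sound is positioned; the sketch leaves that mapping to the caller.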
FIG. 1B is a block diagram graphically illustrating how minimum phase versions of HRTF measurements are conventionally stored. Although many formats are available for storing a library of HRTF measurements 30, the library 30 typically includes the left HRTF 18a, the right HRTF 18b, and optionally the ITD 22 for each allowable increment of the input sound angle 12 from 0 to 360 degrees. Each HRTF 18 typically comprises some number of coefficients; for example, thirty-two 16-bit coefficients are not uncommon. Rather than being stored, the ITD 22 may be calculated directly from the angle 12 specified for the input sound 10 during sound processing. Whether the ITD 22 is stored or calculated, what is important to note is that whatever increment is used to specify the source angle 12, that same increment is used to select the ITD 22.
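One way to picture such a library is as a table with one entry per angle increment, each entry holding the two coefficient sets and an optional ITD. The layout below is only a sketch: the `HrtfEntry` type, the function names, and the all-zero placeholder coefficients are assumptions made for illustration and do not reflect any particular storage format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HrtfEntry:
    left: List[int]                     # left-ear coefficients (16-bit values)
    right: List[int]                    # right-ear coefficients (16-bit values)
    itd_samples: Optional[int] = None   # None when the ITD is calculated instead


def build_library(increment_deg, coeffs_per_hrtf=32):
    """Create one placeholder entry per increment over 0..360 degrees."""
    count = 360 // increment_deg
    return [HrtfEntry([0] * coeffs_per_hrtf, [0] * coeffs_per_hrtf)
            for _ in range(count)]


def select_entry(library, angle_deg, increment_deg):
    """Pick the entry nearest the requested angle; the ITD shares this index."""
    index = int((angle_deg % 360) / increment_deg + 0.5) % len(library)
    return index, library[index]


library = build_library(increment_deg=1)
index, entry = select_entry(library, angle_deg=90, increment_deg=1)
print(len(library), index, len(entry.left))  # 360 90 32
```

Because the HRTFs and the ITD are selected through the same index, the resolution of the ITD in this layout is tied to the resolution of the stored HRTFs.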
A problem with implementing 3D sound positioning in hardware is the large memory requirement for storing the filter coefficients of the HRTFs 18 for every angle 12 that is needed. If it is decided to store HRTFs 18 for every 1 degree of azimuth, for example, and thirty-two 16-bit coefficients are used per HRTF 18, then over 23,000 bytes of memory would be required. This estimate assumes using the symmetry of the head and storing the left and right ear HRTFs for only one side of the head, where the left and right ear HRTFs 18 would be swapped when positioning is done on the opposite side of the head. If elevational positioning is also implemented or if higher order filters are used, these storage requirements may quickly become a burden on the design. In low-cost designs, where die or board area is to be kept to a minimum, it is imperative to reduce these storage requirements as much as possible.
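The storage estimate above can be reproduced with a short calculation. The sketch assumes that one side of the head spans 0 through 180 degrees inclusive, giving 181 stored angles; counting 180 angles instead gives 23,040 bytes, which is also over 23,000. The function name is hypothetical.

```python
def hrtf_storage_bytes(num_angles, coeffs_per_hrtf=32, bits_per_coeff=16, ears=2):
    """Bytes needed to store left and right HRTF coefficients for each angle."""
    return num_angles * ears * coeffs_per_hrtf * bits_per_coeff // 8


# Assumption: symmetry is used, so only 0..180 degrees inclusive is stored.
print(hrtf_storage_bytes(181))  # 23168
print(hrtf_storage_bytes(180))  # 23040
# Without symmetry, all 360 one-degree angles would be stored.
print(hrtf_storage_bytes(360))  # 46080
```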
In determining the location of a 3D positioned sound, the ITD 22 is a more dominant perceptual cue than the IID. In this regard, it is important to provide a high degree of granularity in the 3D position angle in order to allow many more distinct 3D positions, largely created by the ITD 22. The shortcoming of this approach is the need to store the HRTF coefficients 18 and to select the ITD 22 for all angles.
One possible method to reduce the storage requirements would be to use a larger angle increment, such as 10 degrees, rather than the 1 degree increment used in the example above. The tradeoff with such an implementation is that fewer distinct positions are available at which to place the 3D sound. For a moving object that passes through several successive angles, this would likely create jumpiness in the sound and, when interpolative smoothing is not implemented, the sound will crackle severely.
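The loss of distinct positions can be seen by snapping a smoothly moving source angle to a 10 degree grid. This is an illustrative sketch only; the function name and the nearest-angle rounding rule are assumptions, and a real system might select angles differently.

```python
def quantize_angle(angle_deg, increment_deg):
    """Snap an angle to the nearest stored increment, wrapping at 360 degrees."""
    return (int(angle_deg / increment_deg + 0.5) * increment_deg) % 360


# A source sweeping from 0 to 30 degrees in 1 degree steps.
sweep = list(range(0, 31))
coarse = [quantize_angle(a, 10) for a in sweep]

print(len(set(sweep)))       # 31 distinct requested angles
print(sorted(set(coarse)))   # [0, 10, 20, 30]
```

The 31 requested angles collapse onto four stored positions, so the filters and the ITD change in 10 degree steps as the source moves.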
In an attempt to overcome the shortcomings of the above implementation in which large angle granularity is used, it may seem natural to allow smaller granularity by measuring fewer angles and simply interpolating the HRTF coefficients 18 of the missing angles. Besides the obvious computational cost of having to do so, interpolation in the time domain will not result in a magnitude response that lies between those of the two available HRTFs 18. This would likely create distorted magnitude responses for the interpolated HRTFs, and interpolating in the frequency domain with any degree of accuracy is much too costly.
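A toy example makes the time-domain interpolation problem concrete. The two filters below are hypothetical and deliberately extreme (one is a pure one-sample delay of the other, so they are not representative of measured HRTFs): both have a flat magnitude response of 1, yet their coefficient-wise average has a magnitude that falls to zero at the Nyquist frequency.

```python
import cmath
import math


def magnitude_response(h, num_bins=8):
    """|H| sampled at num_bins + 1 frequencies from 0 to Nyquist inclusive."""
    mags = []
    for b in range(num_bins + 1):
        w = math.pi * b / num_bins
        H = sum(hk * cmath.exp(-1j * w * k) for k, hk in enumerate(h))
        mags.append(abs(H))
    return mags


# Hypothetical filters for two neighbouring angles.
h_a = [1.0, 0.0]
h_b = [0.0, 1.0]
# Linear interpolation of the coefficients, halfway between the two.
h_mid = [0.5 * a + 0.5 * b for a, b in zip(h_a, h_b)]

mag_a = magnitude_response(h_a)
mag_b = magnitude_response(h_b)
mag_mid = magnitude_response(h_mid)

print([round(m, 3) for m in mag_a])
print([round(m, 3) for m in mag_b])
print([round(m, 3) for m in mag_mid])
```

The interpolated magnitude does not lie between the two endpoint magnitudes because the coefficients are averaged with their phase differences intact, which produces cancellation at some frequencies.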
Accordingly, what is needed is a method and system for reducing HRTF storage requirements for 3-D sound positioning. The present invention addresses such a need.