This disclosure relates to a transfer function generation system and method.
An important feature of human hearing is the ability to localise sounds in the environment. Despite having only two ears, humans are able to locate the source of a sound in three dimensions. The interaural time difference (the difference between the times at which a sound is received at each ear) and the interaural intensity difference (the difference in perceived volume at each ear) are used to assist with this, as is an interpretation of the frequencies of the received sounds.
As interest in immersive video content, such as that displayed using virtual reality (VR) headsets, increases, so does the desire for immersive audio. Immersive audio should sound as if it is being emitted by the correct source in an environment; that is, the audio should appear to come from the location of the virtual object that is intended as its source. If this is not the case, then the user may lose their sense of immersion while viewing VR content or the like. While surround sound speaker systems have been somewhat successful in providing audio that is immersive, the provision of a surround sound system is often impractical.
In order to perform correct localisation for recorded sounds, it is necessary to perform processing on the signal so as to generate the expected interaural time difference and the like for a listener. In previously proposed arrangements, so-called head-related transfer functions (HRTFs) have been used to generate a sound that is adapted for improved localisation. In general, an HRTF is a transfer function that is provided for each of a user's ears and for a particular location in the environment relative to the user's ears.
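By way of illustration only, the application of a pair of HRTFs to a recorded sound can be sketched in the time domain, where each HRTF corresponds to a head-related impulse response (HRIR) that is convolved with the signal. The function name `render_binaural` and the toy impulse responses below are assumptions made for this sketch; they are not measured data and do not represent any particular previously proposed arrangement.

```python
import numpy as np

def render_binaural(mono, hrir_left, hrir_right):
    """Convolve a mono signal with a left/right impulse-response pair.

    Both impulse responses are assumed to have the same length so that
    the two output channels can be stacked into one array.
    """
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right])  # shape: (2, len(mono) + len(hrir) - 1)

# Toy impulse responses (hypothetical, not measured): the source is placed
# to the listener's left, so the right ear receives the sound 3 samples
# later and at a lower level than the left ear.
hrir_left = np.array([1.0, 0.0, 0.0, 0.0])
hrir_right = np.array([0.0, 0.0, 0.0, 0.6])

click = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
stereo = render_binaural(click, hrir_left, hrir_right)
```

In this toy case the delay between the two output channels plays the role of the interaural time difference and the gain of 0.6 plays the role of the interaural intensity difference; a measured HRIR would additionally shape the frequency content of the sound.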
In general, a discrete set of HRTFs is provided for a user and environment such that sounds can be reproduced correctly for a number of different positions in the environment relative to the user's head position. However, one shortcoming of this method is that there are a number of positions in the environment for which no HRTF is defined. Earlier methods, such as vector base amplitude panning (VBAP), have been used to mitigate this problem.
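As a minimal sketch of the panning idea, the two-dimensional form of VBAP expresses a source direction as a weighted sum of two loudspeaker direction vectors and normalises the resulting gains to constant power. It is assumed here, for illustration, that the two directions are positions for which HRTFs are defined and which are treated as virtual loudspeakers; the function name `vbap_2d_gains` and the angles used are hypothetical.

```python
import numpy as np

def vbap_2d_gains(source_deg, speaker_a_deg, speaker_b_deg):
    """Return constant-power gains for two speakers bracketing a source.

    Angles are azimuths in degrees in the horizontal plane.
    """
    def unit(deg):
        rad = np.radians(deg)
        return np.array([np.cos(rad), np.sin(rad)])

    # Rows of `base` are the unit vectors pointing at each speaker.
    base = np.vstack([unit(speaker_a_deg), unit(speaker_b_deg)])
    # Solve gains @ base = source direction for the two gains.
    gains = np.linalg.solve(base.T, unit(source_deg))
    if np.any(gains < -1e-9):
        raise ValueError("source direction lies outside the speaker pair")
    # Normalise so that the sum of squared gains is 1.
    return gains / np.linalg.norm(gains)

# A source midway between speakers at +45 and -45 degrees gets equal gains.
centre_gains = vbap_2d_gains(0.0, 45.0, -45.0)
# A source at the position of the first speaker is sent to that speaker only.
edge_gains = vbap_2d_gains(45.0, 45.0, -45.0)
```

Panning of this kind changes only the relative level of the signals associated with the defined positions; it does not itself produce an HRTF for the intermediate position.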
In addition to this, HRTFs are often not sufficient for their intended purpose; the required HRTFs differ from user to user, and so a generalised HRTF is unlikely to be suitable for a group of users. For example, a user with a larger head may expect a greater interaural time difference than a user with a smaller head when hearing a sound from the same relative position. In view of this, the HRTFs may also have different spatial dependencies for different users. Measuring an HRTF can also be time consuming and expensive, and the measurement can suffer from distortions due to objects (such as equipment in the room) in the HRTF measuring environment and/or a non-optimal positioning of the user within the HRTF measuring environment. There are therefore numerous problems associated with generating and utilising HRTFs.
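The dependence of the interaural time difference on head size can be illustrated, under simplifying assumptions, with the Woodworth spherical-head approximation, in which the difference is (r/c)(θ + sin θ) for a head of radius r, a speed of sound c, and a source azimuth θ between 0 and π/2 radians. This model, the speed of sound of 343 m/s, and the head radii below are assumptions chosen for the sketch rather than values taken from this disclosure.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C

def woodworth_itd(head_radius_m, azimuth_rad):
    """Approximate interaural time difference in seconds.

    Uses the Woodworth spherical-head model, which is valid for
    azimuths between 0 (straight ahead) and pi/2 (directly to one side).
    """
    if not 0.0 <= azimuth_rad <= math.pi / 2:
        raise ValueError("azimuth must lie in [0, pi/2] radians")
    return (head_radius_m / SPEED_OF_SOUND_M_S) * (
        azimuth_rad + math.sin(azimuth_rad)
    )

# Hypothetical head radii for two users, with the source directly to one side.
itd_smaller_head = woodworth_itd(0.080, math.pi / 2)
itd_larger_head = woodworth_itd(0.095, math.pi / 2)
```

Because the expression is proportional to the head radius, the larger head yields the larger time difference for the same source position, which is one reason that a single generalised HRTF is unlikely to suit every user.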