User interfaces have traditionally relied on input devices such as keyboards, which require physical manipulation by a user. Increasingly, however, it is desired to detect and monitor the physical positions and movements of users and objects within an environment.
In certain situations, it may be desired to determine the locations of one or more audio sources within an environment. Time-of-flight measurements can be used to determine locations of certain types of audio sources, but time-of-flight calculations depend on knowing the time at which sounds were generated. In many situations, the origination time of a received sound cannot be determined, so time-of-flight measurements are not possible. It may nevertheless be possible to use “time-of-arrival” or “time-difference-of-arrival” techniques to determine the locations of certain types of audio sources. Time-difference-of-arrival uses microphones at multiple locations to detect arriving audio. Assuming that a discrete event can be detected in the audio, the times at which that event arrives at the different microphones may be compared to determine the likely location of the audio source relative to the microphones.
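The time-difference-of-arrival comparison described above can be illustrated with a minimal sketch. It is not taken from any particular system. It assumes two microphones, a single click-like event detected by a simple amplitude threshold, a far-field source, and a speed of sound of 343 m/s. The function names, the threshold, the sample rate, and the microphone spacing are all hypothetical values chosen for illustration.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value: air at roughly 20 degrees C


def first_onset_index(samples, threshold):
    """Return the index of the first sample whose magnitude reaches
    the threshold, or None if no sample does."""
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            return i
    return None


def arrival_time_difference(mic_a, mic_b, sample_rate_hz, threshold):
    """Arrival time at microphone B minus arrival time at microphone A,
    in seconds. Positive means the event reached A first."""
    ia = first_onset_index(mic_a, threshold)
    ib = first_onset_index(mic_b, threshold)
    if ia is None or ib is None:
        raise ValueError("event not detected at both microphones")
    return (ib - ia) / sample_rate_hz


def far_field_bearing_deg(delta_t_s, mic_spacing_m):
    """Angle of the source away from the perpendicular bisector of the
    microphone pair, assuming the source is far from both microphones.
    Positive angles point toward the side of microphone A."""
    ratio = SPEED_OF_SOUND_M_S * delta_t_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against measurement error
    return math.degrees(math.asin(ratio))


# Synthetic data: one click that reaches microphone B 8 samples after A.
sample_rate_hz = 16000
mic_a = [0.0] * 100 + [1.0] + [0.0] * 99
mic_b = [0.0] * 108 + [1.0] + [0.0] * 91

delta_t = arrival_time_difference(mic_a, mic_b, sample_rate_hz, threshold=0.5)
bearing = far_field_bearing_deg(delta_t, mic_spacing_m=0.3)
print(f"delta_t = {delta_t * 1e3:.3f} ms, bearing = {bearing:.1f} degrees")
```

A single microphone pair constrains only a direction (more precisely, a hyperbola of possible positions). Estimating a position generally requires time differences from additional microphones at other locations. The threshold detector here works only because the synthetic signal contains a clean, isolated click.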
Unfortunately, existing time-difference-of-arrival techniques are primarily suitable for transient or pulse-like sounds, for which a distinctive characteristic of the received audio signal may be reliably identified. Other types of audio, such as human speech, remain difficult to localize with these techniques.