Embodiments described herein relate generally to spatial audio, and more particularly to the generation and processing of realistic audio based on a user's orientation and position relative to a source of audio located in reality, virtual reality, or augmented reality. Spatial audio signals are being used with greater frequency to produce a more immersive audio experience. A stereo or multi-channel recording may be passed from a recording apparatus to a listening apparatus and may be replayed using a suitable multi-channel output, such as a multi-channel speaker arrangement or virtual surround processing in stereo headphones or a headset.
Typically, spatial audio is produced for headphones using binaural processing to create the impression that a sound source is at a specific 3D location. Binaural processing may mimic how natural sound waves are detected and processed by humans. For example, depending on where a sound originates, it may arrive at one ear before the other (i.e., interaural time difference (“ITD”)), it may be louder at one ear than the other (i.e., interaural level difference (“ILD”)), and it may be reflected and filtered, for example by the listener's head and outer ears, in ways that impart specific spectral cues. Binaural processing may use head-related transfer function (“HRTF”) filters to model the ITD, ILD, and spectral cues for each ear separately, process the audio, and then play the audio through two-channel headphones. Binaural processing may therefore involve rendering the same sound twice: once for each ear.
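By way of illustration only, the binaural processing described above may be sketched as follows. The sketch assumes a spherical-head (Woodworth) approximation for the ITD, an arbitrary fixed gain for the ILD, and a pair of toy impulse responses standing in for measured HRTF filters; it does not model spectral cues. The names `woodworth_itd`, `toy_hrir_pair`, and `render_binaural`, as well as the head radius, speed of sound, and gain values, are hypothetical and are not taken from any embodiment.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s (assumed value for air at room temperature)
HEAD_RADIUS = 0.0875     # m (assumed average head radius)

def woodworth_itd(azimuth_rad):
    """Approximate ITD in seconds for a source at 0..pi/2 radians azimuth,
    using the spherical-head (Woodworth) approximation."""
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (azimuth_rad + math.sin(azimuth_rad))

def convolve(signal, kernel):
    """Direct-form convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

def toy_hrir_pair(azimuth_rad, sample_rate, far_ear_gain=0.5):
    """Build toy (left, right) impulse responses that encode only an ITD
    (a delay at the far ear) and an ILD (a fixed gain at the far ear).
    Positive azimuth is assumed to mean a source to the listener's right."""
    delay = round(woodworth_itd(abs(azimuth_rad)) * sample_rate)
    near = [1.0] + [0.0] * delay           # undelayed, full level
    far = [0.0] * delay + [far_ear_gain]   # delayed and attenuated
    return (far, near) if azimuth_rad >= 0 else (near, far)

def render_binaural(mono, hrir_left, hrir_right):
    """Render the same mono signal twice, once per ear."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)

if __name__ == "__main__":
    left_ir, right_ir = toy_hrir_pair(math.pi / 2, 48000)
    left, right = render_binaural([1.0, 0.0, 0.0], left_ir, right_ir)
    print("far-ear delay in samples:", len(left_ir) - 1)
```

In practice, the toy impulse responses would be replaced by measured head-related impulse responses for the desired direction, so that the spectral cues are reproduced along with the ITD and ILD.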
To measure HRTFs, a human subject, or an analog of one, may be placed in an anechoic chamber, i.e., a special chamber designed to prevent sound from reflecting off the walls. Speakers may be placed at a fixed distance from the subject in various directions. Sound may be played from each speaker in turn, and recordings may be made using microphones placed in each of the subject's ears.
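As a hedged illustration of how measurements of this kind might be organized, the sketch below stores one pair of ear recordings per speaker direction and selects the measured direction closest to a requested one. The `HrirMeasurement` record, the azimuth and elevation convention in degrees, and the nearest-neighbor selection are assumptions made for the example; an actual system may store the data differently or interpolate between directions.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class HrirMeasurement:
    """One speaker position and the impulse responses recorded at each ear."""
    azimuth_deg: float
    elevation_deg: float
    left: tuple
    right: tuple

def unit_vector(azimuth_deg, elevation_deg):
    """Convert a direction in degrees to a 3D unit vector."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))

def angular_distance(dir_a, dir_b):
    """Angle in radians between two (azimuth_deg, elevation_deg) directions."""
    dot = sum(x * y for x, y in zip(unit_vector(*dir_a), unit_vector(*dir_b)))
    return math.acos(max(-1.0, min(1.0, dot)))  # clamp against rounding error

def nearest_measurement(table, azimuth_deg, elevation_deg):
    """Return the stored measurement closest in angle to the requested direction."""
    target = (azimuth_deg, elevation_deg)
    return min(
        table,
        key=lambda m: angular_distance((m.azimuth_deg, m.elevation_deg), target),
    )

if __name__ == "__main__":
    # Placeholder impulse responses; real entries would hold recorded samples.
    table = [
        HrirMeasurement(0.0, 0.0, (1.0,), (1.0,)),
        HrirMeasurement(90.0, 0.0, (1.0,), (0.5,)),
        HrirMeasurement(180.0, 0.0, (1.0,), (1.0,)),
        HrirMeasurement(0.0, 90.0, (1.0,), (1.0,)),
    ]
    match = nearest_measurement(table, 80.0, 5.0)
    print(match.azimuth_deg, match.elevation_deg)
```

Comparing directions as unit vectors avoids special handling at the 0/360 degree wrap-around, which is one reason this representation was chosen for the sketch.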