To create a more immersive audio experience, binaural audio rendering can be used to impart a sense of space to 2-channel stereo and multichannel audio programs presented over headphones. Generally, the sense of space is created by convolving appropriately designed Binaural Room Impulse Responses (BRIRs) with each audio channel or object in the program, where a BRIR characterizes the transformation of an audio signal from a specific point in a space to a listener's ears in a specific acoustic environment. The processing can be applied either by the content creator or by the consumer playback device.
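As a minimal sketch of the convolution step described above, the following Python fragment filters each channel with a left-ear and a right-ear impulse response and sums the results into a two-channel headphone signal. The function names `convolve` and `render_binaural`, the list-based signal representation, and the toy impulse responses are assumptions made for illustration only; they are not taken from any particular virtualizer.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def render_binaural(channels, brirs):
    """Mix a multichannel program down to a left/right headphone pair.

    channels: list of per-channel sample lists (hypothetical layout).
    brirs: list of (left_ear_ir, right_ear_ir) pairs, one pair per channel.
    """
    if len(channels) != len(brirs):
        raise ValueError("need exactly one BRIR pair per channel")
    # Output must hold the longest convolution result of any channel.
    length = max(
        len(ch) + max(len(ir_l), len(ir_r)) - 1
        for ch, (ir_l, ir_r) in zip(channels, brirs)
    )
    left = [0.0] * length
    right = [0.0] * length
    for ch, (ir_left, ir_right) in zip(channels, brirs):
        for i, v in enumerate(convolve(ch, ir_left)):
            left[i] += v
        for i, v in enumerate(convolve(ch, ir_right)):
            right[i] += v
    return left, right


# Toy two-channel program with very short, made-up impulse responses.
channels = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
brirs = [([1.0, 0.5], [0.25]), ([0.25], [1.0, 0.5])]
left, right = render_binaural(channels, brirs)
print(left)   # [1.0, 0.75, 0.0, 0.0]
print(right)  # [0.25, 1.0, 0.5, 0.0]
```

Direct-form convolution is used here only to keep the sketch dependency-free. Measured BRIRs are typically thousands of samples long, so a practical renderer would more likely use FFT-based convolution (for example `scipy.signal.fftconvolve`) or a partitioned block scheme.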
One approach to virtualizer design is to derive all or part of the BRIRs from either physical room/head measurements or room/head model simulations. Typically, a room or room model with highly desirable acoustical properties is selected, with the aim that the headphone virtualizer can replicate the compelling listening experience of the actual room. Assuming that the room model accurately embodies the acoustical characteristics of the selected listening room, this approach produces virtualized BRIRs that inherently apply the auditory cues essential to spatial audio perception. Such auditory cues may include, for example, interaural time difference (ITD), interaural level difference (ILD), interaural cross-correlation (IACC), reverberation time (e.g., T60 as a function of frequency), direct-to-reverberant (DR) energy ratio, specific spectral peaks and notches, echo density, and the like. Under ideal BRIR measurement and headphone listening conditions, binaural audio renderings of multichannel audio files based on physical room BRIRs can sound virtually indistinguishable from loudspeaker presentations in the same room.
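To make a few of the cues listed above concrete, the sketch below estimates ILD, ITD, and DR energy ratio from a pair of impulse responses. It rests on simplifying assumptions that are not part of the description: ITD is taken as the lag of the broadband cross-correlation peak (one common estimator among several), ILD is a broadband energy ratio rather than a per-band value, and the response passed to the DR estimate is assumed to begin at the direct-sound onset. All function names are hypothetical.

```python
import math


def energy(x):
    """Sum of squared samples."""
    return sum(v * v for v in x)


def ild_db(ir_left, ir_right):
    """Broadband ILD: left-ear energy relative to right-ear energy, in dB."""
    return 10.0 * math.log10(energy(ir_left) / energy(ir_right))


def itd_samples(ir_left, ir_right, max_lag):
    """Lag in samples that maximizes the interaural cross-correlation.

    A positive result means the right-ear response lags the left-ear one.
    """
    def xcorr(lag):
        return sum(
            ir_left[n] * ir_right[n + lag]
            for n in range(len(ir_left))
            if 0 <= n + lag < len(ir_right)
        )
    return max(range(-max_lag, max_lag + 1), key=xcorr)


def direct_to_reverberant_db(ir, direct_len):
    """Energy of the first direct_len samples relative to the tail, in dB.

    Assumes ir starts at the direct-sound onset (an illustrative assumption).
    """
    return 10.0 * math.log10(energy(ir[:direct_len]) / energy(ir[direct_len:]))


# Made-up responses: the right ear receives a quieter copy two samples later.
ir_left = [1.0, 0.0, 0.0, 0.0]
ir_right = [0.0, 0.0, 0.5, 0.0]
print(itd_samples(ir_left, ir_right, max_lag=3))  # 2
print(round(ild_db(ir_left, ir_right), 2))        # 6.02
```

Dividing the ITD in samples by the sampling rate gives the delay in seconds. The normalized value of the same cross-correlation peak is closely related to IACC, which is why the two cues are often computed together.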
However, a drawback of this approach is that physical room BRIRs can modify the signal to be rendered in undesired ways. When BRIRs are designed in adherence to the laws of room acoustics, some of the perceptual cues that lead to a sense of externalization, such as spectral combing and long reverberation times (T60), also cause side effects such as sound coloration and time smearing. In fact, even top-quality listening rooms impart some side effects to the rendered output signal that are undesirable for headphone reproduction. Furthermore, the compelling listening experience that can be achieved when listening to binaural content in the actual measurement room is rarely achieved when listening to the same content in other rooms.