Throughout this disclosure including in the claims, the term “virtualizer” (or “virtualizer system”) denotes a system coupled and configured to receive N input audio signals (indicative of sound from a set of source locations) and to generate M output audio signals for reproduction by a set of M physical speakers (e.g., headphones or loudspeakers) positioned at output locations different from the source locations, where each of N and M is a number greater than one. N can be equal to or different than M. A virtualizer generates (or attempts to generate) the output audio signals so that when reproduced, the listener perceives the reproduced signals as being emitted from the source locations rather than the output locations of the physical speakers (the source locations and output locations are relative to the listener). For example, in the case that M=2 and N>3, a virtualizer downmixes the N input signals for stereo playback. In another example in which N=M=2, the input signals are indicative of sound from two rear source locations (behind the listener's head), and a virtualizer generates two output audio signals for reproduction by stereo loudspeakers positioned in front of the listener such that the listener perceives the reproduced signals as emitting from the source locations (behind the listener's head) rather than from the loudspeaker locations (in front of the listener's head).
Throughout this disclosure including in the claims, the expression “rear” location (e.g., “rear source location”) denotes a location behind a listener's head, and the expression “front” location (e.g., “front output location”) denotes a location in front of a listener's head. Similarly, “front” speakers denotes speakers located in front of a listener's head and “rear” speakers denotes speakers located behind a listener's head.
Throughout this disclosure including in the claims, the expression “system” is used in a broad sense to denote a device, system, or subsystem. For example, a subsystem that implements a virtualizer may be referred to as a virtualizer system, and a system including such a subsystem (e.g., a system that generates M output signals in response to X+Y inputs, in which the subsystem generates X of the inputs and the other Y inputs are received from an external source) may also be referred to as a virtualizer system.
Throughout this disclosure including in the claims, the expression “reproduction” of signals by speakers denotes causing the speakers to produce sound in response to the signals, including by performing any required amplification and/or other processing of the signals.
Virtual surround sound can help create the perception that there are more sources of sound than there are physical speakers (e.g., headphones or loudspeakers). Typically, at least two speakers are required for a normal listener to perceive reproduced sound as if it is emitting from multiple sound sources.
For example, consider a simple surround sound virtualizer coupled and configured to receive input audio from three sources (left, center and right) and to generate output audio for two physical loudspeakers (positioned symmetrically in front of a listener) in response to the input audio. Such a virtualizer asserts input from the left source to the left speaker, asserts input from the right source to the right speaker, and splits input from the center source equally between the left and right speakers. The output of the virtualizer that is indicative of the input from the center source is commonly referred to as a “phantom” center channel. A listener perceives the reproduced output audio as if it includes a center channel emitting from a center speaker between the left and right speakers, as well as left and right channels emitting from the left and right speakers.
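The three-source example above can be illustrated with a minimal sketch. The function name `phantom_center_downmix` and the default `center_gain` of 0.5 are assumptions made for illustration only; the description states that the center input is split equally between the two speakers but does not specify a gain value (a power-preserving split would instead use a gain of about 0.707).

```python
def phantom_center_downmix(left, center, right, center_gain=0.5):
    """Mix three input channels (L, C, R) into two speaker feeds.

    Each argument is a list of samples. center_gain is a hypothetical
    parameter: the same gain is applied to the center channel on both
    sides, which is what "split equally" is taken to mean here.
    """
    if not (len(left) == len(center) == len(right)):
        raise ValueError("all channels must have the same number of samples")
    # Left source goes to the left speaker, plus an equal share of the center.
    left_out = [l + center_gain * c for l, c in zip(left, center)]
    # Right source goes to the right speaker, plus the same share of the center.
    right_out = [r + center_gain * c for r, c in zip(right, center)]
    return left_out, right_out


if __name__ == "__main__":
    lo, ro = phantom_center_downmix([1.0, 0.0], [0.5, 0.5], [0.0, 1.0])
    print(lo, ro)
```

Because the center contribution is identical in both speaker feeds, a listener positioned symmetrically between the speakers tends to localize it midway between them.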
Another conventional surround sound virtualizer (shown in FIG. 1) is known as a “LoRo” or left-only, right-only downmix virtualizer. This virtualizer is coupled to receive five input audio signals: left (“L”), center (“C”) and right (“R”) front channels, and left-surround (“LS”) and right-surround (“RS”) rear channels. The FIG. 1 virtualizer combines the input signals as indicated, for reproduction on left and right physical loudspeakers (to be positioned in front of the listener): the input center signal C is amplified in amplifier G, and the amplified output of amplifier G is summed with the input L and LS signals to generate the left output (“Lo”) asserted to the left speaker and is summed with the input R and RS signals to generate the right output (“Ro”) asserted to the right speaker.
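The signal flow described for FIG. 1 amounts to Lo = L + LS + G·C and Ro = R + RS + G·C. The following is a minimal sketch of that arithmetic. The function name `loro_downmix` and the default gain are assumptions; the description does not give a value for the gain of amplifier G, and the default used here (about -3 dB) is only a commonly used convention.

```python
def loro_downmix(l, c, r, ls, rs, center_gain=2 ** -0.5):
    """Five-channel to two-channel downmix following the FIG. 1 description.

    All channel arguments are equal-length lists of samples.
    center_gain stands in for amplifier G; its default is an assumed value.
    """
    n = len(l)
    if any(len(ch) != n for ch in (c, r, ls, rs)):
        raise ValueError("all channels must have the same number of samples")
    # Amplified center signal, shared by both outputs.
    gc = [center_gain * x for x in c]
    # Left output: front left + left surround + amplified center.
    lo = [a + b + g for a, b, g in zip(l, ls, gc)]
    # Right output: front right + right surround + amplified center.
    ro = [a + b + g for a, b, g in zip(r, rs, gc)]
    return lo, ro


if __name__ == "__main__":
    print(loro_downmix([1.0], [2.0], [3.0], [4.0], [5.0], center_gain=0.5))
```

Note that this downmix adds the surround inputs to the front outputs without any spatial processing, so the rear channels are heard from the front loudspeaker positions.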
Another conventional surround sound virtualizer is shown in FIG. 2. This virtualizer is coupled to receive five input audio signals (left (“L”), center (“C”), and right (“R”) front channels representing L, C, and R front sources, and left-surround (“LS”) and right-surround (“RS”) rear channels representing LS and RS rear sources) and configured to generate a phantom center channel by splitting input from center channel C equally between left and right signals for driving a pair of physical front loudspeakers (positioned in front of a listener). The virtualizer of FIG. 2 is also configured to use virtualizer subsystem 10 in an effort to generate left and right outputs LS′ and RS′ useful for driving the front loudspeakers to emit sound that the listener perceives as reproduced input rear (surround) sound emitting from RS and LS sources behind the listener. More specifically, virtualizer subsystem 10 is configured to generate output audio signals LS′ and RS′ in response to rear channel inputs (LS and RS) including by transforming the inputs in accordance with a head-related transfer function (HRTF). By implementing an appropriate HRTF, virtualizer subsystem 10 can generate a pair of output signals that can be reproduced by two physical loudspeakers located in front of a listener so that the listener perceives the output of the loudspeakers as being emitted from a pair of sources positioned at any of a wide variety of positions (e.g., positions behind the listener's head). The FIG. 2 virtualizer also amplifies the input center signal C in amplifier G, and the amplified output of amplifier G is summed with the input L signal and LS′ output of subsystem 10 to generate the left output (“L′”) for assertion to the left speaker, and is summed with the input R signal and RS′ output of subsystem 10 to generate the right output (“R′”) for assertion to the right speaker.
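The FIG. 2 signal flow can be sketched as follows. The description states only that subsystem 10 transforms the rear inputs in accordance with an HRTF; the particular structure used below (a same-side FIR filter and a cross-side FIR filter per rear channel), the function names, the tap values, and the default center gain are all assumptions made for illustration and are not taken from the figure.

```python
def fir_filter(signal, taps):
    """Causal FIR filter; output is truncated to len(signal)."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


def virtualize_rear(ls, rs, same_side_taps, cross_side_taps):
    """Hypothetical stand-in for subsystem 10: returns (LS', RS').

    Each rear input feeds its own side through same_side_taps and the
    opposite side through cross_side_taps (assumed HRTF-derived filters).
    """
    ls_same = fir_filter(ls, same_side_taps)
    ls_cross = fir_filter(ls, cross_side_taps)
    rs_same = fir_filter(rs, same_side_taps)
    rs_cross = fir_filter(rs, cross_side_taps)
    ls_prime = [a + b for a, b in zip(ls_same, rs_cross)]
    rs_prime = [a + b for a, b in zip(rs_same, ls_cross)]
    return ls_prime, rs_prime


def fig2_style_mix(l, c, r, ls, rs, same_side_taps, cross_side_taps,
                   center_gain=2 ** -0.5):
    """L' = L + G*C + LS' and R' = R + G*C + RS' (G value is assumed)."""
    ls_prime, rs_prime = virtualize_rear(ls, rs, same_side_taps, cross_side_taps)
    l_out = [a + center_gain * b + d for a, b, d in zip(l, c, ls_prime)]
    r_out = [a + center_gain * b + d for a, b, d in zip(r, c, rs_prime)]
    return l_out, r_out


if __name__ == "__main__":
    # Placeholder taps, not measured HRTF data.
    print(fig2_style_mix([1.0, 0.0], [2.0, 2.0], [0.0, 1.0],
                         [0.5, 0.5], [0.25, 0.25], [1.0], [0.0, 0.5]))
```

With `same_side_taps=[1.0]` and `cross_side_taps=[0.0]` the sketch reduces to the FIG. 1 style of downmix, which makes the role of the rear-channel filtering easy to isolate.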
It is conventional for virtual surround systems to use head-related transfer functions (HRTFs) to generate audio signals that, when reproduced by a pair of physical speakers positioned in front of a listener, are perceived at the listener's eardrums as sound from loudspeakers at any of a wide variety of positions (including positions behind the listener). A disadvantage of conventional use of one standard HRTF (or a set of standard HRTFs) to generate audio signals for use by many listeners (e.g., the general public) is that an accurate HRTF for each specific listener depends on characteristics of the listener's head. Thus, accurate HRTFs vary greatly among listeners, and a single HRTF will generally not be suitable for all or many listeners.
If two physical loudspeakers (as opposed to headphones) are used to present a virtualizer's audio output, an effort must be made to isolate the sound from the left loudspeaker to the left ear, and from the right loudspeaker to the right ear. It is conventional to use a cross-talk canceller to achieve this isolation. In order to implement cross-talk cancellation, it is conventional for a virtualizer to implement a pair of HRTFs (for each sound source) to generate outputs that, when reproduced, are perceived as emitting from the source location. A disadvantage of traditional cross-talk cancellation is that the listener must remain in a fixed “sweet spot” location to obtain the benefits of the cancellation. Usually, the sweet spot is a position at which the loudspeakers are at symmetric locations with respect to the listener, although asymmetric positions are also possible.
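The idea behind cross-talk cancellation can be illustrated for a single frequency with a minimal sketch. It assumes a symmetric listening position, so that the acoustic path from each loudspeaker to the same-side ear has complex response `h_same` and the path to the opposite ear has complex response `h_cross`; the canceller is then the inverse of the resulting 2x2 matrix. This is a generic textbook formulation, not necessarily the canceller of any particular virtualizer, and all names and values are hypothetical.

```python
def crosstalk_canceller_bin(h_same, h_cross):
    """Invert the symmetric 2x2 acoustic matrix [[h_same, h_cross],
    [h_cross, h_same]] at one frequency.

    Returns (direct, cross): the inverse is [[direct, cross], [cross, direct]].
    """
    det = h_same * h_same - h_cross * h_cross
    if abs(det) < 1e-12:
        raise ValueError("acoustic matrix is singular at this frequency")
    return h_same / det, -h_cross / det


def apply_symmetric_2x2(direct, cross, left, right):
    """Apply a symmetric 2x2 matrix to a (left, right) pair of values."""
    return direct * left + cross * right, cross * left + direct * right


if __name__ == "__main__":
    h_same, h_cross = 1.0, 0.5          # assumed path responses
    d, x = crosstalk_canceller_bin(h_same, h_cross)
    # Desired ear signals: full level at the left ear, silence at the right.
    spk_l, spk_r = apply_symmetric_2x2(d, x, 1.0, 0.0)
    # Propagate the speaker feeds through the acoustic paths to the ears.
    print(apply_symmetric_2x2(h_same, h_cross, spk_l, spk_r))
```

The inverse is exact only for the assumed path responses. If the listener moves, the actual responses no longer match `h_same` and `h_cross`, which is one way to understand the “sweet spot” limitation noted above.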
Virtualizers can be implemented in a wide variety of multimedia devices that contain stereo loudspeakers (e.g., televisions, PCs, iPod docks) or that are intended for use with stereo loudspeakers or headphones.
There is a need for a virtualizer with low processor speed (e.g., low MIPS) requirements and low memory requirements, and with improved sonic performance. Typical embodiments of the present invention achieve improved sonic performance with reduced computational requirements by using a novel, simplified filter topology.
There is also a need for a surround sound virtualizer which emphasizes virtualized sources (e.g., virtualized surround-sound rear channels) in the mix determined by the virtualizer's output when appropriate (e.g., when the virtualized sources are generated in response to low-level rear source inputs), while avoiding excessive emphasis of the virtual channels (e.g., avoiding virtual rear speakers being perceived as overly loud). Embodiments of the present invention apply dynamic range compression during generation of virtualized surround-sound channels (e.g., virtualized rear channels) to achieve such improved sonic performance during reproduction of the virtualizer output. Typical embodiments of the present invention also apply decorrelation and cross-talk cancellation for the virtualized sources to provide improved sonic performance (including improved localization) during reproduction of the virtualizer output.
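The effect sought from dynamic range compression of the virtualized channels (more emphasis for low-level rear inputs, less for high-level ones) can be illustrated with a generic static compressor curve. This is a minimal sketch under assumed parameters; the threshold, ratio, makeup gain, block-based RMS level detection, and function names are all hypothetical and do not represent the compressor of any specific embodiment.

```python
import math


def compressor_gain_db(level_db, threshold_db=-30.0, ratio=2.0, makeup_db=6.0):
    """Static gain curve in dB.

    Below the threshold the full makeup boost is applied; above it the gain
    is reduced so the output level rises by only 1/ratio dB per input dB.
    """
    over = max(0.0, level_db - threshold_db)
    return makeup_db - over * (1.0 - 1.0 / ratio)


def compress_block(samples, **curve):
    """Apply one gain to a block of samples, based on the block's RMS level."""
    if not samples:
        return []
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    level_db = 20.0 * math.log10(max(rms, 1e-9))   # floor avoids log of zero
    gain = 10.0 ** (compressor_gain_db(level_db, **curve) / 20.0)
    return [gain * x for x in samples]


if __name__ == "__main__":
    for level in (-40.0, -30.0, -18.0, -10.0):
        print(level, compressor_gain_db(level))
```

With these assumed settings a quiet rear-channel block is boosted by the full makeup gain, while a loud block is attenuated, so the virtual rear sources remain audible at low levels without becoming overly loud at high levels.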