Humans detect and process information arriving through a number of different channels. After vision, hearing may contribute most heavily to one's perception of one's environment. The human auditory system is remarkably discriminating, and though it often fares poorly in comparisons with the hearing of other animals, people can detect subtle cues in an audio signal and use them to make inferences about their surroundings, even when those surroundings cannot be seen. The detection and inference occur largely subconsciously, so a carefully prepared audio program can provide an extremely compelling and visceral experience for a listener.
Virtual reality and game applications can be greatly enhanced by an accurate audio rendering of a simulated environment. Unfortunately, producing a high-resolution, multi-channel audio stream that models the interaction of sounds from various sources with surfaces, spaces and objects in the simulated environment can be at least as computationally expensive as producing a sequence of high-resolution visual images of the same environment. For example, FIG. 2 shows a plan view of a simple, two-room environment, with a sound source 210 in one room 220 and a listener 230 in the other room 240. Like an optical ray tracer for rendering photo-realistic visual images, an “audio renderer” could compute the aggregate sound signal arriving at listener 230 by following sound or compression waves 250, 260, 270 emanating from the source 210, reflecting off walls and objects, and eventually arriving at the listener 230. Since the speed of sound is low compared to the speed of light, an audio-realistic rendition must account for propagation delays along various paths. These delays are perceived as phase differences and echoes, or more generally, reverberations.
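The role of propagation delay can be illustrated with a minimal sketch. It assumes a speed of sound of about 343 m/s (dry air near 20 °C) and a 48 kHz sample rate; the function name and the three path lengths are hypothetical values chosen for illustration and are not taken from FIG. 2.

```python
# Approximate speed of sound in dry air at about 20 degrees C (assumption).
SPEED_OF_SOUND_M_PER_S = 343.0


def propagation_delay_samples(path_length_m, sample_rate_hz=48_000):
    """Return the delay, in whole samples, for sound travelling path_length_m."""
    delay_s = path_length_m / SPEED_OF_SOUND_M_PER_S
    return round(delay_s * sample_rate_hz)


# Hypothetical path lengths in meters: one short path and two longer,
# reflected paths between a source and a listener.
path_lengths_m = {"path_a": 6.0, "path_b": 9.5, "path_c": 14.0}

delays = {
    name: propagation_delay_samples(length)
    for name, length in path_lengths_m.items()
}

for name, delay in delays.items():
    print(f"{name}: {path_lengths_m[name]:.1f} m -> {delay} samples")
```

Under these assumptions, a few meters of additional path length already corresponds to hundreds of samples of delay, which is why the longer paths are heard as distinct echoes rather than merging with the first arrival.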
One can easily imagine that the computational burden of producing a continuous high-quality stream of audio signals would overwhelm contemporary processing capabilities. Samples at a rate of 44.1 kHz or 48 kHz, for multiple channels, based on many audio sources at different locations relative to the listener, and within a complex and dynamic environment, translate to enormous volumes of data. Moreover, a sound produced at a first time may echo, reverberate and linger to affect the audio scene for several seconds. Less computationally expensive approaches are essential for real-time simulations.
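A rough sense of scale can be had from a back-of-the-envelope estimate. The sketch below assumes a brute-force model that is not described in the text: each source-to-channel path is treated as a direct convolution whose impulse response is as long as the reverberation tail. The function name, the 3-second tail, and the counts of 16 sources and 6 channels are all hypothetical.

```python
def direct_convolution_macs_per_second(sample_rate_hz, tail_seconds,
                                       sources, channels):
    """Estimate multiply-accumulates per second for brute-force convolution.

    Assumes every source/channel pair is convolved with an impulse
    response lasting tail_seconds (an illustrative model only).
    """
    taps = int(sample_rate_hz * tail_seconds)   # impulse response length
    macs_per_pair = sample_rate_hz * taps       # one pair, one second of audio
    return macs_per_pair * sources * channels


# Hypothetical scene: 48 kHz, 3 s tail, 16 sources, 6 output channels.
estimate = direct_convolution_macs_per_second(48_000, 3.0,
                                              sources=16, channels=6)
print(f"{estimate:.3e} multiply-accumulates per second")
```

Even before any path tracing is done to find the impulse responses, this model requires several billion operations per second for a single source and channel, and the cost grows linearly with the number of sources, channels, and seconds of tail.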
FIG. 3 shows a signal-processing network that can produce an adequate audio simulation for some environments, with only a fraction of the processing required for a full audio-realistic rendering. An input signal 110 enters the network, and a portion is passed directly through as “dry” signal 340. Several delayed versions of the signal 365, produced by delay lines of varying lengths (not shown) or by different “taps” on a single delay line 360, simulate discrete echoes produced when the input signal reflects off a wall or object and travels to the listener. In addition, feedback-delay network (“FDN”) 350 receives the delayed input signal and produces a diffuse, exponentially-decaying reverberation “tail” signal that resembles the indistinct, colored noise a listener perceives after the sound and its primary, distinct echoes have died away. The discrete echo signals 365 may be attenuated by amplifier/attenuators 370 and/or filtered by filters 380 to simulate different atmospheric conditions and reflective surfaces. The FDN output and the dry signal may also have their amplitude or spectral distribution adjusted. Finally, the dry signal, FDN output, and discrete echo signals are combined and distributed through a panning module 390 to prepare them for distribution and replay through a multi-channel speaker system.
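The general structure of such a network can be sketched in a few lines. The following is a simplified, single-channel illustration and not the network of FIG. 3 itself: the function names, tap delays, gains, FDN delay lengths, and the choice of a Householder feedback matrix are all assumptions, and the filters and panning stage are omitted for brevity.

```python
def fdn_tail(x, delays=(149, 211, 263, 293), feedback_gain=0.7):
    """Small feedback-delay network producing a decaying reverberation tail.

    Uses a Householder feedback matrix (I - (2/N) * ones), which is
    orthogonal, so a feedback_gain below 1 yields a decaying response.
    """
    n_lines = len(delays)
    buffers = [[0.0] * d for d in delays]   # one circular buffer per line
    pos = [0] * n_lines
    out = []
    for sample in x:
        # Read the oldest sample of each line (written delays[i] steps ago).
        outs = [buffers[i][pos[i]] for i in range(n_lines)]
        out.append(sum(outs))
        total = sum(outs)
        for i in range(n_lines):
            mixed = outs[i] - (2.0 / n_lines) * total   # Householder mixing
            buffers[i][pos[i]] = sample + feedback_gain * mixed
            pos[i] = (pos[i] + 1) % delays[i]
    return out


def render_reverb(x, dry_gain=1.0, taps=((441, 0.5), (977, 0.3)),
                  predelay=1200, tail_gain=0.25):
    """Combine a dry signal, discrete tapped echoes, and an FDN tail."""
    n = len(x)
    dry = [dry_gain * s for s in x]

    # Discrete echoes: attenuated copies of the input at fixed tap delays.
    early = [0.0] * n
    for delay, gain in taps:
        for i in range(delay, n):
            early[i] += gain * x[i - delay]

    # The FDN is fed a delayed copy of the input.
    delayed = ([0.0] * predelay + list(x))[:n]
    tail = fdn_tail(delayed)

    return [dry[i] + early[i] + tail_gain * tail[i] for i in range(n)]


# Usage: the response of the sketch to a unit impulse.
impulse = [1.0] + [0.0] * 3999
wet = render_reverb(impulse)
```

Feeding the sketch a unit impulse shows the three components in order: the dry sample at index 0, the two discrete echoes at the tap delays, and then the denser, decaying tail that begins once the pre-delayed input has passed through the shortest FDN delay line. The per-sample cost is fixed by the number of taps and delay lines, not by the length of the tail.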
The network shown in FIG. 3 produces a reasonable simulation of echoes and reverberations in simple, closed environments, but its effect is unconvincing for more complex audio environments with large, interconnected and/or unbounded resonating cavities. Techniques for efficiently producing convincing multi-channel audio reverberation effects in complicated environments may be of value.