Recent developments in audio technology have focused on creating real-time interactive positioning of sounds anywhere in the three-dimensional space surrounding a listener (the “sound field”). True interactive audio provides not only for the ability to create the sound on-demand but the ability to position that sound precisely in the sound field. Support for such technologies can be found in a variety of products but most frequently in software for video games to create a natural, immersive, and interactive audio environments. Applications extend beyond gaming into the entertainment world in the form of audiovisual products such as DVD, and also into video-conferencing, simulation systems and other interactive environments.
Advances in audio technology have proceeded in the direction of making the audio environment “real” to a listener. Surround sound developments followed, first in the analog domain with the development of HRTFs, Dolby Surround and later in the digital domain with AC-3, MPEG, and DTS to immerse a listener in the surround sound environment.
In order to portray a realistic synthetic environment, virtual sound systems use binaural technology and psychoacoustic cues to create the illusion of surround audio without the need for multiple speakers. The majority of these virtualized 3D audio technologies are based on the concept of HRTFs (Head-Related Transfer Functions). The original digitized sound is convolved in real-time with the left- and right-ear HRTFs corresponding to the desired spatial location, right- and left-ear binaural signals are produced, which when heard seem to come from the desired location. To position the sound, the HRTFs are changed to those for the desired new location and the process repeated. A listener can experience nearly free-field listening through headphones if the audio signals are filtered with that listener's own HRTFs. However, this is often impractical and experimenters have searched for a set of general HRTFs that have good performance for a wide range of listeners. This has been difficult to accomplish with a specific obstacle being front-back confusion, which describes the sensation that sounds either in front of or behind the head are coming from the same direction. Despite its drawbacks, HRTF methods have been successfully applied to both PCM audio and with much lessened computational load to compressed MPEG audio. Although virtual surround technologies based on HRTFs provide significant benefits in situations where full home theater set-ups are not practical these current solutions do not provide any means for interactive positioning of specific sounds.
The Dolby Surround system is another method to implement positional audio. Dolby Surround is a matrix process that enables a stereo (two-channel) medium to carry four-channel audio. The system takes four-channel audio and generates two channels of Dolby Surround encoded material identified as left total (Lt) and right total (Rt). The encoded material is decoded by a Dolby Pro-Logic decoder producing a four-channel output; a left channel, a right channel, a center channel and a mono surround channel. The center channel is designed to anchor voices at the screen. The left and right channels are intended for music and some sound effects, with the surround channel primarily dedicated to the sound effects. The surround sound tracks are pre-encoded in Dolby Surround format, and thus they are best suited for movies, and are not particularly useful in interactive applications such as video games. PCM audio can be overlaid on the Dolby Surround sound audio to provide a less controllable interactive audio experience. Unfortunately, mixing PCM with Dolby Surround Sound is content dependant and overlaying PCM audio on the Dolby Surround sound audio tends to confuse the Dolby Prologic decoder, which can create undesirable surround artifacts and crosstalk.
To improve channel separation digital surround sound technologies such as Dolby Digital and DTS provide six discrete channels of digital sound in a left, center and right front speakers along with separate left surround and right surround rear speakers and a subwoofer. Digital surround is a pre-recorded technology and thus best suited for movies and home A/V systems where the decoding latency can be accommodated and in its present form it is not particularly useful for interactive applications such as video games. However, since Dolby Digital and DTS provide high fidelity positional audio, have a large installed base of home theater decoders, definitions for a multi-channel 5.1 speaker format and product available for market, they present a highly desirable multi-channel environment for PCs and in particular console based gaming systems if they could be made fully interactive. However, the PC architecture has generally been unable to deliver multi-channel digital PCM audio to home entertainment systems. This is primarily due to the fact that the standard PC digital output is through a stereo based S/PDIF digital out put connector.
Cambridge SoundWorks offers a hybrid digital surround/PCM approach in the form of the DeskTop Theater 5.1 DTT2500. This product features a built-in Dolby Digital decoder that combines pre-encoded Dolby Digital 5.1 background material with interactive four-channel digital PCM audio. This system requires two separate connectors; one to deliver the Dolby Digital and one to deliver the 4-channel digital audio. Although a step forward, Desktop Theater is not compatible with the existing installed base of Dolby Digital decoders and requires sound cards supporting multiple channels of PCM output. The sounds are reproduced from the speakers located at known locations, but the goal in an interactive 3D sound field is to create a believable environment in which sounds appear to originate from any chosen direction about the listener. The richness of the DeskTop Theater's interactive audio is further limited by the computation requirements needed to process the PCM data. Sideways localization, which is a critical component of a positional audio environment is computationally expensive to apply on time-domain data, as are the operations of filtering and equalization.
The gaming industry needs a low cost fully-interactive low latency immersive digital surround sound environment suitable for 3D gaming and other interactive audio applications that allows the gaming programmer to mix a large number of audio sources and to precisely position them in the sound field and which is compatible with the existing infrastructure of home theater Digital Surround Sound systems.