There is growing interest in improving methods and systems for audio displays that present audio signals conveying accurate impressions of three-dimensional sound fields. Such audio display systems use techniques that model the transfer of acoustic energy in a sound environment from one point to another. The realism of an acoustic display can be enhanced by including ambient effects. One important effect is caused by reflections. A listener hears the sound not only directly from the source but also as reflections of the sound from nearby objects. In most environments, a sound field comprises sound waves arriving at a particular point, such as a listener's ear, along a direct path from the sound source and along paths reflecting off one or more surfaces of walls, floor, ceiling, and other objects. Sounds can be heard not only as emanating from a sound source, but also as they are reflected off walls, leak through doors from an adjoining room, are occluded as they disappear around a corner, or suddenly appear overhead as a listener steps into the open from a room.
Once a sound wave has been emitted, it travels through an environment where several things can happen. The sound can travel directly to the listener (direct path), bounce off an object once and then reach the listener (first order reflected path), bounce off two surfaces before reaching the listener (second order reflected path), and so on. Second and higher order reflections usually combine to form late field reflections, or reverb. The direction of arrival for a reflection is generally not the same as that of the direct path sound wave. The propagation path of a reflected sound wave is longer than that of a direct path sound wave, so reflections arrive later. In addition, the amplitude and spectral content of a reflection will generally differ from those of the direct path sound because of the energy absorbing qualities of the reflective surfaces. Reflections add to the naturalness and immersiveness of the sound field and provide cues to the size, shape, and composition of the acoustic environment.
In addition to the variable propagation delay of reflections, the time at which sounds are heard by a right ear and left ear of a listener varies based on the location of the source of the sound due to interaural time difference (ITD). Interaural time difference refers to the fact that a sound will typically arrive earlier at one ear than at the other ear. If the sound arrives at the left ear first, for example, the listener's brain knows that the sound is somewhere to the left.
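As a rough illustration of how an interaural time difference might be turned into per-ear delays, the following Python sketch uses Woodworth's spherical-head approximation, ITD = (a/c)(θ + sin θ). The formula choice, the head radius, the speed of sound, the 48 kHz default rate, and the function names `itd_seconds` and `ear_delays` are all assumptions made for this example and are not taken from the text above.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature
HEAD_RADIUS = 0.0875    # m, assumed average head radius


def itd_seconds(azimuth_deg):
    """Approximate ITD for a source at the given azimuth.

    Azimuth 0 is straight ahead, +90 is directly to the right, and
    -90 is directly to the left. Intended for -90..90 degrees only.
    A positive result means the right ear hears the sound first.
    """
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))


def ear_delays(azimuth_deg, sample_rate=48000):
    """Return (left_delay, right_delay) in samples.

    The ear nearer the source gets zero extra delay; the farther ear
    is delayed by the ITD expressed in (possibly fractional) samples.
    """
    itd = itd_seconds(azimuth_deg) * sample_rate
    if itd >= 0:
        return (itd, 0.0)   # source on the right: left ear lags
    return (0.0, -itd)      # source on the left: right ear lags
```

Under these assumptions, a source directly to one side yields an ITD of roughly two thirds of a millisecond, which is about 31 samples at 48 kHz.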
The material from which the reflecting object is made affects the way the sound reflects off and transmits through an object. Each time a sound is reflected off of an object, the material of the object has an effect on how much each frequency component of the sound wave is absorbed, and how much is reflected back into the environment. For example, a carpeted room sounds very different from a glass room. An object's material characteristics can be measured empirically by recording known sounds as they bounce off of materials and modeled as a gain value, for example. Wall surface materials and acoustic space geometries are typically stored in a database for use by a sound processor.
Sound processors are designed to simulate the acoustics of an environment relative to a listener. The processor simulates direct path propagation, reflections, and other acoustic effects. For example, effects of reflection and ITD may be synthesized by appropriately delaying the source signal. Individual reflections are typically modeled as copies of an original signal modified with appropriate spectral, positional, and temporal cues. The output is the summation of the individual reflections, direct paths, and other acoustic effects. An example is the simulation of a person talking inside a rectangular room having carpeted walls. The signals include a direct path signal and six first-order reflections (one for each of the four walls, the floor, and the ceiling). Propagation distance and direction of arrival for the seven signals are determined from information about the acoustic space, including room geometry and source and listener locations. In order to simulate the different propagation distances, each signal is delayed by an amount proportional to its propagation distance. Amplitude and spectral cues are added to each signal for propagation effects such as distance, attenuation, and atmospheric absorption. Gain, delay, and spectral effects are added to each signal to provide localization cues based on the direction of arrival of the sound. The pitch of the signals may also vary due to Doppler effects when the listener or source is moving. Reflections also have amplitude and spectral cues added to them based on the reflective properties of the walls. All of these added cues may change continuously due to changes in the simulation or environment (e.g., a change in position of the source or listener). The output is a summation of the direct path and six reflections, each having different delays, gains, pitch, and spectral effects, which together produce the perception of a person talking inside the modeled room.
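The rectangular room example can be sketched with the image-source technique, in which each first-order reflection is treated as a direct path from a mirror image of the source behind one wall. This is a minimal Python sketch under stated assumptions, not the method of any particular processor: the function name `first_order_paths`, the single broadband `wall_gain` of 0.7, the 1/distance amplitude falloff, and the 343 m/s speed of sound are hypothetical choices made for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def first_order_paths(room, source, listener, wall_gain=0.7):
    """List the direct path and six first-order reflections.

    room     -- (Lx, Ly, Lz): a box spanning 0..L on each axis
    source   -- (x, y, z) position of the sound source inside the box
    listener -- (x, y, z) position of the listener inside the box
    wall_gain is an assumed broadband reflection gain per bounce.
    """
    # The direct path uses the real source with no reflection loss.
    candidates = [("direct", tuple(source), 1.0)]
    # Mirror the source across each of the six bounding planes.
    for axis in range(3):
        for wall in (0.0, float(room[axis])):
            image = list(source)
            image[axis] = 2.0 * wall - source[axis]
            candidates.append((f"axis{axis}@{wall}", tuple(image), wall_gain))

    paths = []
    for name, position, gain in candidates:
        distance = math.dist(position, listener)
        paths.append({
            "name": name,
            "distance": distance,
            # Delay is proportional to the propagation distance.
            "delay_s": distance / SPEED_OF_SOUND,
            # Assumed 1/distance spreading loss times the wall gain.
            "gain": gain / max(distance, 1e-6),
        })
    return paths
```

For instance, `first_order_paths((4, 5, 3), (1, 1, 1), (3, 1, 1))` returns seven entries; the direct path is 2 m long and every reflected path is longer, so each reflection receives a larger delay and a smaller gain.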
Conventional audio processors provide the variable delays used to simulate propagation distances by positioning taps (a, b, c) at different locations along a delay line buffer B, located on a host computer, for example (FIG. 1). The input data D enters at the left of the buffer B (as viewed in FIG. 1) and, as the data moves to the right, a signal is first output at tap a for the direct path signal after a first delay, to allow for propagation of the signal from the source to the listener. A signal is next output after a second delay at tap b to model a first reflection, and after a third delay, a signal is output at tap c to model a second reflection. As shown in FIG. 1, the output sound signal is created by the summation of the direct path and reflection signals. Each time the sound source or listener moves, the location of each tap must also be moved to either increase or decrease the delay to compensate for changes in propagation path distances between the sound source and the listener. The location of a tap can vary significantly from its original position. Interpolation is required to move the taps smoothly without audible artifacts. Interpolator output is typically calculated over a window of data samples centered at the desired delay location. Interpolation quality improves as the window width increases, but with a proportional increase in computational cost.
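The tapped delay line arrangement described above can be sketched in Python as a circular buffer read at several delays, with the tap outputs summed. The sketch assumes the simplest possible interpolator, a two-sample linear interpolation, in place of the wider windows mentioned in the text; the class name `DelayLine`, the function `render`, and the (delay, gain) tap format are hypothetical.

```python
class DelayLine:
    """Circular delay buffer with fractional-delay taps."""

    def __init__(self, max_delay_samples):
        # Two extra slots: one for the newest sample and one for the
        # second interpolation neighbor at the maximum delay.
        self.buf = [0.0] * (max_delay_samples + 2)
        self.write_pos = 0

    def push(self, sample):
        """Write one new input sample into the buffer."""
        self.buf[self.write_pos] = sample
        self.write_pos = (self.write_pos + 1) % len(self.buf)

    def tap(self, delay):
        """Read the signal `delay` samples in the past.

        `delay` may be fractional; the value is linearly interpolated
        between the two stored samples that bracket it.
        """
        whole = int(delay)
        frac = delay - whole
        size = len(self.buf)
        newest = self.write_pos - 1
        near = self.buf[(newest - whole) % size]
        far = self.buf[(newest - whole - 1) % size]
        return (1.0 - frac) * near + frac * far


def render(signal, taps, max_delay):
    """Sum the outputs of all taps for each input sample.

    taps is a list of (delay_in_samples, gain) pairs, for example one
    pair for the direct path and one per reflection.
    """
    line = DelayLine(max_delay)
    out = []
    for x in signal:
        line.push(x)
        out.append(sum(gain * line.tap(delay) for delay, gain in taps))
    return out
```

Feeding a unit impulse through three taps, `render([1, 0, 0, 0, 0, 0], [(0, 1.0), (2, 0.5), (3.5, 0.25)], 4)`, gives `[1.0, 0.0, 0.5, 0.125, 0.125, 0.0]`: the fractional tap at 3.5 samples spreads its energy across two output samples. Moving a tap amounts to changing its delay value between calls, and a wider interpolation window would reduce artifacts at the cost of more multiplies per tap.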
The computational cost of performing interpolation, as well as acoustic processing for propagation, reflection, and localization effects, is significant when a large number of reflections must be rendered. While this processing may be performed using special purpose hardware, the amount of special purpose memory required to store the delay lines is high. For example, a one-half second delay at a 48 kHz sampling rate requires the storage of 24,000 samples.
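The storage figure follows from multiplying the delay length by the sampling rate. The short Python sketch below reproduces that arithmetic; the function names and the 2 bytes per sample (16-bit audio) used for the byte count are assumptions for illustration, since the text does not state a sample width.

```python
def delay_line_samples(delay_seconds, sample_rate_hz):
    """Number of samples needed to hold the given delay."""
    return round(delay_seconds * sample_rate_hz)


def delay_line_bytes(delay_seconds, sample_rate_hz, bytes_per_sample=2):
    """Memory for one delay line; bytes_per_sample is an assumed width."""
    return delay_line_samples(delay_seconds, sample_rate_hz) * bytes_per_sample
```

Here `delay_line_samples(0.5, 48000)` evaluates to 24000, matching the example, and under the assumed 16-bit width a single such delay line would occupy 48,000 bytes.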
There is, therefore, a need for a system and method of efficiently rendering sound reflections in special purpose hardware with limited memory requirements.