1. Field of the Invention
The invention relates to artificial reverberation and ambiance systems that create the illusion of real acoustic spaces, and more specifically to systems for simulating more accurately the natural statistics of a physical reverberation process.
2. Background of the Invention
In an acoustic space, the sound travels from the source to the listener via many different paths. The direct path, referred to as the “dry” signal, corresponds to the signal of an anechoic space, an outdoor performance, or a close microphone. The indirect path, referred to as the “wet” signal, bounces off the walls or other surfaces multiple times and appears at the listener delayed, attenuated, and spectrally modified. The process continues with more and more reflections arriving at the listener later and later. Thus, all reverberation processes can be thought of as a rapidly increasing series of reflections or echoes. In audio signal processing, “ambiance” is the general sense of creating an illusion of a space. “Reverberation” is more specifically the decay process once the process has become so complex that only a statistical representation is useful. Ambiance includes reverberation and spatial location of the listener. Since the earliest days of radio broadcasting and recorded music, the need for artificial reverberation and spatial ambiance has been well known. When a microphone is placed close to a performer to avoid picking up background noise, very little of the environmental reflections appear in the signal. Reverberation, created by an artificial system, is added to the microphone signal to recreate the perception of the acoustic space.
Signal processing systems often fail to accurately duplicate natural acoustics for both practical and theoretical reasons. The echo pattern or impulse response is extremely complex and only the early part of the process can be characterized in detail. An impulse response measurement decays into the noise level because the amount of source energy is limited to a fraction of the atmospheric pressure. Averaging cannot be used because of the lack of thermal equilibrium, which causes minor changes in the speed of sound. For the same reason, frequency response measurements never reach a stable steady state. Also, the computational burden on ray tracing algorithms grows exponentially. Moreover, the detailed echo pattern is different for each seat in every space and the physics of a 3-dimensional space is fundamentally different from the 1-dimensional properties of signal processing. Along a given axis, the speed of sound varies as a function of the angular orientation, giving a sound wave two extra degrees of freedom. In a signal processing system, signals travel at a fixed rate through all delay lines. Fortunately, the physical complexity of the process exceeds the human ability to perceive minor differences. As a result, there is a large body of research that teaches which attributes are perceptible.
Because of these factors, artificial systems attempt to create the perceptual illusion of a real acoustic space without trying to be physical simulations thereof. Large spaces have been better analyzed in terms of statistics rather than in terms of an explicit impulse response. When an artificial system has similar statistics, the illusion is good; when those statistics differ, the illusion is weak. However, the prior art shows a tendency to use either an artistic or a purely mathematical approach. There is very little prior work that provides a formalism that can relate the statistical properties of an artificial system to the perceptually relevant properties of acoustic performance spaces.
The current generation of artificial reverberation systems is based on a few digital signal processing primitives including delay lines, multipliers, and adders. From these primitives, more complex elements are created. A comb filter, which is implemented with a delay line having a feedback gain of less than 1, derives its name from the fact that the frequency response is comb shaped. An allpass filter, which is a comb filter combined with a feedforward path, has a flat frequency response, hence the name allpass. An energy transmission network, which is composed of a large delay line, holds and transmits the energy between different parts of the system. These transmission networks are directly analogous to a linear path through cubic sections of air in a real acoustic space, which also hold and transmit acoustic energy. Other elements, such as filters, are added to provide the more subtle attributes such as high frequency absorption of wall surfaces and the air itself.
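By way of illustration only, the comb and allpass elements described above can be sketched from the three primitives (delay, multiply, add). The difference equations below are the common textbook forms and are an assumption here, as are the function names, the 4-sample delay, and the gain of 0.5; none of these values is taken from any particular prior art system.

```python
def comb(x, delay, g):
    """Feedback comb filter (assumed form): y[n] = x[n] + g * y[n - delay], |g| < 1."""
    y = []
    for n, xn in enumerate(x):
        fed_back = y[n - delay] if n >= delay else 0.0  # delay line output
        y.append(xn + g * fed_back)                     # adder and multiplier
    return y


def allpass(x, delay, g):
    """Allpass (assumed form): y[n] = -g * x[n] + x[n - delay] + g * y[n - delay]."""
    y = []
    for n, xn in enumerate(x):
        x_delayed = x[n - delay] if n >= delay else 0.0
        y_delayed = y[n - delay] if n >= delay else 0.0
        # The -g * x[n] term is the feedforward path added to the comb structure.
        y.append(-g * xn + x_delayed + g * y_delayed)
    return y


# Excite both elements with a unit impulse (illustrative values only).
impulse = [1.0] + [0.0] * 11
comb_out = comb(impulse, delay=4, g=0.5)
allpass_out = allpass(impulse, delay=4, g=0.5)
```

With these illustrative values, the comb output is a train of echoes of amplitude 1, 0.5, 0.25 spaced 4 samples apart. For the allpass, the sum of the squared output samples approaches 1 as the response is extended, which is consistent with a flat magnitude response.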
Historically, reverberation systems were based on one of two basic topologies. One of these topologies uses a multiplicity of comb filters fed in parallel from a series of allpass filters. The other topology uses a single large loop composed of a multiplicity of delays, lowpass filters, and allpass filters. About 15 years ago, these structures were represented as static mixers using a matrix notation. Not only can the matrix notation represent all of the basic reverberation topologies by selecting appropriate numbers, but the notation can also be used to create other topologies as long as the numbers are constrained to a set of mathematical rules. The matrix column vectors should have unity magnitude and each of these vectors should be orthogonal to all the others. Any set of numbers that satisfies these rules is said to be a unitary orthogonal matrix. However, there is an infinity of sets of matrix numbers that satisfy the mathematical rules. The prior art does not teach a selection criterion. Constraining the mixer coefficients to be consistent with these rules is necessary, but not sufficient, to create a high quality reverberation system because the rules ignore additional perceptual issues that are unrelated to the mathematical formalisms. The various necessary perceptual optimizations often conflict with one another; optimizing one often de-optimizes another.
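The two mathematical rules stated above (unity-magnitude columns, mutually orthogonal columns) can be checked numerically. The following is a minimal sketch. The helper names and the two example matrix families (a plane rotation and a Householder reflection) are assumptions chosen for illustration and are not drawn from the prior art. The rotation example also shows why the rules admit an infinity of solutions: every angle yields a valid mixer.

```python
import math


def is_orthonormal(m, tol=1e-12):
    """True if every column of square matrix m has unit length and
    every pair of distinct columns has a zero dot product."""
    n = len(m)
    for i in range(n):
        for j in range(i, n):
            dot = sum(m[r][i] * m[r][j] for r in range(n))
            target = 1.0 if i == j else 0.0
            if abs(dot - target) > tol:
                return False
    return True


def rotation(theta):
    """2x2 rotation matrix; satisfies the rules for every angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def householder(n):
    """n x n Householder reflection I - (2/n) * ones (illustrative choice)."""
    return [[(1.0 if r == c else 0.0) - 2.0 / n for c in range(n)]
            for r in range(n)]


valid_rotations = all(is_orthonormal(rotation(0.1 * k)) for k in range(60))
valid_householder = is_orthonormal(householder(4))
invalid_example = is_orthonormal([[1.0, 1.0], [0.0, 1.0]])  # columns not orthogonal
```

Passing this check establishes only the necessary mathematical condition; it says nothing about the perceptual quality of the resulting mixer.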
Another problem of the prior art is that it is limited to using a relatively small number of energy transmission networks because of their high economic burden. It is essentially impossible to fill these networks uniformly because statistical averaging requires a large number of such elements. In a real acoustic space, the energy density becomes uniform after the reverberation process has continued for a modest amount of time. Artificial systems show a much weaker tendency to produce a uniform energy distribution and, hence, often produce energy periodicity in the reverberation, which is perceptible to the listener.
Because there are different classes of defects in reverberators, it is useful to first consider two extreme classes of sound: broadband pulses and narrow band continuous signals. The former is typical of a pluck on a guitar or a bang of wood blocks, while the latter is typical of a flute, organ and other instruments that have a long steady state. Most music falls between these two cases. In discussing reverberation systems, professional audio engineers will often refer to the system behavior when excited either by an impulse or by a steady state sinewave, representing the two extreme cases. A defective reverberation system will show undesirable properties with one or both of these cases. Typically, an audio engineer first looks for the perceptual smoothness of the reverberation tail as the dominant quality criterion. Second, the engineer looks for spectral coloration in the tail: does the reverberation have the same spectral content as the original? The untrained listener does not detect these defects explicitly but has the sense that the reverberation is not quite right. Professional sound engineers are, however, very sensitive to even the most minor defects. Experts in the field can catalog dozens of critical cases that form the tool chest for evaluation. The fundamental difficulty in creating the illusion of reverberation derives from the fact that optimizing one set of properties often de-optimizes others.
The prior art has a strong proclivity to describe artificial reverberation systems as being a complex linear and time-invariant filter. The linearity property dictates that scaling the input by a factor will scale the output by the same factor. The time-invariant property dictates that shifting the input in time will shift the output by the same amount. A similar approach has been rejected when applied to a large acoustic space, such as a concert hall, because it is very misleading and unproductive. It is useful only in the degenerate example of a small and rectangular shaped acoustic space. A concert hall might easily have more than 100,000 resonances and 100,000 discrete echoes. Scientists therefore often use a statistical notation that talks about the frequency response in terms of its average, standard deviation, slope rates, etc. The reverberation decay is described in terms of the spectrum of the amplitude envelope and spectrum of the phase changes. An impulse response can be described in terms of the spectrum of energy variation within a 1 msec. time window. The scientific literature shows that there have been some notable successes in mapping the statistical metrics to the perceptual properties.
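The scaling and time-shift properties described above can be demonstrated on a small recursive system. The sketch below is illustrative only: the test system (a single delay with feedback), its parameters, and the input samples are all hypothetical, and only the scaling aspect of linearity named in the text is checked.

```python
def system(x, delay=3, g=0.5):
    """Hypothetical test system: y[n] = x[n] + g * y[n - delay], zero initial state."""
    y = []
    for n, xn in enumerate(x):
        y.append(xn + (g * y[n - delay] if n >= delay else 0.0))
    return y


# Arbitrary illustrative input, padded with trailing zeros.
x = [1.0, -2.0, 0.5, 0.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0]
y = system(x)

# Scaling: multiplying the input by a multiplies the output by a.
a = 4.0
scaled = system([a * v for v in x])
linear_ok = all(abs(s - a * v) < 1e-12 for s, v in zip(scaled, y))

# Time invariance: delaying the input by k samples delays the output by k samples.
k = 2
shifted = system([0.0] * k + x[:-k])
time_invariant_ok = all(abs(s - v) < 1e-12 for s, v in zip(shifted[k:], y[:-k]))
```

A system whose delay or gain is varied during operation would, in general, fail the second check.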
In contrast, artificial systems are generally very limited in their complexity. Audio engineers have traditionally not described the reverberation response in terms of statistics but have stayed with the deterministic notion of a linear, time-invariant construct. Some reverberation designers abandon all structured methods and resort to a purely artistic creation process. Because artificial systems are generally built out of fewer than 20 network modules with only some 50 free parameters, there is no obvious method to incorporate a statistical approach into the design process. There are simply not enough degrees of freedom. Consider, for example, how the resonance density behaves when excited with a narrow band musical note. A large concert hall can easily have a density of well over 10 resonances per Hz, whereas an artificial system might have only 0.3 resonances per Hz. When the density is extremely high, many excited resonances contribute to the response, resulting in a random envelope and phase response. With only 2 excited resonances, the envelope will have a characteristic beat tone at a frequency equal to the difference between the two resonances, which sounds very unnatural. This problem has been intuitively understood but has not been extensively studied. Historically, the solution has involved some kind of isolated parameter randomization. For example, a delay can be slowly changed, or a delay output can be panned between random values. All of these methods have limited utility and some negative artifacts. Very few methods work within the main recirculation processing because any artifact, such as needle generation, will also recirculate. The prior art has not solved these problems. The prior art generally ignores the subject of statistical randomization, even though it is critically important.
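The beat effect described above for two excited resonances follows from the identity sin A + sin B = 2 sin((A + B)/2) cos((A - B)/2). The sum of two equal-amplitude tones at f1 and f2 is a tone at the mean frequency whose envelope, 2|cos(pi (f2 - f1) t)|, repeats at the difference frequency. The sketch below checks this numerically. The frequencies 440 Hz and 443 Hz and the equal amplitudes are assumptions chosen for illustration; a 3 Hz spacing is of the same order as the average spacing of about 3.3 Hz implied by a density of 0.3 resonances per Hz.

```python
import math

# Hypothetical pair of excited resonances (illustrative values only).
f1, f2 = 440.0, 443.0
beat_hz = abs(f2 - f1)  # envelope repetition rate, in Hz


def two_resonances(t):
    """Sum of two equal-amplitude sinusoids at f1 and f2, at time t in seconds."""
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)


def envelope(t):
    """Amplitude envelope 2 * |cos(pi * (f2 - f1) * t)| from the sum-to-product identity."""
    return abs(2 * math.cos(math.pi * (f2 - f1) * t))


def product_form(t):
    """Same signal written as carrier times slowly varying beat term."""
    return 2 * math.sin(math.pi * (f1 + f2) * t) * math.cos(math.pi * (f2 - f1) * t)


# The envelope reaches a null half a beat period after t = 0.
first_null = 1.0 / (2.0 * beat_hz)
```

With these values the envelope falls to zero every 1/3 second. With many closely spaced resonances of random phase, no such regular pattern emerges.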