1. Field of the Invention
This invention is related to the field of computer systems which perform sound synthesis and, more particularly, to a computer system which generates delay-based sound effects by using system memory to perform the function of a delay element.
2. Description of the Relevant Art
Personal computer (PC) audio systems have traditionally employed a technique called Frequency Modulation (FM) synthesis to generate audio sounds. FM synthesis works by combining the outputs of multiple sine wave oscillators which are relatively close in frequency to produce complex sound waves with close-to-natural timbres, attacks and delays. An advantage of FM synthesis is that it is relatively inexpensive to implement. A disadvantage is that FM synthesized sounds are generally recognizable as synthesized sounds.
A new music synthesis method, wavetable music synthesis, has the advantage of producing more life-like sounds than FM synthesis. Wavetable music synthesizers store digitally sampled audio data in digital memory. Thus, in wavetable synthesis, samples of actual audio are used to create sounds, as opposed to synthesizing sine waves in FM synthesis. Typically, wavetable synthesizers do not store a sample of each note which the instrument is capable of playing. Rather, to minimize the memory requirement, wavetable synthesizers typically store samples of a few representative notes of the instrument. For example, a wavetable music synthesizer might store eight of the eighty-eight possible notes of a piano. Wavetable synthesizers then retrieve one of these stored data samples, shift the pitch of the sampled data to the desired new pitch, and then perform digital-to-analog conversion on the new data so that an analog device such as a speaker or headphone can reproduce the original sound. Often many audio sources, also known as voices, are sampled and stored in memory. Examples of such voices are musical instruments and human voices. A collection of samples of one or more voices is commonly referred to as wavetable data.
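The pitch-shifting step described above can be illustrated with a minimal sketch. It assumes linear interpolation between stored samples and an equal-tempered scale; neither is specified in the text, and the names pitch_shift and semitone_ratio are hypothetical.

```python
def semitone_ratio(semitones):
    """Frequency ratio for a shift of `semitones`.

    Assumption: equal temperament (12 semitones per octave).
    """
    return 2.0 ** (semitones / 12.0)


def pitch_shift(samples, ratio):
    """Resample `samples` by stepping through them at `ratio` per output sample.

    ratio > 1 reads the stored note faster (higher pitch, shorter output);
    ratio < 1 reads it slower (lower pitch, longer output).
    Assumption: linear interpolation between neighboring samples.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    out = []
    last = len(samples) - 1
    pos = 0.0
    while pos <= last:
        i = int(pos)
        frac = pos - i
        # At the final sample there is no neighbor, so reuse the sample itself.
        nxt = samples[i + 1] if i < last else samples[i]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
        pos += ratio
    return out
```

For instance, a stored note could be raised by three semitones with pitch_shift(stored_note, semitone_ratio(3)), which is how a few representative notes might cover the notes in between.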
The quality of the music generated in either of the manners described above can often be improved when some of the voices are processed with delay-based audio effects. Examples of delay-based audio effects are echo, reverb, chorus, and flange. The echo effect imitates the delayed version of a sound that results from reflection from a large object. The reverb effect imitates the many delayed and distorted versions of a sound that result from many echoes bouncing back and forth in a small enclosed space of high acoustic reflectivity. The chorus effect imitates the not-quite-simultaneous repeated versions of a sound that result from many sound sources acting in concert. The flange effect imitates the slow decay of a sound that results from a sound propagating in multiple paths from the source to the listener.
To create these effects it is necessary to provide a method for producing delayed versions of the audio output. The conventional method for doing this is to store the audio samples in a queue in memory. The queue then functions as a time-delay element to provide a time-delay data stream.
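A queue used as a time-delay element can be sketched as follows. This is an illustration under stated assumptions, not the implementation of any particular synthesizer: the class DelayLine, the function echo, and the gain parameter are hypothetical names, and the delay is expressed as a whole number of samples.

```python
from collections import deque


class DelayLine:
    """Fixed-length queue: each sample written comes back `delay_samples` later."""

    def __init__(self, delay_samples):
        if delay_samples < 1:
            raise ValueError("delay_samples must be at least 1")
        # Pre-fill with silence so the first outputs are zero.
        self._queue = deque([0.0] * delay_samples, maxlen=delay_samples)

    def process(self, sample):
        delayed = self._queue[0]      # oldest sample in the queue
        self._queue.append(sample)    # maxlen discards the oldest entry
        return delayed


def echo(samples, delay_samples, gain=0.5):
    """Mix each input sample with an attenuated, delayed copy of the input."""
    line = DelayLine(delay_samples)
    return [x + gain * line.process(x) for x in samples]
```

Feeding a single impulse through echo with a delay of two samples yields the impulse followed, two samples later, by a copy at half amplitude. The memory needed by the queue grows with the delay length, which is why the storage location of the queue matters.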
The cost of having a dedicated memory to store time-delayed data samples could be eliminated by using the personal computer's system memory to store time-delay data. Applicant is aware of various unified memory architectures which attempt to store video data in the main or system memory. The following U.S. Patent applications disclose a system for using system memory for storing wavetable data:
U.S. patent application Ser. No. 08/621,397, filed Mar. 25, 1996, and titled "Computer system and method for performing wavetable music synthesis which stores wavetable data in system memory"
U.S. patent application Ser. No. 08/623,850, filed Mar. 25, 1996, and titled "Computer system and method for generating delay based audio effects in a wavetable music synthesizer which stores wavetable data in system memory"
U.S. patent application Ser. No. 08/622,471, filed Mar. 25, 1996, and titled "Computer system and method for performing wavetable music synthesis which stores wavetable data in system memory employing a high priority I/O bus request mechanism for improved audio fidelity"
U.S. patent application Ser. No. 08/622,761, filed Mar. 25, 1996, and titled "Computer system and method for performing wavetable music synthesis which stores wavetable data in system memory which minimizes audio infidelity due to wavetable data access latency"
This approach has generally had performance penalties, however, because of bandwidth and bus mastering issues associated with the system bus.
In the past, personal computer system I/O buses have not provided enough bandwidth for a unified memory architecture implementation. A typical Industry Standard Architecture (ISA) bus implementation, for example, is only capable of sustaining a bandwidth of a few megabytes per second. With the advent of the Peripheral Component Interconnect (PCI) bus, this problem has been substantially reduced in that PCI bus implementations are capable of sustaining on the order of 100 MB/second.
The PCI bus, however, introduces some additional problems. Specifically, PCI is tied very closely with the PC's CPU. As a result the PCI bus has been optimized around the burst nature of refilling the CPU's cache memory. Further, the latency involved in gaining control of the PCI bus once a request for bus mastership is generated is both significant and indeterminate. PCI bus master latency is typically 2-3 microseconds, often 20-30 microseconds, and delays as long as 100-200 microseconds are possible. Thus the PCI bus is not ideal for isochronous or real-time transfers.
A typical sound DSP can have multiple voices active simultaneously. The number of simultaneous active voices is referred to as the polyphony of the DSP. A sound DSP operates as a Digital Signal Processor (DSP) system, and as such has an associated sample rate hereinafter called the frame rate, which we will assume is 44,100 frames per second. During each frame time, which is the reciprocal of the frame rate (22.7 microseconds at a frame rate of 44,100 frames per second), the DSP must calculate a new output value for each of the active voices (up to 32 in our example). Assuming the polyphony is 32, this implies that the DSP hardware must process up to 44,100 × 32 = 1,411,200 voice outputs per second. The data samples are typically one byte or two bytes wide.
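The figures above can be checked with a short calculation. The sketch below uses the frame rate and polyphony given in the text. The bandwidth estimate assumes exactly one sample fetched per voice per frame, and the function names are hypothetical; both are illustrative assumptions rather than statements about any particular hardware.

```python
import math

FRAME_RATE = 44_100   # frames per second, as assumed in the text
POLYPHONY = 32        # simultaneous voices, as assumed in the text

# Frame time is the reciprocal of the frame rate, here in microseconds.
frame_time_us = 1_000_000 / FRAME_RATE

# One output per active voice per frame.
voice_outputs_per_second = FRAME_RATE * POLYPHONY


def sample_bandwidth_bytes_per_second(bytes_per_sample):
    """Sample traffic, assuming one sample fetched per voice per frame."""
    return voice_outputs_per_second * bytes_per_sample


def frames_spanned(latency_us):
    """Number of whole frame times that a bus latency of `latency_us` covers."""
    return math.ceil(latency_us / frame_time_us)


print(round(frame_time_us, 1))
print(voice_outputs_per_second)
print(sample_bandwidth_bytes_per_second(2))
print(frames_spanned(200))
```

Under these assumptions, a bus latency of 200 microseconds spans several frame times, which illustrates why indeterminate latency is a concern for real-time sample delivery.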
When performing digital-to-analog (D/A) conversion on sampled audio data, the data samples are supplied to a D/A converter. Each data sample has an associated arithmetic value which is supplied to the D/A converter. A ramp rate, or slope, exists between the arithmetic values of any two consecutive samples. Audible artifacts, such as a "pop" from the speaker or other audio output device, are heard in the reproduced sound if two consecutive samples of audio data are supplied to the D/A converter which have a slope beyond a maximum value. These audible artifacts are commonly referred to as "zipper noise".
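The slope condition can be expressed as a simple check on consecutive sample values. In this sketch the function name find_excessive_slopes and the threshold max_slope are hypothetical; the text does not give a numeric maximum, so the caller must supply one.

```python
def find_excessive_slopes(samples, max_slope):
    """Return the indices at which the step from the previous sample exceeds max_slope.

    `max_slope` is a caller-supplied threshold in the same units as the
    sample values; the source text does not specify a value.
    """
    return [
        i
        for i in range(1, len(samples))
        if abs(samples[i] - samples[i - 1]) > max_slope
    ]
```

For example, with the samples 0, 10, 500, 510 and a threshold of 100, only the jump from 10 to 500 is flagged, at index 2.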
When D/A converters are not supplied with a sample value at their clock edge, i.e., not supplied at the required sample rate, in this case at or above the Nyquist rate, the D/A converter can interpret the value as either the minimum or maximum arithmetic value receivable by the D/A converter. Hence, if samples are not supplied to an audio D/A converter on time, there exists a high probability of creating unwanted pops and clicks.