Multimedia programs present data to a user through both audio and video events while a user interacts with a program via a keyboard, joystick, or other interactive input device. A user associates elements and occurrences of a video presentation with the associated audio representation. A common implementation is to associate audio with movement of characters or objects in a video game. When a new character or object appears, the audio associated with that entity is incorporated into the overall presentation for a more dynamic representation of the video presentation.
Audio representation is an essential component of electronic and multimedia products such as computer-based and stand-alone video games, computer-based slide show presentations, computer animation, and other similar products and applications. As a result, audio generating devices and components are integrated with electronic and multimedia products for composing and providing graphically associated audio representations. These audio representations can be dynamically generated and varied in response to various input parameters, real-time events, and conditions. Thus, a user can experience the sensation of live audio or musical accompaniment with a multimedia experience.
Conventionally, computer audio is produced in one of two fundamentally different ways. One way is to reproduce an audio waveform from a digital sample of an audio source, which is typically stored in a wave file (i.e., a .wav file). A digital sample can reproduce any sound, and the output is very similar on all sound cards or similar computer audio rendering devices. However, a file of digital samples consumes a substantial amount of memory and resources for streaming the audio content. As a result, the variety of audio samples that can be provided using this approach is limited. Another disadvantage of this approach is that the stored digital samples cannot be easily varied.
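The memory cost of digital samples can be estimated with simple arithmetic. The sketch below assumes uncompressed PCM data; the default sample rate, bit depth, and channel count are illustrative assumptions, not values taken from the description above, and the function name is hypothetical.

```python
def pcm_bytes(seconds, sample_rate=44_100, bits_per_sample=16, channels=2):
    """Size in bytes of uncompressed PCM audio data (file header excluded).

    The defaults (44.1 kHz, 16-bit, stereo) are assumed for illustration.
    """
    bytes_per_sample = bits_per_sample // 8
    return seconds * sample_rate * bytes_per_sample * channels


# One minute of audio at the assumed settings:
print(pcm_bytes(60))  # 10584000 bytes, roughly 10 MB
```

Under these assumptions, each additional minute of sampled audio costs roughly ten megabytes, which is why the variety of samples that can be shipped this way is limited.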
Another way to produce computer audio is to synthesize musical instrument sounds, typically in response to instructions in a Musical Instrument Digital Interface (MIDI) file. MIDI is a protocol for recording and playing back music and audio on digital synthesizers incorporated with computer sound cards. Rather than representing musical sound directly, MIDI transmits information and instructions about how music is produced. The MIDI command set includes note-on, note-off, key velocity, pitch bend, and other methods of controlling a synthesizer.
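To make the idea of instructions rather than sound concrete, the following minimal sketch encodes three of the commands named above as MIDI 1.0 channel voice messages. The status bytes (0x80, 0x90, 0xE0) come from the MIDI specification; the helper function names are hypothetical and chosen only for this illustration.

```python
NOTE_OFF = 0x80    # status nibble for note-off
NOTE_ON = 0x90     # status nibble for note-on
PITCH_BEND = 0xE0  # status nibble for pitch bend


def _status(kind, channel):
    """Combine a status nibble with a zero-based channel number (0..15)."""
    if not 0 <= channel <= 15:
        raise ValueError("MIDI channel must be in 0..15")
    return kind | channel


def _data(value):
    """Validate a 7-bit data byte (0..127)."""
    if not 0 <= value <= 127:
        raise ValueError("MIDI data byte must be in 0..127")
    return value


def note_on(channel, note, velocity):
    """Three-byte note-on message: status, note number, key velocity."""
    return bytes([_status(NOTE_ON, channel), _data(note), _data(velocity)])


def note_off(channel, note, velocity=0):
    """Three-byte note-off message for the same note number."""
    return bytes([_status(NOTE_OFF, channel), _data(note), _data(velocity)])


def pitch_bend(channel, value):
    """Pitch bend carries a 14-bit value (8192 = no bend), low 7 bits first."""
    if not 0 <= value <= 0x3FFF:
        raise ValueError("pitch bend value must be in 0..16383")
    return bytes([_status(PITCH_BEND, channel), value & 0x7F, value >> 7])


print(note_on(0, 60, 100).hex())   # 903c64
print(pitch_bend(0, 8192).hex())   # e00040
```

Each message is only three bytes long, which illustrates how little data is needed to describe a musical action compared with the sampled sound itself.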
The audio sound waves produced with a synthesizer are those already stored in a wavetable in the receiving instrument or sound card. A wavetable is a table of stored sound waves that are digitized samples of actual recorded sound. A wavetable can be stored in read-only memory (ROM) on a sound card chip, or provided with software. Prestoring sound waveforms in a lookup table improves rendered audio quality and throughput. An advantage of MIDI files is that they are compact and require few audio streaming resources, but the output is limited to the number of instruments available in the designated General MIDI set and in the synthesizer, and may sound very different on different computer systems.
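The lookup-table idea can be sketched as a simple wavetable oscillator. This is only an illustration under stated assumptions: a computed sine cycle stands in for the digitized recording a real wavetable would hold, and the table size, function name, and linear interpolation are choices made here, not details from the description above.

```python
import math

TABLE_SIZE = 256
# Assumption: one cycle of a sine wave stands in for a stored recorded sample.
SINE_TABLE = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]


def render(table, frequency, sample_rate, num_samples):
    """Read a single-cycle table at the requested frequency.

    The read position advances by a fixed step per output sample and wraps
    around the table; values between entries are linearly interpolated.
    """
    size = len(table)
    step = frequency * size / sample_rate
    phase = 0.0
    out = []
    for _ in range(num_samples):
        index = int(phase)
        frac = phase - index
        a = table[index]
        b = table[(index + 1) % size]
        out.append(a + (b - a) * frac)
        phase = (phase + step) % size
    return out


samples = render(SINE_TABLE, 440.0, 44_100, 100)
```

Because the waveform is already stored, producing each output sample requires only a table read and a small amount of arithmetic.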
MIDI instructions sent from one device to another indicate actions to be taken by the controlled device, such as identifying a musical instrument (e.g., piano, flute, drums, etc.) for music generation, turning on a note, and/or altering a parameter in order to generate or control a sound. In this way, MIDI instructions control the generation of sound by remote instruments without the MIDI control instructions carrying sound or digitized information. A MIDI sequencer stores, edits, and coordinates the MIDI information and instructions. A synthesizer connected to a sequencer generates audio based on the MIDI information and instructions received from the sequencer. Many sounds and sound effects are a combination of multiple simple sounds generated in response to the MIDI instructions.
A MIDI system allows audio and music to be represented with only a few digital samples rather than converting an analog signal to many digital samples. The MIDI standard supports different channels that can each simultaneously provide an output of audio sound wave data. There are sixteen defined MIDI channels, meaning that no more than sixteen instruments can be playing at one time. Typically, the command input for each channel represents the notes corresponding to an instrument. However, MIDI instructions can program a channel to be a particular instrument. Once programmed, the note instructions for a channel will be played or recorded as the instrument for which the channel has been programmed. During a particular piece of music, a channel can be dynamically reprogrammed to be a different instrument.
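Channel programming can be illustrated with a small sketch that tracks which instrument each of the sixteen channels is currently set to. The program-change status byte (0xC0) is from the MIDI specification; the `ChannelBank` class is hypothetical, and the instrument numbers in the usage lines follow zero-based General MIDI numbering as an assumption.

```python
class ChannelBank:
    """Tracks the instrument (program) assigned to each MIDI channel."""

    NUM_CHANNELS = 16
    PROGRAM_CHANGE = 0xC0  # status nibble for program change

    def __init__(self):
        # Every channel starts on program 0.
        self.programs = [0] * self.NUM_CHANNELS

    def program_change(self, channel, program):
        """Reprogram a channel and return the two-byte MIDI message."""
        if not 0 <= channel < self.NUM_CHANNELS:
            raise ValueError("only sixteen channels (0..15) are defined")
        if not 0 <= program <= 127:
            raise ValueError("program number must be in 0..127")
        self.programs[channel] = program
        return bytes([self.PROGRAM_CHANGE | channel, program])


bank = ChannelBank()
bank.program_change(0, 0)    # assumed: Acoustic Grand Piano in General MIDI
bank.program_change(0, 73)   # assumed: Flute; channel 0 is reprogrammed mid-piece
```

Note instructions sent on channel 0 after the second call would be played as the newly assigned instrument.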
The Downloadable Sounds (DLS) standard published by the MIDI Manufacturers Association allows wavetable synthesis to be based on digital samples of audio content provided at run time rather than stored in memory. The data describing an instrument can be downloaded to a synthesizer and then played like any other MIDI instrument. Because DLS data can be distributed as part of an application, developers can be sure that the audio content will be delivered uniformly on all computer systems. Moreover, developers are not limited in their choice of instruments.
A DLS instrument is created from one or more digital samples, typically representing single pitches, which are then modified by a synthesizer to create other pitches. Multiple samples are used to make an instrument sound realistic over a wide range of pitches. DLS instruments respond to MIDI instructions and commands just like other MIDI instruments. However, a DLS instrument does not have to belong to the General MIDI set or represent a musical instrument at all. Any sound, such as a fragment of speech or a fully composed measure of music, can be associated with a DLS instrument.
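The way a synthesizer derives other pitches from a single-pitch sample can be sketched with the equal-temperament relation, in which each semitone corresponds to a playback-rate factor of 2^(1/12). The code below is a simplification under stated assumptions: the function names are hypothetical, and choosing the nearest recorded pitch stands in for the explicit key ranges a real DLS instrument would define.

```python
def playback_ratio(target_note, root_note):
    """Rate at which a sample recorded at root_note must be played
    to sound at target_note (notes are MIDI note numbers)."""
    return 2.0 ** ((target_note - root_note) / 12.0)


def pick_sample(samples, target_note):
    """Choose the sample whose recorded pitch is closest to the target.

    `samples` maps a root MIDI note number to a sample name. Returns the
    chosen root note and the playback ratio needed to reach the target.
    """
    root = min(samples, key=lambda r: abs(r - target_note))
    return root, playback_ratio(target_note, root)


# Two hypothetical samples, recorded at C3 (48) and C5 (72).
instrument = {48: "low_sample", 72: "high_sample"}
root, ratio = pick_sample(instrument, 70)
print(root)                     # 72
print(playback_ratio(72, 60))   # 2.0 (one octave up doubles the rate)
```

Using several samples keeps each ratio close to 1.0, which is why multiple samples make an instrument sound more realistic over a wide range of pitches.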
Conventional Audio and Music System
FIG. 1 illustrates a conventional audio and music generation system 100. The audio system 100 includes two discrete components, DirectMusic® 102 and DirectSound® 104. DirectMusic® and DirectSound® are application programming interfaces (APIs) available from Microsoft Corporation, Redmond, Wash. DirectSound® plays prerecorded digital samples, typically from wave files, and DirectMusic® plays synthesized audio in response to MIDI files or preauthored musical segments.
The audio system 100 includes a synthesizer 106 having a synthesizer channel 108. Typically, a synthesizer is implemented in computer software, in hardware as part of a computer's internal sound card, or as an external device such as a MIDI keyboard or module. The synthesizer channel 108 is an audio data or communications path that represents a destination for a MIDI instruction. The channel 108 has a left and a right audio data output, and a reverb audio data output. The reverb output is input to a reverb component 110, and the left and right audio data outputs are input to a left input component 112 and a right input component 114, respectively. The output of the reverb 110 is a stereo pair whose left and right signals are likewise input to the left and right input components 112 and 114, respectively. The synthesizer output 116 is a stereo pair that is input to a mixing component 118.
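The routing described for FIG. 1 can be sketched as follows. This is an illustrative model only: the function names are hypothetical, signals are represented as plain lists of floats, and the "reverb" is a trivial stand-in (an attenuated copy) rather than an actual reverberation algorithm.

```python
def toy_reverb(mono):
    """Hypothetical stand-in for reverb component 110: returns a stereo
    pair, each side an attenuated copy of the mono reverb send."""
    wet = [0.5 * s for s in mono]
    return wet, list(wet)


def render_channel(left, right, reverb_send, reverb):
    """Combine a channel's dry left/right outputs with the reverb's
    stereo output, yielding the single stereo pair (output 116)."""
    wet_left, wet_right = reverb(reverb_send)
    out_left = [dry + wet for dry, wet in zip(left, wet_left)]    # input component 112
    out_right = [dry + wet for dry, wet in zip(right, wet_right)]  # input component 114
    return out_left, out_right


stereo = render_channel([1.0, 0.0], [0.0, 1.0], [0.5, 0.5], toy_reverb)
print(stereo)  # ([1.25, 0.25], [0.25, 1.25])
```

Whatever happens inside the channel, everything is folded into one left signal and one right signal before it leaves the synthesizer.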
A MIDI instruction, such as a “note-on”, directs the synthesizer 106 to play a particular note, or notes, on a synthesizer channel 108 having a designated instrument. The General MIDI standard defines standard sounds that can be combined and mapped into the sixteen separate instrument and sound channels. A MIDI event on a synthesizer channel corresponds to a particular sound and can represent a keyboard key stroke, for example. The “note-on” MIDI instruction can be generated with a keyboard when a key is pressed, and the “note-on” instruction is then sent to the synthesizer 106. When the key on the keyboard is released, a corresponding “note-off” instruction is sent to stop the generation of the sound corresponding to the keyboard key.
The audio system 100 includes a buffer component 120 that has multiple buffers 122(1 . . . n). The output of the mixing component 118 associated with synthesizer channels 108 is input to one buffer 122(2) in the buffer component 120. A buffer in this instance is typically an allocated area of memory that temporarily holds sequential samples of audio data that will be subsequently delivered to an audio rendering device such as a speaker.
An application program typically communicates with synthesizer 106 via some type of dedicated communication interface, commonly referred to as an API. In the audio system 100, an application program delivers audio content or other music events to the synthesizer 106. The audio content and music events are represented as data structures containing information about the audio content and music events such as pitch, relative volume, duration, and the like. Audio events are message-based data, such as MIDI files or musical segments, authored with an external device.
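A data structure of the kind described above might look like the following sketch. The `NoteEvent` class, its field names, and the `to_instructions` helper are hypothetical and are not part of any actual API; they only illustrate how an event carrying pitch, relative volume, and duration could be expanded into time-ordered note-on and note-off instructions.

```python
from dataclasses import dataclass


@dataclass
class NoteEvent:
    """Hypothetical music event as an application might describe it."""
    start_ms: int     # when the note begins
    channel: int      # destination synthesizer channel (0..15)
    pitch: int        # MIDI note number
    volume: int       # relative volume, used here as key velocity (0..127)
    duration_ms: int  # how long the note sounds


def to_instructions(events):
    """Expand each event into a note-on and a note-off instruction,
    returned as (time_ms, kind, channel, pitch, velocity) tuples
    sorted by time."""
    out = []
    for e in events:
        out.append((e.start_ms, "note-on", e.channel, e.pitch, e.volume))
        out.append((e.start_ms + e.duration_ms, "note-off", e.channel, e.pitch, 0))
    return sorted(out, key=lambda item: item[0])


timeline = to_instructions([
    NoteEvent(start_ms=0, channel=0, pitch=60, volume=100, duration_ms=500),
    NoteEvent(start_ms=250, channel=0, pitch=64, volume=90, duration_ms=500),
])
```

In this example the resulting timeline interleaves the two notes: note-on at 0 ms, note-on at 250 ms, note-off at 500 ms, and note-off at 750 ms.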
Sound effects can be implemented with a synthesizer, but the output is constrained to the stereo pair 116. For music generation, processing audio solely in a synthesizer can be sufficient. In an audio system that supports both music and sound effects, however, a single output pair that is input to one buffer limits the creation and enhancement of sound effects.