Multimedia programs present data to a user through both audio and video events while a user interacts with a program via a keyboard, joystick, or other interactive input device. A user associates elements and occurrences of a video presentation with the associated audio representation. A common implementation is to associate audio with movement of characters or objects in a video game. When a new character or object appears, the audio associated with that entity is incorporated into the overall presentation for a more dynamic representation of the video presentation.
Audio representation is an essential component of electronic and multimedia products such as computer based and stand-alone video games, computer-based slide show presentations, computer animation, and other similar products and applications. As a result, audio generating devices and components are integrated with electronic and multimedia products for composing and providing graphically associated audio representations. These audio representations can be dynamically generated and varied in response to various input parameters, real-time events, and conditions. Thus, a user can experience the sensation of live audio or musical accompaniment with a multimedia experience.
Conventionally, computer audio is produced in one of two fundamentally different ways. One way is to reproduce an audio waveform from a digital sample of an audio source which is typically stored in a wave file (i.e., a .wav file). A digital sample can reproduce any sound, and the output is very similar on all sound cards, or similar computer audio rendering devices. However, a file of digital samples consumes a substantial amount of memory and resources for streaming the audio content. As a result, the variety of audio samples that can be provided using this approach is limited. Another disadvantage of this approach is that the stored digital samples cannot be easily varied.
Another way to produce computer audio is to synthesize musical instrument sounds, typically in response to instructions in a Musical Instrument Digital Interface (MIDI) file. MIDI is a protocol for recording and playing back music and audio on digital synthesizers incorporated with computer sound cards. Rather than representing musical sound directly, MIDI transmits information and instructions about how music is produced. The MIDI command set includes note-on, note-off, key velocity, pitch bend, and other methods of controlling a synthesizer.
The audio sound waves produced with a synthesizer are those already stored in a wavetable in the receiving instrument or sound card. A wavetable is a table of stored sound waves that are digitized samples of actual recorded sound. A wavetable can be stored in read-only memory (ROM) on a sound card chip, or provided with software. Prestoring sound waveforms in a lookup table improves rendered audio quality and throughput. An advantage of MIDI files is that they are compact and require few audio streaming resources, but the output is limited to the number of instruments available in the designated General MIDI set and in the synthesizer, and may sound very different on different computer systems.
MIDI instructions sent from one device to another indicate actions to be taken by the controlled device, such as identifying a musical instrument (e.g., piano, flute, drums, etc.) for music generation, turning on a note, and/or altering a parameter in order to generate or control a sound. In this way, MIDI instructions control the generation of sound by remote instruments without the MIDI control instructions carrying sound or digitized information. A MIDI sequencer stores, edits, and coordinates the MIDI information and instructions. A synthesizer connected to a sequencer generates audio based on the MIDI information and instructions received from the sequencer. Many sounds and sound effects are a combination of multiple simple sounds generated in response to the MIDI instructions.
A MIDI system allows audio and music to be represented with only a few digital samples rather than converting an analog signal to many digital samples. The MIDI standard supports different channels that can each simultaneously provide an output of audio sound wave data. There are sixteen defined MIDI channels, meaning that no more than sixteen instruments can be playing at one time. Typically, the command input for each channel represents the notes corresponding to an instrument. However, MIDI instructions can program a channel to be a particular instrument. Once programmed, the note instructions for a channel will be played or recorded as the instrument for which the channel has been programmed. During a particular piece of music, a channel can be dynamically reprogrammed to be a different instrument.
A Downloadable Sounds (DLS) standard published by the MIDI Manufacturers Association allows wavetable synthesis to be based on digital samples of audio content provided at run time rather than stored in memory. The data describing an instrument can be downloaded to a synthesizer and then played like any other MIDI instrument. Because DLS data can be distributed as part of an application, developers can be sure that the audio content will be delivered uniformly on all computer systems. Moreover, developers are not limited in their choice of instruments.
A DLS instrument is created from one or more digital samples, typically representing single pitches, which are then modified by a synthesizer to create other pitches. Multiple samples are used to make an instrument sound realistic over a wide range of pitches. DLS instruments respond to MIDI instructions and commands just like other MIDI instruments. However, a DLS instrument does not have to belong to the General MIDI set or represent a musical instrument at all. Any sound, such as a fragment of speech or a fully composed measure of music, can be associated with a DLS instrument.
Conventional Audio and Music System
FIG. 1 illustrates a conventional audio and music generation system 100 that includes a synthesizer 102 and two MIDI inputs 104 and 106. Typically, a synthesizer is implemented in computer software, in hardware as part of a computer's internal sound card, or as an external device such as a MIDI keyboard or module. Conventionally, a synthesizer 102 receives MIDI inputs on sixteen channels 108(1–16) that conform to the MIDI standard. The inputs are in the form of individual instructions, each of which designates the channel to which it applies. Within the synthesizer 102, instructions associated with different channels are processed in different ways, depending on the programming for the various channels.
A MIDI instruction, such as a “note-on”, directs a synthesizer 102 to play a particular note, or notes, on a synthesizer channel 108 having a designated instrument. The General MIDI standard defines standard sounds that can be combined and mapped into the sixteen separate instrument and sound channels. A MIDI event on a synthesizer channel corresponds to a particular sound and can represent a keyboard key stroke, for example. The “note-on” MIDI instruction can be generated with a keyboard when a key is pressed and the “note-on” instruction is sent to synthesizer 102. When the key on the keyboard is released, a corresponding “note-off” instruction is sent to stop the generation of the sound corresponding to the keyboard key.
A MIDI input is typically a serial data stream that is parsed in the synthesizer into MIDI commands and synthesizer control information. A MIDI command or instruction is represented as a data structure containing information about the sound effect or music piece such as the pitch, relative volume, duration, and the like. The output of a synthesizer channel 108 is a sound waveform that is mixed and input to a buffer (not shown). A buffer in this instance is typically an allocated area of memory that temporarily holds sequential samples of audio wave data that will be subsequently delivered to a sound card or similar audio rendering device to produce audible sound.
The MIDI input 104 has a sound effect instruction 110 to generate a dog bark sound on MIDI channel 1 in synthesizer 102. The MIDI input 106 is a music piece having instructions 112(1–3) to generate musical instrument sounds. Instruction 112(1) designates that a flute sound be generated on MIDI channel 1, instruction 112(2) designates that a horn sound be generated on MIDI channel 2, and instruction 112(3) designates that drums be generated on MIDI channel 10 in synthesizer 102.
The MIDI channel assignments are designated when the MIDI inputs 104 and 106 are authored, or created. The limited number of available MIDI channels in a synthesizer results in the problem of one input overriding and canceling out another input, or the problem of overlapping content when playing back the designated sounds of MIDI inputs. For example, channel 108(1) in synthesizer 102 receives two inputs at the same time—instruction 110 to generate the dog bark sound effect and instruction 112(1) to generate a flute sound. The synthesizer 102 might first receive the flute instruction 112(1), then the dog bark instruction 110 which overrides the first input, and then an associated flute instruction to play a particular note. The undesirable output is a flute note that is played as a dog bark.
A conventional software synthesizer that translates MIDI instructions into audio signals does not support distinctly separate sets of MIDI channels. The number of sounds that can be played simultaneously is limited by the number of channels and resources available in the synthesizer. In the event that there are more MIDI inputs than there are available channels and resources, one or more inputs are suppressed by the synthesizer.
Another problem with having only a limited number of synthesizer channels is that content intended to be played multiple times for the same sound effect, such as two dogs barking, cannot be faded one over the other. A pre-authored dog bark sound effect is assigned to a designated MIDI channel, as with input 104, for example. Rather than being able to play two distinct dog barks from the same source at or near the same time, two instances of the sound effect will be input to synthesizer channel 1 and only one dog bark will be rendered as audible sound at one time. This also precludes initiating two of the same sound effect and fading one over the other.