1. Field of the Invention
This invention relates to the field of computer software and hardware for generating and manipulating audio data. The invention relates more specifically to a method for storing audio data in a file format that lends itself to advanced audio manipulation.
2. Background Art
Whether it is for developing a video game or producing a movie, creative artists and sound engineers often seek to produce audio effects that match visual scenes, or adjust and synchronize different musical tracks to form a single cohesive composition. With the help of computers, these users are able to generate sounds, compose music or alter existing audio to produce the desired effects. Artists may also exchange audio data or procure libraries of sounds and music loops to expand their creative capabilities.
To alter a sound or music loop, the artist may use software applications and/or electronic music hardware that provide tools for loading audio data and applying one or more computational techniques to produce a desired effect. Audio data are typically stored in a digital format (e.g., Compact Disc and MP3 formats) in the form of numerical values representing audio waveforms sampled at a chosen sample rate, typically at least twice the highest sound frequency audible to humans (around 20,000 Hz); Compact Disc audio, for example, is sampled at 44,100 Hz. Synthesized (i.e., artificially generated) music, on the other hand, can be stored in encoded form, which allows a player to read specific instructions and render the music by synthesizing the encoded notes.
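As an illustration of sampled storage, the following minimal sketch generates the numerical values for a short pure tone and quantizes them to 16-bit integers. The function names (`sample_sine`, `to_pcm16`, `min_sample_rate`) and the 440 Hz test tone are assumptions chosen for this example; they are not part of any format described here.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s, amplitude=1.0):
    """Sample a sine tone at regular intervals of 1 / sample_rate_hz seconds."""
    count = int(round(sample_rate_hz * duration_s))
    return [
        amplitude * math.sin(2.0 * math.pi * freq_hz * i / sample_rate_hz)
        for i in range(count)
    ]

def to_pcm16(samples):
    """Quantize floating-point samples in [-1, 1] to signed 16-bit integers."""
    return [max(-32768, min(32767, int(round(s * 32767)))) for s in samples]

def min_sample_rate(highest_freq_hz):
    """Lowest sample rate that can represent the given highest frequency."""
    return 2.0 * highest_freq_hz

# Hypothetical example: 10 ms of a 440 Hz tone at the Compact Disc rate.
samples = sample_sine(440.0, 44100, 0.01)
pcm = to_pcm16(samples)
```

Every stored value is a snapshot of the finished waveform. Nothing in the list identifies the note, the instrument, or the room in which the sound was produced.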
Existing applications for audio data manipulation (e.g., computer software applications and dedicated hardware such as music synthesizers) provide numerous tools for manipulating audio data. Some applications provide tools for directly manipulating audio waveforms, typically by carrying out complex computations on the audio signal. Such tools include signal analysis algorithms such as frequency-domain spectral analysis (e.g., for the application of digital filters) and amplitude analysis (e.g., for determining transients and segments of interest in a waveform).
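A simple form of amplitude analysis can be sketched as follows. The code computes the root-mean-square level of consecutive frames and reports a transient wherever the level jumps sharply relative to the preceding frame. The frame size, ratio, and noise floor are assumed values for illustration only; practical tools use more elaborate detectors.

```python
import math

def frame_rms(samples, frame_size):
    """Root-mean-square level of each consecutive, non-overlapping frame."""
    levels = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        levels.append(math.sqrt(sum(x * x for x in frame) / frame_size))
    return levels

def find_transients(samples, frame_size=64, ratio=4.0, floor=1e-3):
    """Return sample offsets of frames whose level jumps by at least `ratio`."""
    levels = frame_rms(samples, frame_size)
    onsets = []
    for i in range(1, len(levels)):
        previous = max(levels[i - 1], floor)  # avoid comparing against zero
        if levels[i] > floor and levels[i] > ratio * previous:
            onsets.append(i * frame_size)
    return onsets

# Hypothetical test signal: 256 samples of silence, then a 440 Hz tone at 8 kHz.
signal = [0.0] * 256 + [
    math.sin(2.0 * math.pi * 440.0 * n / 8000.0) for n in range(256)
]
onsets = find_transients(signal)
```

For this test signal the detector reports a single onset at the sample where the tone begins. The analysis operates only on the waveform, so it can locate an event but cannot say which instrument or note produced it.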
While the latter techniques are successful at producing many desired audio effects, they are often limited both in the physical range of manipulation and in the type of manipulation(s) that can be performed. For example, any alteration of the spectrum of an audio track is likely to introduce frequency distortions that, beyond certain limits, become audible and unacceptable to the human ear. Other alterations, such as segment deletion, may also produce undesirable effects. For example, after a music segment is deleted, the reverberation associated with a sound in the deleted segment can still be heard in the segments that follow, even when the junction between the remaining segments is properly corrected and made unnoticeable to the human ear. In other instances, some desired effects may not be possible using the above tools. For example, one may not be able to extract an instrument from a recorded music performance, or apply changes to specific music notes in a performance.
Other existing applications for generating and manipulating audio data store and process music data in an encoded form. Music data encoded in the MIDI (Musical Instrument Digital Interface) format, for example, provides information about note events, the instrument(s) used to make a particular note, and the acoustic playback environment (e.g., the ambiance in which the composer intends for the music to be heard). When the music is played back, the instrument library of the player device is used to generate the musical output. These instrument libraries can vary in performance from device to device, and even from version to version of the same device (e.g., the sound of a synthesized trumpet may vary from system to system). Therefore, a user may play the MIDI encoded music with only a limited fidelity to the original composition.
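The dependence of encoded music on the player's instrument library can be sketched as follows. The event tuples, the `render` function, and the two "voices" are hypothetical stand-ins for a player and its instrument libraries; they are not the MIDI file format itself. Only the note-number-to-frequency relation (note 69 corresponding to 440 Hz in equal temperament) is the standard convention.

```python
import math

def note_to_hz(note):
    """Equal-tempered frequency of a MIDI note number (69 is A at 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)

def sine_voice(freq_hz, t):
    """Hypothetical instrument library A: a pure sine tone."""
    return math.sin(2.0 * math.pi * freq_hz * t)

def square_voice(freq_hz, t):
    """Hypothetical instrument library B: a square wave."""
    return 1.0 if math.sin(2.0 * math.pi * freq_hz * t) >= 0.0 else -1.0

def render(events, voice, sample_rate=8000):
    """Render (note, start_s, duration_s) events using the given voice."""
    total_s = max(start + duration for _, start, duration in events)
    out = [0.0] * int(round(total_s * sample_rate))
    for note, start, duration in events:
        freq = note_to_hz(note)
        first = int(round(start * sample_rate))
        for i in range(int(round(duration * sample_rate))):
            if first + i < len(out):
                out[first + i] += voice(freq, i / sample_rate)
    return out

# The same encoded composition rendered by two different "devices".
events = [(69, 0.0, 0.05), (72, 0.05, 0.05)]
rendered_a = render(events, sine_voice)
rendered_b = render(events, square_voice)
```

Both renderings contain the same notes at the same times, yet the resulting waveforms differ, because the encoded data carries instructions rather than the original sound.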
There is a dichotomy in current applications providing audio handling tools. Users are either forced to work on a pre-constructed signal (e.g., a sampled signal) that offers the original characteristics of the sound, or they are forced to work with encoded compositions without having access to the original sounds (i.e., as the author intended them to be heard). There is a lack of audio file formats that enable applications to play back audio data with the highest fidelity to the original composition, while providing access to the original recorded audio and allowing for manipulation of the audio data.
Therefore, there is a need for an audio format that provides audio data in a way that allows for reconstructing audio effects and manipulating audio data, while preserving the quality of sound of the original composition.