Numerous multi-media editing applications exist today for creating and editing multi-media presentations. Examples of such applications include Final Cut Pro® and iMovie®, both sold by Apple Computer, Inc. Multi-media editing applications allow a user to prepare a multi-media presentation by combining several video clips and audio items. They often allow the user to combine video and audio effects with imported video clips and audio items in order to enhance the multi-media presentation.
These applications often provide the user with several digital signal processing (“DSP”) operations. The user can then select a DSP operation to apply an audio effect to an audio item that the user designates. These applications also provide DSP operations for adjusting the sample rate of an audio item. In this document, DSP operations and DSPs that apply an audio effect to an audio item are called filter operations and filters, while DSP operations and DSPs that adjust the sample rate of an audio item are called sample rate conversion (“SRC”) operations and sample rate converters. Media editing applications at times also provide the user with a mixer that allows the user to specify the volume level and pan (where pan is the audio distribution among different channels) of the audio items.
In media editing applications, it is often desirable to allow the user to process the composed media in real time, since real-time processing allows a user to instantaneously gauge the effect that can be achieved by adjusting certain parameters, and hence is an important part of the creative process. However, real-time processing often becomes more difficult as the user adds more content and specifies more DSP and mixing operations. Often, real-time processing is impossible once the user adds a few tracks and operations, since audio and video processing is quite intensive, and this processing becomes more intensive as DSP and mixing operations are added.
Therefore, there is a need to reduce real-time transactions in order to allow the user to have some real-time processing capabilities. FIG. 1 illustrates one prior art way for reducing the number of real-time transactions. The approach shown in this figure is called sequence level rendering. This figure illustrates the processing 115 of several audio tracks 1 to N. This processing entails retrieving audio data from a source file, performing any necessary sample rate conversion, and applying any necessary filter operations. After this processing for each track, a mixer 105 combines the processed data of all the tracks and assigns the combined data to different audio channels (e.g., left and right for a stereo presentation) while accounting for the appropriate volume and pan values specified through the mixer. The output of the mixer is then played through the speakers 110.
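The mixing stage described above can be sketched as follows. This is a minimal illustration and not the mixer of FIG. 1: the function name `mix_to_stereo`, the representation of a track as a `(samples, volume, pan)` tuple, the pan range of -1.0 (fully left) to 1.0 (fully right), and the linear pan law are all assumptions made for the example. Each track is taken to be already retrieved, sample rate converted, and filtered.

```python
def mix_to_stereo(tracks):
    """Combine processed tracks into left and right channels.

    tracks: list of (samples, volume, pan) tuples (hypothetical layout).
    pan is assumed to range from -1.0 (left) to 1.0 (right).
    """
    # The output is as long as the longest track; shorter tracks add silence.
    length = max((len(samples) for samples, _, _ in tracks), default=0)
    left = [0.0] * length
    right = [0.0] * length
    for samples, volume, pan in tracks:
        # Linear pan law (an assumption): split the volume between channels.
        left_gain = volume * (1.0 - pan) / 2.0
        right_gain = volume * (1.0 + pan) / 2.0
        for i, sample in enumerate(samples):
            left[i] += sample * left_gain
            right[i] += sample * right_gain
    return left, right


if __name__ == "__main__":
    # One track panned fully left, one louder track panned fully right.
    print(mix_to_stereo([([1.0, 1.0], 1.0, -1.0), ([0.5], 2.0, 1.0)]))
```

Because every output sample is a sum over all tracks, the per-sample cost grows with the number of tracks, which is consistent with real-time processing becoming harder as tracks and operations are added.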
As illustrated in FIG. 1, sequence level rendering is a preprocess operation that composites the processing of all the tracks and the mixing into one render file 120. By pre-processing all the operations and mixing of the audio tracks, this operation frees up the CPU from having to do these operations in real-time. Hence, the thought behind this approach is to have the user create one render file once the user is satisfied with the settings and arrangement of certain audio items.
However, this approach has several disadvantages. First, the user cannot modify the DSP operations previously specified for a track without discarding the render file and re-rendering the tracks 1 to N. Such re-rendering is time consuming, defeats the purpose of pre-processing, and detracts from the creative process. Second, the render file produced through sequence level rendering has to be discarded when the user wants to move one of the tracks 1 to N.
Third, the render file 120 has to be discarded even when the user moves all of the pre-processed tracks in unison. This is because video and audio data rates are not integer multiples of each other, which requires the number of audio samples per frame to be uniquely specified based on the position of the frame in the presentation and with respect to the audio data's source file. Sequence level rendering cannot address the issue of the sample count being specifically tailored to the position of the frame in the presentation. Therefore, there is a need in the art for a method for pre-processing individual audio items in a media project in order to improve real-time processing of the media project.
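The dependence of the per-frame sample count on frame position can be illustrated with a short calculation. The rates used here are assumptions chosen for the example and do not come from this document: 48,000 audio samples per second and a video rate of 30000/1001 (about 29.97) frames per second, which gives 1601.6 samples per frame. The function name `samples_in_frame` and the floor-based allocation of whole samples to frames are likewise hypothetical, being one possible scheme among several.

```python
import math
from fractions import Fraction


def samples_in_frame(frame_index, sample_rate=48000, fps=Fraction(30000, 1001)):
    """Whole audio samples assigned to one video frame (illustrative scheme)."""
    # Exact samples per frame; 8008/5 = 1601.6 for the assumed rates.
    samples_per_frame = Fraction(sample_rate) / fps
    # A frame receives the samples between its start and end boundaries.
    start = math.floor(frame_index * samples_per_frame)
    end = math.floor((frame_index + 1) * samples_per_frame)
    return end - start


if __name__ == "__main__":
    print([samples_in_frame(k) for k in range(5)])
```

Under these assumptions the counts for frames 0 through 4 are 1601, 1602, 1601, 1602 and 1602, so audio rendered for frames 0 and 1 (1601, then 1602 samples) no longer lines up if the same material is moved one frame later, where the frames expect 1602, then 1601 samples.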