A “video track”, as used herein, refers to an ordered sequence of visual events represented by any time based visual media, where each such event (hereinafter, “video” event) can be specified by a timing offset from a video start time. A video event can constitute any moment deemed to be visually significant.
An “audio track”, as used herein, refers to an ordered sequence of audible events represented by any time based audible media, where each such event (hereinafter, “audio” event) can be specified by a timing offset from an audio start time. An audio event can constitute any moment deemed to be audibly significant.
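By way of illustration only, the two definitions above can be modeled as a simple data structure: a track holds a start time and an ordered list of events, each identified by its offset from that start time. The names `Event`, `Track`, and `absolute_times` are hypothetical and are introduced here solely for the sketch; offsets are assumed to be expressed in seconds.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    """A significant moment, located by its offset from the track start."""
    offset: float  # assumed unit: seconds from the track's start time
    label: str = ""


@dataclass
class Track:
    """An ordered sequence of events sharing one start time."""
    start_time: float = 0.0
    events: list = field(default_factory=list)

    def __post_init__(self):
        # Keep the sequence ordered by offset, as the definition requires.
        self.events = sorted(self.events, key=lambda e: e.offset)

    def absolute_times(self):
        """Return each event's time on a common timeline."""
        return [self.start_time + e.offset for e in self.events]


# Hypothetical video track starting at 10 s with two visual events.
video = Track(start_time=10.0, events=[Event(4.0, "logo"), Event(1.5, "cut")])
print(video.absolute_times())
```

The same structure serves for both video and audio tracks, since each is defined only by a start time and a set of offsets.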
It is often desirable to produce an audio track, e.g., music, to accompany a video track, e.g., a TV commercial or full-length film. When bringing video and audio together, the significant events in the respective tracks must be well synchronized to achieve a satisfactory result.
When composing original music specifically for a video track, it is common practice to compile a list of timing offsets associated with important video events, and for the composer to use the list to create music containing correspondingly offset music events. Composing original music to accompany a video is costly and time consuming, so it has become common to instead manipulate preexisting, i.e., prerecorded, music to synchronize with a video track. The selection of appropriate prerecorded music is a critical step in the overall success of joining video and audio tracks. The genre, tempo, rhythmic character and many other musical characteristics are important when selecting music. Beyond the initial selection, however, the difficulty of using prerecorded music is that its audio/music events will rarely align with the video events in the video track. Accordingly, a skilled human music editor is typically employed to select suitable music for the video and then to edit the prerecorded music using a computer/workstation. Such editing typically involves interactively shifting music events in time, generally by removing selected music portions to cause desired music events to occur sooner or by adding music portions to cause desired music events to occur later. Multiple iterative edits may be required to alter the prerecorded music to sufficiently synchronize it to the video track, and great skill and care are required to ensure that the music remains aesthetically pleasing to a listener. Various software applications (e.g., Avid Pro Tools, Apple Soundtrack, SmartSound Sonicfire Pro, Sony Vegas, Sync Audio Studios Musicbed) have been released to facilitate the editing of prerecorded music. Such applications generally provide a user interface offering the user a means to visualize the timing relationship between a video track and a proposed audio track while providing tools to move or transform items in the audio tracks.
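The effect of such edits on event timing can be sketched in a few lines. The following is an illustrative sketch only, not the behavior of any of the applications named above: it assumes music events are represented as a list of offsets in seconds, and the function names `remove_portion` and `insert_portion` are hypothetical.

```python
def remove_portion(offsets, cut_start, cut_end):
    """Remove the music between cut_start and cut_end (seconds).

    Events after the cut occur sooner by the length of the cut;
    events inside the removed portion are dropped.
    """
    length = cut_end - cut_start
    shifted = []
    for t in offsets:
        if t < cut_start:
            shifted.append(t)           # before the cut: unchanged
        elif t >= cut_end:
            shifted.append(t - length)  # after the cut: occurs sooner
    return shifted


def insert_portion(offsets, at, length):
    """Insert `length` seconds of music at offset `at`.

    Events at or after the insertion point occur later by `length`.
    """
    return [t + length if t >= at else t for t in offsets]


# Hypothetical music events at 2 s, 6 s and 12 s.
music = [2.0, 6.0, 12.0]
print(remove_portion(music, 3.0, 5.0))  # later events move 2 s earlier
print(insert_portion(music, 3.0, 1.5))  # later events move 1.5 s later
```

The sketch captures only the timing arithmetic. It says nothing about whether the resulting splice is musically acceptable, which is where the editor's skill is required.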
The standard approach is for the editor to repeatedly listen to the source music to become acquainted with its form while also listening for musical events that can be utilized to effectively enhance the video events in the video track. The process is largely one of trial and error, using a “razor blade” tool to cut the music into sections and subsequently sliding the sections backwards or forwards to test the effectiveness of each section at that timing. Once a rough arrangement of sections is determined, additional manual trimming and auditioning of the sections is generally required to make the sections fit together in a continuous stream of music. The outlined manual process is very labor intensive and requires professional skill to yield a musically acceptable soundtrack.
An alternative method utilized by a few software applications involves adjusting the duration of a musical composition, or of a user-defined subsection, by increasing or decreasing the rate (i.e., tempo, beats per minute) at which the media is played. If the tempo is increased or decreased by a uniform amount for the entire musical composition, the time at which a single music event occurs can be adjusted relative to the beginning of the music, but it is mathematically unlikely that multiple music events will align with multiple video events via a single tempo adjustment. Additionally, only small timing adjustments are practical if degradation of the recorded music is to be avoided.
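The limitation of a single tempo adjustment can be illustrated numerically. In the sketch below, a uniform tempo change is modeled as multiplying every music offset by one stretch factor (new duration divided by original duration); the function name and the example offsets are hypothetical. Choosing the factor so that one music event lands on its video event fixes the position of every other music event, so all events align only if every video offset stands in the same ratio to its music offset.

```python
def residuals_after_uniform_tempo(music, video, anchor=0):
    """Stretch all music offsets so music[anchor] lands on video[anchor].

    Returns the stretch factor and, for each event pair, how far the
    stretched music event falls from its video event (seconds).
    """
    stretch = video[anchor] / music[anchor]  # > 1 means a slower tempo
    stretched = [m * stretch for m in music]
    return stretch, [s - v for s, v in zip(stretched, video)]


# Hypothetical offsets in seconds from the respective start times.
music_events = [4.0, 10.0]
video_events = [5.0, 9.0]

stretch, errors = residuals_after_uniform_tempo(music_events, video_events)
print(stretch)  # factor that aligns the first pair of events
print(errors)   # first pair aligned; second pair misses by several seconds
```

In this example the first pair calls for slowing the music while the second pair calls for speeding it up, so no single factor satisfies both.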