Various methods and systems for synchronizing and mixing video and audio media are known in the art. For example, to make a music video clip, professional recording studios commonly record different video and sound tracks at different times and then overlay and intercut them to create the final product. Systems of this sort use costly, specialized equipment, under the control of expert operators.
A number of methods have been suggested for simplifying the mixing of video and audio media from different sources. For example, U.S. Patent Application Publication 2010/0211876, whose disclosure is incorporated herein by reference, describes systems and methods for casting calls. A casting call is generated based on information provided by an individual (e.g., a casting call manager). The casting call may indicate a particular video clip and designate a recipient for submissions related to the casting call. A user interested in participating in the casting call may submit a query. In response to the query, the user is provided with access to the video clip for modification. Such a modification may involve incorporating a recording of a performance into the video clip. As a result, a modified video clip may be generated in which the user becomes the “actor.”
As another example, U.S. Patent Application Publication 2005/0042591, whose disclosure is incorporated herein by reference, describes methods and apparatus for use in sound replacement with automatic synchronization to images. Digital audio and video files are created corresponding to selected scenes from a creative production and are provided with a processing system that enables dialog to be selected from a scene and replaced by a user's dialog. The user's dialog is automatically synchronized with the original dialog so as to be in synchronism with lip movements displayed by the accompanying video display. The processing system further includes a graphical user interface that presents the user with the video, the text of the dialog, and cues for rehearsal and recording of replacement dialog by the user. Replay of the user's dialog is accompanied by the video and part of the original audio, except that the original dialog corresponding to the user's dialog is muted so that the user's dialog is heard as a replacement. Singing or other sounds associated with visible action may also be replaced by the same processes.
U.S. Pat. No. 7,821,574, whose disclosure is incorporated herein by reference, describes a method for synchronizing an audio stream with a video stream. This method involves searching in the audio stream for audio data having values that match a distinct set of audio data values and synchronizing the audio stream with the video stream based on the search. In some embodiments, the distinct set of audio data values is defined by a predetermined distinct tone. In other embodiments, the distinct set of audio data values is defined by audio data contained in the video stream.
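By way of illustration only, the general idea of locating a distinct set of audio data values in an audio stream and deriving a synchronization offset from its position may be sketched as follows. This is not the method claimed in the cited patent. It is a minimal sketch that assumes the audio is available as a list of floating-point samples, that the distinct set is a short sine tone, and that the time at which the tone should occur relative to the video is already known. Normalized correlation is used here as the matching criterion purely as an assumption, and the names `make_tone`, `find_marker`, and `audio_offset_seconds` are hypothetical.

```python
import math


def make_tone(freq_hz, sample_rate, n_samples):
    """Generate a sine tone to serve as the (assumed) distinct marker."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]


def find_marker(audio, marker):
    """Return (index, score) of the window in `audio` that best matches `marker`.

    The score is the normalized correlation between the window and the
    marker; it approaches 1.0 for a window that is a scaled copy of the marker.
    Returns (-1, -inf) if no window with nonzero energy exists.
    """
    m = len(marker)
    marker_energy = math.sqrt(sum(x * x for x in marker))
    best_index, best_score = -1, float("-inf")
    if marker_energy == 0:
        return best_index, best_score
    for start in range(len(audio) - m + 1):
        window = audio[start:start + m]
        window_energy = math.sqrt(sum(x * x for x in window))
        if window_energy == 0:
            continue  # silent window cannot match the marker
        dot = sum(a * b for a, b in zip(window, marker))
        score = dot / (window_energy * marker_energy)
        if score > best_score:
            best_index, best_score = start, score
    return best_index, best_score


def audio_offset_seconds(audio, marker, sample_rate, marker_time_in_video):
    """Return how far (in seconds) the audio must be shifted to align with video.

    A positive result means the audio should be delayed; a negative result
    means it should be advanced.
    """
    index, _ = find_marker(audio, marker)
    if index < 0:
        raise ValueError("marker not found in audio stream")
    return marker_time_in_video - index / sample_rate


if __name__ == "__main__":
    rate = 8000
    tone = make_tone(1000, rate, 80)
    # Audio in which the marker begins 400 samples (0.05 s) into the stream.
    stream = [0.0] * 400 + tone + [0.0] * 200
    print(find_marker(stream, tone)[0])
    print(audio_offset_seconds(stream, tone, rate, 0.10))
```

In this sketch the exhaustive search costs time proportional to the product of the stream length and the marker length, which is acceptable for short clips. A practical system would more likely use an FFT-based correlation and a detection threshold on the score.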