1. Field of the Invention
Embodiments of the invention described herein pertain to the field of audio/video synchronization systems. More particularly, but not by way of limitation, one or more embodiments of the invention enable an apparatus and method for synchronizing an secondary audio track to the audio track of a video source for example.
2. Description of the Related Art
There is no known apparatus or method for automatically synchronizing a secondary audio track to an audio track of a video source. There are various ways to manually perform synchronization between two audio streams that involve synching the two audio sources based on time (which may be running at a slightly different rate in each source), frame count or I frames in the case of MPEG. However, there is often drift of synch between the two sources. This is particularly evident in the case of DVD players which vary slightly in speed and other factors inherent in the multitude of player models as well as the form of compression and parameters of the DVD or other source. Indeed a secondary source might include various versions that were created using different compression codecs each with slightly different timing.
There are at least two ways to utilize a secondary audio track with a video source such as a DVD. First, the secondary audio track can be played separately from the DVD (for example a rented DVD) and adjusted manually while playing the secondary audio track, for example on an MP3 player coupled with speakers. This requires adjusting the playback of the secondary audio track to keep the secondary audio track in synchronization with the DVD that is playing. If the DVD is paused, the secondary audio track must be paused at the same time and both sources must be started again at the same time when resuming play. Synchronization, may be slightly off when resuming play, so the secondary audio track timing must be adjusted again to ensure synchronization.
The second manner in which to utilize a secondary audio track with a video source requires combining the secondary audio track with the audio track of the video source. The current process for combining a secondary audio track with a video source such as a DVD is an extremely technical manual process. The process requires several software tools to perform the required steps. For example, one scenario begins when a DVD is purchased by a user. The user decides to add humorous commentary to the DVD. The commentary is obtained from “RiffTrax.com” a company that specializes in secondary audio track generation and features commentary tracks from the original writers of “Mystery Science Theatre 3000”. The DVD is “ripped” with “DVD Decrypter” or “rejig”. The audio from the DVD is adjusted with “delaycut”. The DVD Audio files are converted to WAV files with “PX3Convert”. The WAV files are manually synched using “Audacity” with a secondary audio track, i.e., the “Riff Track”. The resulting WAV file is converted with “ffmpegGUI” back to DVD format audio (i.e., AC3). The DVD format audio is added to the DVD video and converted to a single file with “Ifoedit” or “rejig”. The single file is then burned onto a DVD with “DVDShrink”.
The forementioned steps each break down into a very technical sub-steps. For example, ripping the files using “rejig” requires the following sub-steps. First, a folder is created on the user's desktop where the work will be performed. After creating the folder, the user inserts the DVD into the computer. The “rejig” program is run. The “rejig” setting are set to “IFO Mode” in the “Settings” and “old engine” is selected. The AC3 Delay box is checked along with any desired foreign language or subs. The output directory folder is selected. Next the “ChapterXtractor” is asserted which obtains the chapter times for the DVD. The user is required to edit the chapter times to remove “chapter 1=”, “chapter 2=”, etc., from the front of each line of the output file leaving one number per line. The one number per line represents the time offsets to each chapter in numeric format. The synchronizing step using “Audacity” uses the following sub-steps. Both the secondary audio track and the audio track of the video are loaded into “Audacity”. The secondary audio track is then cut until the start of the movie lines up with the proper starting point of the secondary audio as indicated in a README file supplied with the secondary audio track. The amount of time to cut is approximate and is used a guideline to obtain a good first cut at synchronization. The sound level of the secondary audio track is adjusted to make sure that it is loud enough for simultaneous playback with the audio track of the video. The process of cutting away or adding time to the secondary audio continues throughout the playing of the video and is checked for synchronization every few minutes to ensure synchronization is correct. When synchronization is off, the secondary audio track timing is adjusted either by advancing or delaying the secondary audio track, or by slowing down or speeding up the secondary audio track. Although two steps of the main process have been described in more detail, the other steps not broken into sub-steps likewise have many pitfalls and are “expert friendly” at best.
As discussed, the technical competency required to create a “riffed DVD” is extremely high. Certain users have found that running alternate tools such as “Delaycut” must be utilized even if the ac3 file indicates a delay of “0 msec”. If using the “goldwave” plugin, then fade-in and fade-out time must be allowed for. These steps put the generation process out of reach for normal users. In addition, although tools such as “sharecrow” have planned features that allow for speeding up and slowing down individual sections of audio, the entire process itself is still manual and highly technical. Other users have reported problems with synchronization when their computers do not have adequate memory, hence having a very capable computer is another requirement for performing the process.
Although the technical competency required to create a “riffed DVD” is very high, the paramount problem is maintaining synchronization between the video and the secondary audio track. There are many reasons why the secondary audio track goes out of synchronization with the DVD.
One reason for loss of synchronization has to do with different versions of a particular movie. For example movies sold in certain countries are required to have scenes deleted, for example violent scenes removed. Hence, there are points through the video where the secondary audio track no longer synchs with the video. For example, the PAL version of the movie “The Matrix” sold in the United Kingdom has synching issues at the point where a main character becomes quite violent. Hence depending on where a DVD is sold, different secondary audio synchronization timings must be employed to synchronize with the remaining portion of the video.
Another reason for loss of synchronization has to do with “drift”. Framerate is a main cause of drift related problems. This requires checking the video framerate to ensure no compression is utilized prior to synching and ensuring that the right file types are utilized. For example, if the secondary audio track synchs properly with the video when watching the video on another piece of hardware, then the synch issues are certainly related to one of the steps utilized when reauthoring on the PC. The authoring process is simply too complex with too many variables to allow for trivial synchronization. Another cause of drift has to do with certain DVD players running slightly slower or faster than at a standard rate. Hence no absolute time starting offsets can be utilized, since synchronization drifts while a video plays and must be adjusted throughout the video using the manual steps previously described.
Another reason for loss of synchronization has to do with ambiguous synchronization lines in the movie. For example, in the movie “the Fifth Element”, the sixth synchronization line “You have one point on your license” is spoken twice in the movie, once by a computer voice and once by an actor's voice. This causes confusion among users attempting to add the secondary sound track to the video.