Maintaining synchronization between audio and video on a modern digital television depends on a number of variables, for example the amount of computing performed for motion estimation and motion compensation (MEMC). Audio data and video data in a content stream are tied, or synchronized, together through timing codes contained in both the audio frames and the associated video frames. Video processing, however, can introduce delays in the presentation of a video frame that is synchronized with an audio frame. MEMC processing typically causes the audio frame to be played before the video frame is displayed. This synchronization error is referred to as a “lip sync” delay.
The inventor has recognized the need to reduce or eliminate a timing error between when an audio frame is played and a video frame is displayed.
Modern digital televisions receive program content from various sources: over-the-air broadcasts (ATSC transmissions); content delivered over IP (Internet Protocol); cable networks, where set top boxes receive, decode and pass on digital content; satellite receivers, which operate much the same as cable set top boxes; and local multimedia devices such as DVD players and game consoles, where DVDs or other media are played and sent to the television over wired or wireless communications links.
Because most program content is recorded/captured at a specified frame rate, generally 30 frames per second, there is a perceivable difference between watching the content on a movie screen where each frame is typically shown three times and watching the same content on a television.
Video content for television is typically recorded at 30 frames per second. This can result in what is known as judder, which appears as jerky motion of objects as they move across the screen. To present fluid motion of objects on the screen, the video is played back at 60, 120 or 240 frames per second, depending on the television.
Displaying more frames per second than the original 30 frames per second provide typically requires creating additional frames between the original frames. The computations necessary to create the additional frames are referred to as Motion Estimation Motion Compensation, or MEMC. MEMC allows additional frames to be created while preserving fluid motion.
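The relationship between the source frame rate and the display frame rate can be illustrated with a short calculation. The following is only a sketch: the function name is hypothetical, it assumes the display rate is an integer multiple of the source rate, and it counts the frames to be created without performing any motion estimation or motion compensation itself.

```python
def created_frames_per_source_frame(source_fps: int, display_fps: int) -> int:
    """Number of additional frames needed between consecutive source frames.

    Assumption for this sketch: display_fps is an integer multiple of source_fps.
    """
    if source_fps <= 0 or display_fps <= 0:
        raise ValueError("frame rates must be positive")
    if display_fps % source_fps != 0:
        raise ValueError("display rate must be a multiple of the source rate")
    # Each source frame is shown once; the rest of the slots are created frames.
    return display_fps // source_fps - 1


if __name__ == "__main__":
    # Display rates mentioned in the text, with a 30 frames per second source.
    for display_fps in (60, 120, 240):
        extra = created_frames_per_source_frame(30, display_fps)
        print(f"{display_fps} fps display: {extra} created frame(s) per source frame")
```

Under these assumptions, a 60 frames per second display needs one created frame between each pair of original frames, a 120 frames per second display needs three, and a 240 frames per second display needs seven.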
Audio recorded on movie film is typically recorded and preserved in one or more sound tracks immediately adjacent to the frame the audio is associated with. When movie content is digitized, the video and audio frames have synchronization values that tie a video frame to an audio frame.
Most digital televisions have what is referred to as a “game mode”, in which gaming video content is displayed at its original frame rate. In this mode, the video being displayed is not processed to preserve fluid motion and may display judder, but the audio will always be in sync with the video.
When normal video is being displayed at, for example, 120 frames per second, four times as many frames are being displayed (30 frames per second × 4) as are being received. Digital televisions displaying video at this rate will generally delay the audio by some fixed amount so that the audio and video do not get out of sync. Once the delay for creating the additional frames is known, computed or measured, a delay can be introduced into the audio stream processing such that the audio and video are in synchronization. This delay, however, is valid only for the audio processing inside the television.
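One way to picture the audio delay described above is as a first-in, first-out buffer that holds each audio frame for a fixed number of frame periods before releasing it for playback. The sketch below is illustrative only: the class name `AudioDelayLine`, its parameters, and the example values of 64 ms and 32 ms are assumptions made for the example and are not taken from any particular television.

```python
from collections import deque
from typing import Deque, Optional


class AudioDelayLine:
    """Holds audio frames back by a fixed delay (hypothetical helper)."""

    def __init__(self, delay_ms: float, frame_duration_ms: float) -> None:
        if delay_ms < 0 or frame_duration_ms <= 0:
            raise ValueError("delay must be >= 0 and frame duration > 0")
        # Express the known video processing delay as a whole number of audio frames.
        self.delay_frames = round(delay_ms / frame_duration_ms)
        self._held: Deque[str] = deque()

    def push(self, frame: str) -> Optional[str]:
        """Accept a new audio frame; return the frame due for playback, if any."""
        self._held.append(frame)
        if len(self._held) > self.delay_frames:
            return self._held.popleft()
        return None  # still filling the delay buffer


if __name__ == "__main__":
    # Assumed values: 64 ms of video processing delay, 32 ms per audio frame.
    line = AudioDelayLine(delay_ms=64.0, frame_duration_ms=32.0)
    print([line.push(f) for f in ("a0", "a1", "a2", "a3")])
    # prints [None, None, 'a0', 'a1']
```

In this sketch the first audio frame is released two frame periods after it arrives, which matches the assumed 64 ms of video processing delay.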
Audio delay times may also be affected by home theater equipment where analog or digital sound is sent from the television to the external equipment. In this case, the amount of processing performed by the external audio equipment is unknown and in many cases introduces a delay in the sound, causing a video frame to be displayed before the associated audio is played.
Some television manufacturers will introduce a fixed “lip sync delay time” before audio frames are processed and played. A fixed delay time does not completely fix the synchronization problem between the processed video and the audio because, as discussed above, external sound equipment can add a further, unknown delay.
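The limitation of a fixed delay can be shown with simple arithmetic. The sketch below uses a hypothetical function and illustrative millisecond values, none of which come from a real device: a fixed audio delay that exactly matches the video processing delay leaves no offset when the television plays the audio itself, but leaves a residual offset once external equipment adds its own delay.

```python
def av_offset_ms(video_delay_ms: float,
                 tv_audio_delay_ms: float,
                 external_audio_delay_ms: float = 0.0) -> float:
    """Audio-to-video offset in milliseconds (hypothetical helper).

    Positive: the audio is heard after the video frame is displayed.
    Negative: the audio is heard before the video frame is displayed.
    """
    total_audio_delay = tv_audio_delay_ms + external_audio_delay_ms
    return total_audio_delay - video_delay_ms


if __name__ == "__main__":
    # Illustrative values only: 100 ms of video processing, 100 ms fixed audio delay.
    print(av_offset_ms(100.0, 100.0))        # television speakers: 0.0
    print(av_offset_ms(100.0, 100.0, 40.0))  # external equipment adds 40 ms: 40.0
    print(av_offset_ms(100.0, 0.0))          # no audio delay at all: -100.0
```

The second case corresponds to the situation described above, in which a video frame is displayed before the associated audio is played, and the third to MEMC processing with no compensating audio delay.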