1. Field of the Invention
The present invention generally relates to the art of audio/video data compression and transmission, and more specifically to a synchronization system for a Moving Picture Experts Group (MPEG) audio/video decoder using audio subframe skip and repeat.
2. Description of the Related Art
Constant efforts are being made to make more effective use of the limited number of transmission channels currently available for delivering video and audio information and programming to an end user such as a home viewer of cable television. Various methodologies have thus been developed to achieve the effect of an increase in the number of transmission channels that can be broadcast within the frequency bandwidth that is currently allocated to a single video transmission channel. An increase in the number of available transmission channels provides cost reduction and increased broadcast capacity.
The number of separate channels that can be broadcast within the currently available transmission bandwidth can be increased by employing a process for compressing and decompressing video signals. Video and audio program signals are converted to a digital format, compressed, encoded and multiplexed in accordance with an established compression algorithm or methodology.
The compressed digital system signal, or bitstream, which includes a video portion, an audio portion, and other informational portions, is then transmitted to a receiver. Transmission may be over existing television channels, cable television channels, satellite communication channels, and the like.
A decoder is provided at the receiver to de-multiplex, decompress and decode the received system signal in accordance with the compression algorithm. The decoded video and audio information is then output to a display device such as a television monitor for presentation to the user.
Video and audio compression and encoding are performed by suitable encoders which implement a selected data compression algorithm that conforms to a recognized standard or specification agreed to among the senders and receivers of digital video signals. Highly efficient compression standards have been developed by the Moving Picture Experts Group (MPEG), including MPEG 1 and MPEG 2. The MPEG standards enable several VCR-like viewing options such as Normal Forward Play, Slow Forward, Fast Forward, Fast Reverse, and Freeze.
The MPEG standards outline a proposed synchronization scheme based on an idealized decoder known as a System Target Decoder (STD). Video and audio data units or frames are referred to as Access Units (AU) in encoded form, and as Presentation Units (PU) in unencoded or decoded form. In the idealized decoder, video and audio data presentation units are taken from elementary stream buffers and instantly presented at the appropriate presentation time to the user. A Presentation Time Stamp (PTS) indicating the proper presentation time of a presentation unit is transmitted in an MPEG packet header as part of the system syntax.
The presentation time stamps and the access units are not necessarily transmitted together since they are carried by different layers of the hierarchy. It is therefore necessary for the decoder to associate the presentation time stamp found at the packet layer with the beginning of the first access unit which follows it.
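The association described above can be sketched as follows. This is a minimal illustration only, not the decoder's actual logic: the event types `PtsEvent` and `AccessUnitStart` and the function `associate_pts` are hypothetical names, and the rule that a newer time stamp replaces an unused older one is an assumption made for the sketch.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Tuple, Union


@dataclass
class PtsEvent:
    """A presentation time stamp found in a packet header (hypothetical type)."""
    pts: int


@dataclass
class AccessUnitStart:
    """The first byte of an access unit seen in the elementary stream (hypothetical type)."""
    au_id: int


def associate_pts(
    events: Iterable[Union[PtsEvent, AccessUnitStart]],
) -> Iterator[Tuple[int, Optional[int]]]:
    """Attach each time stamp to the first access unit that begins after it."""
    pending: Optional[int] = None
    for ev in events:
        if isinstance(ev, PtsEvent):
            # Assumption: a newer stamp replaces one that was never used.
            pending = ev.pts
        else:
            # An access unit with no preceding stamp carries no PTS (None).
            yield ev.au_id, pending
            pending = None


stream = [PtsEvent(9000), AccessUnitStart(0), AccessUnitStart(1),
          PtsEvent(12000), AccessUnitStart(2)]
pairs = list(associate_pts(stream))
```

In this example the second access unit has no time stamp of its own, so a decoder would have to infer its presentation time from the preceding unit.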
The situation is further complicated by the fact that in a real decoder the system has little control over the presentation times of the presentation units. For example, in the video decoder, video frames (pictures) must be presented at exact multiples of the frame period for the video to appear smooth, and the audio frames must be presented at exact multiples of the audio frame period for the audio to be free of clicks.
In the idealized MPEG synchronization scheme, a system time clock (STC) which maintains a system clock time is provided in the decoder. The initial value of the system clock time is transmitted in the system stream by the encoder as a System Clock Reference (SCR) in an MPEG 1 bitstream, or as a Program Clock Reference (PCR) in an MPEG 2 bitstream. The decoder sets its local system time clock to the initial value, and then continues to increment it at a clock rate of 90 kHz.
Subsequently, the encoder transmits a presentation time stamp for an audio or video access unit, followed some time later by the access unit itself. The decoder compares the presentation time stamp to the local system clock time, and when they are equal removes the access unit from the elementary stream buffer, instantly decodes it to produce the corresponding presentation unit, and presents the presentation unit.
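The idealized scheme can be illustrated with a short simulation. This is a sketch under stated assumptions: the function `run_ideal_decoder`, the string stand-in for decoding, and the sample time stamps are hypothetical, and wrap-around of the time stamp counter is ignored.

```python
from collections import deque
from typing import Deque, List, Tuple

STC_HZ = 90_000  # system time clock rate stated in the text


def run_ideal_decoder(
    scr: int, buffered: List[Tuple[int, str]], n_ticks: int
) -> List[Tuple[int, str]]:
    """Simulate the idealized decoder for n_ticks clock ticks.

    scr      -- initial clock value received from the encoder (SCR or PCR)
    buffered -- (pts, access_unit) pairs in transmission order
    Returns (clock_time, presentation_unit) for every unit presented.
    """
    stc = scr                              # set the local clock to the reference
    queue: Deque[Tuple[int, str]] = deque(buffered)
    presented: List[Tuple[int, str]] = []
    for _ in range(n_ticks):
        # Present exactly when the time stamp equals the clock time.
        if queue and queue[0][0] == stc:
            _, au = queue.popleft()
            presented.append((stc, f"decoded({au})"))  # zero-time decode
        stc += 1                           # one tick of the 90 kHz clock
    return presented


# Two access units stamped 40 ms (3600 ticks) apart, starting 40 ms after the reference.
result = run_ideal_decoder(1000, [(4600, "A0"), (8200, "A1")], 9000)
```

Because the comparison is for exact equality, a time stamp that is already in the past when its access unit arrives would never match in this sketch, which is one reason a real decoder cannot use the idealized scheme unchanged.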
In a real system, synchronization is complicated by factors including the following.
1. Access units cannot be removed from the elementary stream buffer instantaneously, nor can they be decoded or presented instantaneously.
2. Acceptable presentation unit boundaries may not be under the control of the decoder. For example, if an MPEG decoder is locked to an external television synchronization signal, the presentation unit boundaries are controlled by the synchronization pulse generator, not the decoder itself. This creates error in the presentation time.
3. Presentation time stamps may contain errors caused by channel errors, which can prevent a frame from being decoded indefinitely.
Any of these factors can produce a situation in which the decoder becomes out of synchronization with the intended decoding times of the presentation units, such that video and audio are decoded and presented too early or too late. This is especially undesirable in an application such as television in which the audio must be precisely synchronized with the video.
A prior art technique to re-synchronize the decoding and presentation of video and audio presentation units is to skip a presentation unit (frame) if the decoder is running behind, and to repeat a frame if the decoder is running ahead.
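The whole-frame technique can be sketched as a simple decision rule. The function name `resync_action` and the choice of one full frame period as the threshold are assumptions made for illustration; the text does not specify how large the error must be before a frame is skipped or repeated.

```python
def resync_action(stc: int, pts: int, frame_ticks: int) -> str:
    """Choose a whole-frame correction from the clock/time-stamp error.

    stc         -- current system clock time, in 90 kHz ticks
    pts         -- presentation time stamp of the next frame, in ticks
    frame_ticks -- duration of one frame, in ticks (assumed threshold)
    """
    error = stc - pts  # positive: the decoder is running behind
    if error >= frame_ticks:
        return "skip"      # drop a frame to catch up
    if error <= -frame_ticks:
        return "repeat"    # present a frame again to wait
    return "present"       # error is within one frame; no correction


FRAME_TICKS = 2160  # 24 ms at 90 kHz, assumed for the example
actions = [
    resync_action(10_000, 7_000, FRAME_TICKS),  # decoder 3000 ticks late
    resync_action(7_000, 10_000, FRAME_TICKS),  # decoder 3000 ticks early
    resync_action(7_000, 7_100, FRAME_TICKS),   # within tolerance
]
```

Each correction in this sketch shifts the presentation by a full frame, which is the granularity that gives rise to the drawbacks discussed next.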
However, this technique can create clearly noticeable distortion in the form of discontinuities in video and audio presentation. In the audio data bitstream, an exemplary MPEG Layer II frame consists of 1,152 audio samples, and can include as much as 14,000 bits of data. Repeating or skipping an entire frame of audio data creates a discontinuity of approximately 24 to 36 milliseconds (1,152 samples at sampling rates of 48 kHz down to 32 kHz), which is very audible.
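The size of the discontinuity follows directly from the frame length and the sampling rate. The arithmetic below is a sketch: the helper names are hypothetical, and the sampling rates (32, 44.1 and 48 kHz) and the 384 kbit/s bit rate are assumed here as the MPEG 1 Layer II operating points rather than taken from the text.

```python
SAMPLES_PER_FRAME = 1152  # Layer II frame length stated in the text


def frame_duration_ms(sample_rate_hz: int) -> float:
    """Duration of one audio frame in milliseconds."""
    return 1000.0 * SAMPLES_PER_FRAME / sample_rate_hz


def frame_bits(bit_rate_bps: int, sample_rate_hz: int) -> float:
    """Average number of coded bits in one frame at a given bit rate."""
    return bit_rate_bps * SAMPLES_PER_FRAME / sample_rate_hz


# Assumed sampling rates and bit rate (not given in the text).
durations = {rate: frame_duration_ms(rate) for rate in (32_000, 44_100, 48_000)}
largest_frame = frame_bits(384_000, 32_000)
```

Under these assumptions a frame lasts 36 ms at 32 kHz and 24 ms at 48 kHz, and the largest frame holds 13,824 bits, consistent with the figure of roughly 14,000 bits.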
In addition, a buffer memory which is required to store several frames of data to be skipped or repeated must be relatively large, adding to the size, complexity and cost of the decoder.
For these reasons, there exists a need in the art for a method of synchronizing an MPEG audio decoder which does not introduce audible distortion into the presentation, and which does not require a large buffer memory.