1. Field of the Invention
Embodiments as described herein relate to data communication. More particularly, these embodiments relate to digital video and audio processing and an improved system and method for the decoding of digital video and audio data.
2. Description of the Related Art
Digital multimedia data such as video and music can be transmitted to multiple receivers, such as wireless telephones or televisions, for playback by users of the receivers. The multimedia can be formatted in accordance with a number of different video encoding standards. The Moving Picture Experts Group (MPEG), for example, has developed a number of standards such as that described in MPEG-2 part 2. Other encoding standards include H.261/H.263 and the more recent H.264/AVC.
Video encoding standards achieve higher effective transmission rates by encoding data in compressed form. Compression reduces the overall amount of data that needs to be transmitted for effective transmission of image frames. The H.264 standard, for example, utilizes graphics and video compression techniques designed to facilitate video and image transmission over a narrower bandwidth than could be achieved without the compression. In particular, the MPEG standards incorporate video encoding techniques that utilize similarities between successive image frames, referred to as temporal or interframe correlation, to provide interframe compression. The interframe compression techniques exploit data redundancy across frames by converting pixel-based representations of image frames to motion representations. In addition, the video encoding techniques may utilize similarities within image frames, referred to as spatial or intraframe correlation, in order to achieve intraframe compression, in which redundancy within a single image frame is exploited to compress the frame further. The intraframe compression is typically based upon conventional processes for compressing still images, such as spatial prediction and discrete cosine transform (DCT) encoding.
Interframe encoding breaks each picture into blocks called “macroblocks”, and then searches neighboring pictures for similar blocks. If a match is found, instead of storing the entire block, the system stores a much smaller vector that describes the movement (or lack thereof) of the block between pictures. In this way, efficient compression is achieved.
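By way of illustration only, the block search described above may be sketched as follows. The sketch assumes frames held as lists of rows of luminance values, a sum-of-absolute-differences (SAD) matching cost, and an exhaustive search over a small window; the function names, block size, and search range are hypothetical and are not taken from any encoding standard.

```python
def sad(frame_a, ax, ay, frame_b, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(frame_a[ay + j][ax + i] - frame_b[by + j][bx + i])
    return total


def find_motion_vector(current, reference, bx, by, size=4, search=2):
    """Search `reference` for the block that best matches the block of
    `current` whose top-left corner is (bx, by).

    Returns ((dx, dy), cost), where (dx, dy) is the offset from (bx, by)
    to the best-matching block in `reference`.
    """
    height, width = len(reference), len(reference[0])
    # Start from the zero vector, i.e. "the block did not move".
    best_vec = (0, 0)
    best_cost = sad(current, bx, by, reference, bx, by, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall partly outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(current, bx, by, reference, rx, ry, size)
            if cost < best_cost:
                best_vec, best_cost = (dx, dy), cost
    return best_vec, best_cost
```

When the content of a picture has simply shifted one pixel to the right between frames, the search returns the vector (-1, 0) with a cost of zero, so only the vector needs to be stored for that block.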
To support these compression techniques, many digital video devices include an encoder for compressing digital video sequences, and a decoder for decompressing the digital video sequences. In many cases, the encoder and decoder comprise an integrated encoder/decoder (CODEC) that operates on blocks of pixels within frames that define the sequence of video images. For each macroblock in the image frame, the encoder searches macroblocks of the immediately preceding video frame to identify the most similar macroblock, and encodes the difference between the macroblocks for transmission, along with a motion vector that indicates which macroblock from the previous frame was used for encoding. The decoder of a receiving device receives the motion vector and encoded differences, and performs motion compensation to generate video sequences.
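The encoder and decoder roles described above may be sketched, under simplifying assumptions, as a pair of functions. The sketch assumes the motion vector has already been chosen and that the difference block is sent losslessly; practical codecs additionally transform and quantize the difference, which is omitted here. The names `encode_block` and `decode_block` are hypothetical.

```python
def encode_block(current, reference, bx, by, mv, size=4):
    """Return the difference (residual) between the block of `current`
    at (bx, by) and the block of `reference` displaced by motion vector mv."""
    dx, dy = mv
    return [
        [current[by + j][bx + i] - reference[by + dy + j][bx + dx + i]
         for i in range(size)]
        for j in range(size)
    ]


def decode_block(reference, bx, by, mv, residual):
    """Motion compensation: rebuild the block at (bx, by) by adding the
    residual to the block of `reference` that the motion vector points to."""
    dx, dy = mv
    size = len(residual)
    return [
        [reference[by + dy + j][bx + dx + i] + residual[j][i]
         for i in range(size)]
        for j in range(size)
    ]
```

Because the decoder holds the same previous frame as the encoder, the motion vector and the difference block together suffice to reconstruct the original block; the better the match, the closer the difference block is to all zeros and the fewer bits it needs.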
Proper conversion of the frames back into video and audio relies upon proper timing between the encoder at the transmitter and the decoder at the receiver. The receiver is typically provided with information relating to when a frame was encoded by the transmitter, with respect to the collection of frames, so that a frame can be properly synchronized and presented to the user at the receiver. MPEG-2 TS is part of a standard describing a transport layer for the MPEG-2 compression technique. MPEG-2 TS accommodates the synchronization between encoder and decoder by recording the encoder's local clock time into the packets of compressed frames as timestamps. These timestamps are known as Program Clock References, or PCRs (see ISO/IEC 13818-1, Annex D, incorporated in its entirety herein by reference). PCRs are thus a form of encoder timestamp. PCRs are derived from the System Time Clock (STC) at the transmitter and form the master clock referenced by the decoder to determine when video and audio are to be decompressed and displayed.
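For illustration, a PCR may be read out of a transport stream packet roughly as sketched below. The sketch assumes the packet layout of ISO/IEC 13818-1: a 188-byte packet beginning with sync byte 0x47, an optional adaptation field, and a PCR carried as a 33-bit base counted at 90 kHz followed by a 9-bit extension counted at 27 MHz. The function name `parse_pcr` is hypothetical, and error handling is kept minimal.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47
PCR_HZ = 27_000_000  # PCR values are expressed in 27 MHz ticks


def parse_pcr(packet):
    """Return the PCR carried in a transport stream packet, in 27 MHz
    ticks, or None if the packet carries no PCR."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a 188-byte transport stream packet")
    # Values 2 and 3 indicate that an adaptation field is present.
    adaptation_field_control = (packet[3] >> 4) & 0x3
    if adaptation_field_control not in (2, 3):
        return None
    # Flags byte plus six PCR bytes require a length of at least 7.
    if packet[4] < 7:
        return None
    if not packet[5] & 0x10:  # PCR_flag
        return None
    base = ((packet[6] << 25) | (packet[7] << 17) | (packet[8] << 9)
            | (packet[9] << 1) | (packet[10] >> 7))
    extension = ((packet[10] & 0x01) << 8) | packet[11]
    return base * 300 + extension
```

Dividing the returned value by `PCR_HZ` gives the encoder clock reading in seconds. Wrap-around of the 33-bit base is not handled in this sketch.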
The PCR value may not always be accurate, however. Imperfections in either the receiver clock or the transmitter clock, resulting from operating conditions or inherent defects, will result in drift between the two clocks. The decoder at the receiver will eventually experience underrun if its clock is faster than the clock at the transmitter, or overrun if its clock is slower. To prevent overrun and underrun, the receiver calculates the drift between its clock and the clock of the transmitter. The receiver then compensates for this drift when decoding subsequent frames.
In many devices, a hardware interface in the transport layer, known as the transport stream interface (TSIF), timestamps each packet containing a PCR value as it is received. This timestamp can then be used with the PCR contained in the packet to estimate the drift between the encoder and decoder clocks. However, some mobile device interfaces do not support TSIF. Without an alternative method for clock synchronization, these devices will be incompatible with the ISDB-T, DVB-T, T-DMB, and many other standards.
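As a rough illustration of how arrival timestamps and PCRs may be combined, the sketch below estimates the drift from the first and last of a series of (PCR, arrival timestamp) pairs. It assumes both values are expressed in 27 MHz ticks, that transmission delay is constant, and that no PCR wrap-around occurs within the series; the function name and the parts-per-million convention are hypothetical.

```python
def estimate_drift_ppm(samples):
    """Estimate receiver clock drift relative to the encoder clock.

    `samples` is a sequence of (pcr_ticks, arrival_ticks) pairs in the
    order received. The result is in parts per million: positive when
    the receiver clock runs faster than the encoder clock, negative
    when it runs slower.
    """
    if len(samples) < 2:
        raise ValueError("at least two samples are required")
    first_pcr, first_arrival = samples[0]
    last_pcr, last_arrival = samples[-1]
    pcr_elapsed = last_pcr - first_pcr              # time per encoder clock
    local_elapsed = last_arrival - first_arrival    # time per receiver clock
    if pcr_elapsed <= 0:
        raise ValueError("PCR values must increase across the samples")
    return (local_elapsed - pcr_elapsed) / pcr_elapsed * 1_000_000
```

For example, if the encoder clock advances by 27,000,000 ticks while the receiver clock advances by 27,000,270 ticks, the estimate is approximately 10 parts per million, indicating a receiver clock that runs fast. In practice the arrival timestamps are affected by delivery jitter, so an estimate taken over a longer interval, or over many pairs, would be expected to be more stable.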