A compression standard referred to as MPEG (Moving Picture Experts Group) compression is a set of methods for compression and decompression of full motion video images that uses the interframe compression technique described above. MPEG compression uses both motion compensation and discrete cosine transform (DCT) processes and can yield compression ratios of more than 200:1.
The MPEG standard requires that sound be recorded simultaneously with the video data, and the video and audio data are interleaved in a single file in an attempt to keep the video and audio synchronized during playback. The audio data is typically compressed as well, and the MPEG standard specifies an audio compression method such as MPEG Layer II, also known by the Philips trade name of "MUSICAM".
An MPEG stream includes three types of pictures, referred to as the Intra (I) frame, the Predicted (P) frame, and the Bi-directional Interpolated (B) frame. The I or Intra frames contain the video data for the entire frame of video and are typically placed every 10 to 15 frames. Intra frames provide entry points into the file for random access, and are generally only moderately compressed. Predicted frames are encoded with reference to a past frame, i.e., a prior Intra frame or Predicted frame. Thus P frames only include changes relative to prior I or P frames. In general, Predicted frames receive a fairly high amount of compression and are used as references for future Predicted frames. Thus, both I and P frames are used as references for subsequent frames. Bi-directional frames receive the greatest amount of compression and require both a past and a future reference in order to be encoded. Bi-directional frames are not used as references for other frames.
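The reference relationships described above can be illustrated with a minimal Python sketch. The function names and the representation of a group of pictures as a string of frame types in display order are assumptions made for illustration and are not part of the MPEG standard. The second function shows one consequence of the description: because a B frame needs a future reference, that reference must be decoded before the B frame.

```python
def frame_references(gop):
    """Map each frame index (display order) to the frame indices it is predicted from.

    `gop` is a string of frame types in display order, e.g. "IBBPBBP"
    (a hypothetical representation chosen for this sketch).
    """
    # Only I and P frames may serve as references ("anchors").
    anchors = [i for i, kind in enumerate(gop) if kind in "IP"]
    refs = {}
    for i, kind in enumerate(gop):
        past = [a for a in anchors if a < i]
        future = [a for a in anchors if a > i]
        if kind == "I":
            refs[i] = []                        # self-contained
        elif kind == "P":
            refs[i] = past[-1:]                 # nearest prior I or P frame
        elif kind == "B":
            refs[i] = past[-1:] + future[:1]    # nearest past and future anchors
        else:
            raise ValueError(f"unknown frame type {kind!r}")
    return refs


def decode_order(gop):
    """Reorder display-order indices so every reference precedes its dependents."""
    order, pending_b = [], []
    for i, kind in enumerate(gop):
        if kind == "B":
            pending_b.append(i)       # must wait for the next anchor
        else:
            order.append(i)           # decode the anchor first...
            order.extend(pending_b)   # ...then the B frames that needed it
            pending_b = []
    order.extend(pending_b)           # trailing B frames with no later anchor
    return order
```

For the display-order pattern "IBBPBBP", this sketch reports that frame 4 (a B frame) depends on frames 3 and 6, and that frame 6 must therefore be decoded before frames 4 and 5.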
After the I frames have been created, the MPEG encoder divides each I frame into a grid of 16×16 pixel squares called macro blocks. The respective I frame is divided into macro blocks in order to perform motion compensation. Each of the subsequent pictures after the I frame is also divided into these same macro blocks. The encoder then searches for an exact, or near exact, match between the reference picture macro block and those in succeeding pictures. When a match is found, the encoder transmits a vector movement code or motion vector. The vector movement code or motion vector only includes information on the difference between the reference frame and the respective succeeding picture. The blocks in succeeding pictures that have no change relative to the block in the reference picture or frame are ignored. In general, for the frame(s) following a reference frame, i.e., P and B frames that follow a reference I or P frame, only small portions of these frames are different from the corresponding portions of the respective reference frame. Thus, for these frames, only the differences are captured, compressed and stored, and the amount of data that is actually stored for these frames is significantly reduced.
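The macro block search described above can be sketched as an exhaustive block-matching search. The following Python example is a simplified illustration and not the search strategy of any particular encoder: the sum of absolute differences (SAD) as the matching cost, the search range of 4 pixels, the function names, and the randomly generated test frames are all assumptions made for this sketch. Frames are represented as lists of rows of brightness values.

```python
import random


def sad(ref, cur, rx, ry, cx, cy, size):
    """Sum of absolute differences between a reference block and a current block."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(ref[ry + dy][rx + dx] - cur[cy + dy][cx + dx])
    return total


def find_motion_vector(ref, cur, cx, cy, size=16, search=4):
    """Find the offset (mx, my) into `ref` that best matches the block of
    `cur` whose top-left corner is (cx, cy). Returns ((mx, my), cost)."""
    height, width = len(ref), len(ref[0])
    best = None
    for my in range(-search, search + 1):
        for mx in range(-search, search + 1):
            rx, ry = cx + mx, cy + my
            # Skip candidate blocks that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(ref, cur, rx, ry, cx, cy, size)
            if best is None or cost < best[0]:
                best = (cost, mx, my)
    cost, mx, my = best
    return (mx, my), cost


# Illustrative data: a 32x32 reference frame of random brightness values and
# a current frame whose content has moved 3 pixels right and 2 pixels down.
rng = random.Random(0)
SIZE = 32
reference = [[rng.randrange(256) for _ in range(SIZE)] for _ in range(SIZE)]
current = [[reference[(y - 2) % SIZE][(x - 3) % SIZE] for x in range(SIZE)]
           for y in range(SIZE)]

vector, cost = find_motion_vector(reference, current, cx=8, cy=8)
```

In this sketch the vector points from the current macro block back to its match in the reference frame, so content that moved right and down yields a vector with negative components. A cost of zero corresponds to the exact match mentioned above; a small nonzero cost corresponds to a near exact match, in which case the remaining difference would still have to be encoded.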
After motion vectors have been generated, the encoder then reduces the remaining data by exploiting spatial redundancy. Thus, after finding the changes in location of the macro blocks, the MPEG algorithm further reduces the data by describing the difference between corresponding macro blocks. This is accomplished through a mathematical process referred to as the discrete cosine transform or DCT. This process divides the macro block into four sub blocks, seeking out changes in color and brightness. Human perception is more sensitive to brightness changes than to color changes. Thus the MPEG algorithm devotes more effort to reducing color information than to reducing brightness information.
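The division into sub blocks and the transform itself can be sketched as follows. This Python example assumes that the four sub blocks are the 8×8 quadrants of a 16×16 macro block of brightness values and applies the standard orthonormal two-dimensional DCT-II directly from its definition. The function names are hypothetical, and the direct four-loop evaluation is chosen for clarity rather than speed.

```python
import math

N = 8  # sub block size, assuming a 16x16 macro block split into quadrants


def split_macroblock(mb):
    """Split a 16x16 macro block into four 8x8 sub blocks, ordered
    top-left, top-right, bottom-left, bottom-right."""
    return [[row[x:x + N] for row in mb[y:y + N]]
            for y in (0, N) for x in (0, N)]


def dct_2d(block):
    """Two-dimensional DCT-II of an 8x8 block, evaluated from the definition."""
    def scale(k):
        return math.sqrt(0.5) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for v in range(N):          # vertical frequency
        for u in range(N):      # horizontal frequency
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[v][u] = (2.0 / N) * scale(u) * scale(v) * total
    return out
```

For a sub block in which every sample has the same value, only the coefficient at position (0, 0) is nonzero, which illustrates why smooth image regions compress well: most of the transformed values are zero or close to zero and need not be stored at full precision.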
Each picture or frame also includes a picture header which identifies the frame and includes information for that frame. The MPEG standard also includes sequence headers which identify the start of a video sequence. A sequence header is required only once, before the beginning of a video sequence. However, the MPEG-2 standard allows a sequence header to be transferred before any I frame or P frame. The sequence header includes information relevant to the video sequence, including the frame rate and picture size, among other information.
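As an illustration of the information carried by a sequence header, the following Python sketch reads the picture size and frame rate from the first bytes of a header in an MPEG video elementary stream. It relies on the header layout defined by the MPEG-1 and MPEG-2 video specifications: the start code 0x000001B3, followed by a 12-bit horizontal size, a 12-bit vertical size, a 4-bit aspect ratio code, and a 4-bit frame rate code. The function name and the returned dictionary are assumptions made for this sketch, and the remaining header fields, as well as the MPEG-2 extension that can enlarge the size fields, are deliberately ignored.

```python
SEQUENCE_HEADER_CODE = b"\x00\x00\x01\xb3"

# Frame rate codes defined by the MPEG video specifications.
FRAME_RATES = {
    1: 24000 / 1001, 2: 24.0, 3: 25.0, 4: 30000 / 1001,
    5: 30.0, 6: 50.0, 7: 60000 / 1001, 8: 60.0,
}


def parse_sequence_header(data, offset=0):
    """Read picture size and frame rate from a sequence header at `offset`."""
    if data[offset:offset + 4] != SEQUENCE_HEADER_CODE:
        raise ValueError("no sequence header start code at this offset")
    fields = data[offset + 4:offset + 8]
    if len(fields) < 4:
        raise ValueError("sequence header is truncated")
    width = (fields[0] << 4) | (fields[1] >> 4)       # 12 bits
    height = ((fields[1] & 0x0F) << 8) | fields[2]    # 12 bits
    aspect_ratio_code = fields[3] >> 4                # 4 bits
    frame_rate_code = fields[3] & 0x0F                # 4 bits
    return {
        "width": width,
        "height": height,
        "aspect_ratio_code": aspect_ratio_code,
        "frame_rate": FRAME_RATES.get(frame_rate_code),  # None if reserved
    }
```

For example, the bytes 00 00 01 B3 2D 02 40 23 describe a 720×576 picture at 25 frames per second.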
MPEG video streams used in digital television applications generally include a sequence header before every I frame and P frame. This is necessary to facilitate channel surfing between different video channels, which is an important user requirement. In general, when a user switches to a new channel, the video for the new channel cannot be displayed until the next sequence header appears in the stream. This is because the sequence header includes important information about the video sequence which is required by the decoder before the sequence can be displayed. If a sequence header were not included before each I frame and/or P frame, then the video for a newly selected channel might not be displayed immediately, but only after a longer wait for the next sequence header.
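The effect of sequence header spacing on channel switching can be sketched by scanning a video elementary stream for the next sequence header after the point at which the decoder tunes in. The following Python example is an illustration only: it assumes byte-aligned start codes (0x000001B3 for a sequence header and 0x00000100 for a picture, as defined by the MPEG video specifications), and the function name and return value are hypothetical.

```python
SEQUENCE_HEADER_CODE = b"\x00\x00\x01\xb3"
PICTURE_START_CODE = b"\x00\x00\x01\x00"


def next_entry_point(stream, tune_in_offset):
    """Locate the first sequence header at or after `tune_in_offset`.

    Returns (header_offset, pictures_skipped), where pictures_skipped counts
    the picture start codes that pass before the header arrives, or None if
    no sequence header follows the tune-in point.
    """
    header_offset = stream.find(SEQUENCE_HEADER_CODE, tune_in_offset)
    if header_offset == -1:
        return None
    skipped = stream.count(PICTURE_START_CODE, tune_in_offset, header_offset)
    return header_offset, skipped
```

The more frequently sequence headers appear in the stream, the fewer pictures pass undisplayed between the tune-in point and the first usable header.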
The sequence headers in an MPEG encoded stream include presentation timestamps, which provide a time base within the encoded stream.