Video data transmission is becoming increasingly important in business and home applications such as video storage and playback systems. Video images (pictures) of video data are represented by frames of luminance and chrominance picture signals. Since the frames contain large amounts of video data, image compression is used to reduce the amount of data that must be transmitted and thereby increase the effective transmission rate.
Static image compression such as JPEG removes redundant video data in the spatial domain. Moving image compression such as MPEG removes redundant video data in both the spatial and time domains by taking advantage of intra-frame and inter-frame correlation.
Intra-frame correlation is used to reduce spatial redundancy in the video data by converting the video data from the spatial domain to the frequency domain using an orthogonal transform, which generates orthogonal transform coefficients.
For example, an 8×8 pixel block with luminance and chrominance amplitudes at the respective pixels is converted by a discrete cosine transform into 8×8 discrete cosine transform (DCT) coefficients. The first DCT coefficient is a DC (zero frequency) coefficient and the remaining 63 DCT coefficients are AC coefficients with increasingly higher frequencies.
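As an illustration of the conversion described above, the following minimal Python sketch computes a two-dimensional DCT of an 8×8 block. The function name `dct_2d`, the orthonormal scaling and the flat test block are assumptions made for this example and are not taken from any particular encoder.

```python
import math

N = 8  # block size used in the example above


def dct_2d(block):
    """Return the 2-D DCT-II of an N x N block (orthonormal scaling assumed)."""
    def scale(k):
        # Normalization factor; the k == 0 case applies to the DC term.
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency index
        for v in range(N):      # horizontal frequency index
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


# Hypothetical block in which every pixel has the same amplitude.
flat_block = [[100] * N for _ in range(N)]
coeffs = dct_2d(flat_block)
```

For a block of constant amplitude, all of the energy falls in the DC coefficient `coeffs[0][0]` and the 63 AC coefficients are zero to within rounding error, which is the kind of spatial redundancy the transform exposes.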
Inter-frame correlation uses predictive encoding between successive frames. Since a fairly small change between successive frames is typical, transmission of the frame differences generated by predictive encoding is usually more efficient than transmission of the frames. However, the frames cannot be restored if only the frame differences are transmitted. Therefore, the frames are occasionally transmitted without predictive encoding as a reference for the frame differences.
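The relationship between a reference frame and a frame difference can be sketched as follows. This is a deliberately simplified model, assuming each frame is a flat list of pixel amplitudes and ignoring motion compensation; the function names and sample values are hypothetical.

```python
def frame_difference(reference, current):
    """Return the per-pixel difference between the current frame and the reference."""
    return [c - r for r, c in zip(reference, current)]


def reconstruct(reference, difference):
    """Restore a frame by adding a transmitted difference to the reference frame."""
    return [r + d for r, d in zip(reference, difference)]


# Hypothetical pixel amplitudes for two successive frames.
reference_frame = [10, 10, 12, 200, 200, 12]
next_frame = [10, 10, 12, 201, 200, 12]

diff = frame_difference(reference_frame, next_frame)  # mostly zeros
restored = reconstruct(reference_frame, diff)         # requires the reference
```

The difference is mostly zeros and therefore cheap to transmit, but `reconstruct` cannot be called without `reference_frame`, which is why reference frames must still be transmitted occasionally.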
Pictures encoded with intra-frame correlation are referred to as intra-pictures or I-frames. Pictures encoded with predictive encoding relative to one preceding picture are referred to as predictive pictures or P-frames. Pictures encoded with predictive encoding relative to at most two pictures (the preceding picture, the following picture, or both) are referred to as bi-directionally predictive pictures or B-frames.
P-frames follow an I-frame or a P-frame. B-frames can be predictively encoded from two I-frames, two P-frames or one of each, using a reference picture based on the mean value of the two pictures. A picture group includes an I-frame and the P-frames and B-frames derived from the I-frame.
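The mean-value reference picture mentioned above can be sketched as follows. This is a simplified illustration, assuming frames are flat lists of pixel amplitudes and no motion compensation; the function names and sample values are hypothetical.

```python
def mean_reference(past, future):
    """Build a reference picture from the mean value of two pictures."""
    return [(p + f) / 2 for p, f in zip(past, future)]


def b_frame_residual(past, future, current):
    """Return the difference between the current picture and the mean reference."""
    reference = mean_reference(past, future)
    return [c - m for c, m in zip(current, reference)]


# Hypothetical pixel amplitudes for the two reference pictures and the B-frame.
past_picture = [10, 20, 30]
future_picture = [20, 40, 50]
current_picture = [15, 31, 40]

reference = mean_reference(past_picture, future_picture)
residual = b_frame_residual(past_picture, future_picture, current_picture)
```

When the picture lies roughly midway between its two references, the residual is close to zero, so little data remains to be encoded.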
For example, a picture group is provided by frames F1, F2, F3 . . . F17. The leading frame F1 is an I-frame, the second frame F2 is a B-frame, the third frame F3 is a P-frame, and the fourth and the following frames F4 to F17 are alternately B-frames and P-frames.
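The frame ordering in this example can be written out with a short Python sketch. The function name `picture_group_types` is hypothetical, and the rule it encodes (frame 1 is an I-frame, even-numbered frames are B-frames, the remaining odd-numbered frames are P-frames) is simply a restatement of the F1 to F17 example above.

```python
def picture_group_types(length=17):
    """Return the picture type of each frame F1..F<length> in the example group."""
    types = []
    for n in range(1, length + 1):
        if n == 1:
            types.append("I")   # leading frame
        elif n % 2 == 0:
            types.append("B")   # F2, F4, ..., F16
        else:
            types.append("P")   # F3, F5, ..., F17
    return types


pattern = "".join(picture_group_types())
```

For the 17-frame group, `pattern` is one I-frame followed by eight B-frame and P-frame pairs.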
Video systems are increasingly expected to provide trick play operation, such as fast forward and fast reverse, in addition to normal play operation. During trick play operation, the video data is transmitted at a higher transmission rate than during normal play operation.
Video systems have been designed with higher transmission bandwidths to accommodate higher transmission rates during trick play operation; however, this increases costs. Video systems have also been designed to drop some or all of the P-frames and B-frames during trick play operation and reconstruct the video data using the I-frames. However, the I-frames are considerably larger than their P-frame and B-frame counterparts. As a result, the decoder that performs the inverse discrete cosine transform on the MPEG data to generate the video data exhibits a processing bottleneck, thereby creating a low perceived motion rate of the video images. Video systems have also been designed to drop some of the I-frames during trick play operation; however, this degrades the picture quality.
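The frame-dropping approach described above can be sketched as a simple filter. This is an illustrative model only, assuming a stream is a list of (frame number, picture type) pairs built from 17-frame picture groups; the function names are hypothetical and no actual bit stream parsing is shown.

```python
def build_stream(num_groups, group_length=17):
    """Build a list of (frame number, picture type) pairs from repeated picture groups."""
    stream = []
    for g in range(num_groups):
        for n in range(1, group_length + 1):
            if n == 1:
                kind = "I"
            elif n % 2 == 0:
                kind = "B"
            else:
                kind = "P"
            stream.append((g * group_length + n, kind))
    return stream


def trick_play_frames(stream):
    """Drop all P-frames and B-frames, keeping only the I-frames."""
    return [frame for frame in stream if frame[1] == "I"]


stream = build_stream(num_groups=3)
kept = trick_play_frames(stream)
```

In this model only one frame per picture group survives the filter, and each surviving frame is an I-frame, the largest picture type, which is consistent with the decoder bottleneck noted above.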
Therefore, there is a need for video data reduction in MPEG bit streams that preserves picture quality without increasing transmission bandwidth requirements and associated costs.