With the increasing popularity of DVDs and the advent of video-on-demand (VOD) systems, the videos (e.g., moving images) being viewed in homes and businesses are digital. The predominant video compression and transmission formats are from a family called MPEG (Moving Picture Experts Group). MPEG is the name of a family of standards used for coding audio-visual information (e.g., movies, video, and music) in a digital compressed format.
MPEG
Generally, an MPEG video stream is composed of three types of frames: intra frames (I-frames), predictive frames (P-frames), and bi-directionally predictive frames (B-frames).
An MPEG video stream is typically defined by segments called Groups of Pictures (GOP). Typically, a GOP consists of a set of pictures of ½ second duration when displayed at their intended speed.
FIG. 1 illustrates a typical MPEG GOP 100. This example includes an I-frame; two P-frames; and nine B-frames. Typically, each GOP includes consecutive frames beginning or ending with an I-frame (such as frame 110).
Decoding typically begins at the chronological start of any GOP, essentially independent of any preceding GOPs. In the GOP 100 of FIG. 1, that start is I-frame 110. There is no specific limit to the number of pictures that may be in a GOP, nor is there a requirement for an equal number of pictures in all GOPs in a video sequence.
I-frames and P-frames are called “anchor” frames (or “key” frames). An I-frame can be decoded independently of any other frames. It does not rely on data from any other frame to construct its image. A P-frame (such as frame 120) requires data from a previously decompressed anchor frame (e.g., an I-frame or a P-frame) to enable its decompression. While it is dependent, it is dependent only on an anchor frame that has already been decoded.
A B-frame (such as frame 132) requires data from both preceding and succeeding anchor frames (e.g., I-frames or P-frames) to decode its image. It is bi-directionally dependent.
In FIG. 1, the ends of the arrows indicate the frame(s) upon which the arrow-pointed frame depends. For example, B-frame 142 is dependent upon P-frame 120 and P-frame 122.
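The dependency rules described above can be sketched in a few lines of Python. This is only an illustration: the function name `anchor_dependencies` and the display-order layout `IBBBPBBBPBBB` (one I-frame, two P-frames, nine B-frames) are assumptions made for the example and are not taken from FIG. 1.

```python
def anchor_dependencies(frame_types):
    """Map each frame index (display order) to the anchor frames it needs.

    frame_types is a string such as "IBBBPBBBPBBB" (hypothetical layout).
    Only anchors inside the given sequence are considered.
    """
    anchors = [i for i, t in enumerate(frame_types) if t in "IP"]
    deps = {}
    for i, t in enumerate(frame_types):
        prev_anchor = max((a for a in anchors if a < i), default=None)
        next_anchor = min((a for a in anchors if a > i), default=None)
        if t == "I":
            # An I-frame decodes on its own.
            deps[i] = []
        elif t == "P":
            # A P-frame needs the previously decoded anchor.
            deps[i] = [prev_anchor] if prev_anchor is not None else []
        else:
            # A B-frame needs the preceding and succeeding anchors.
            deps[i] = [a for a in (prev_anchor, next_anchor) if a is not None]
    return deps


gop = "IBBBPBBBPBBB"  # assumed layout, 12 frames
deps = anchor_dependencies(gop)
```

In this sketch the trailing B-frames list only one anchor because the succeeding anchor would lie in the next GOP, which is outside the sequence passed in.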
Conventional “Trick Play” Technology
Scanning occurs when a user views a video (either forward or backward) at a speed other than the intended or specified speed. The most common examples of scanning are fast-forward-play (FF-play) and rewind-play (RW-play). Scanning is also called “trick play” mode.
The conventional technique to scan digital video, specifically MPEG, is to drop all of the dependent frames as the video is scanned (forward or backwards).
With conventional scanning of digital video (namely, MPEG), only I-frames are displayed. Thus, the B-frames and P-frames are skipped while scanning. This is a fairly straightforward task. Since the I-frames do not depend upon any other frames, they may be simply plucked from the Group-of-Pictures (GOP) and displayed without any additional frame inter-dependency processing.
FIG. 2 illustrates the presentation of only the I-frames of GOPs. In this example, I-frames 210, 220, 230, and 240 are shown. No B-frames or P-frames are shown.
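The I-frame-only selection described above can be sketched as follows. The function name `iframes_only` and the two-GOP frame string are hypothetical and serve only to illustrate that no inter-frame dependency processing is needed.

```python
def iframes_only(frame_types, reverse=False):
    """Return the indices of the I-frames, dropping all P- and B-frames.

    reverse=True yields the order used for a backward (rewind) scan.
    """
    indices = [i for i, t in enumerate(frame_types) if t == "I"]
    return indices[::-1] if reverse else indices


stream = "IBBBPBBBPBBB" * 2  # two assumed 12-frame GOPs
forward = iframes_only(stream)
backward = iframes_only(stream, reverse=True)
```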
However, this conventional solution produces poor moving-picture quality because of several factors, such as uneven I-frame spacing. The moving images appear jerky, jumpy, shuddering, and erratic.
Conventional MPEG Scanning Solution
Using only this conventional approach, the minimum scan-rate is quite fast. It may be 10-15 times the normal play rate.
For example, if there are 12 frames in a GOP (like that of FIG. 1), the slowest scan-rate that can be achieved via these traditional techniques is 12-15 times the normal speed. When all of the dependent frames (B-frames and P-frames) are dropped, that leaves about one I-frame per 12-frame GOP.
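The arithmetic behind the lower bound can be sketched as below, assuming each retained I-frame is shown for exactly one normal frame period. The function name `min_scan_rate` is hypothetical.

```python
def min_scan_rate(gop_length, iframes_per_gop=1):
    """Speed-up factor when only I-frames are shown, one frame period each.

    Each displayed picture stands in for gop_length / iframes_per_gop
    frames of the original video.
    """
    return gop_length / iframes_per_gop


rate = min_scan_rate(12)  # 12-frame GOP with a single I-frame
```

Under this assumption a 12-frame GOP yields a factor of 12; longer GOPs yield proportionally higher factors.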
There are other conventional techniques that approximate slower scan-rates, that is, rates slower than 10-15 times the normal speed. This is achieved by holding each I-frame on screen longer. However, doing this only accentuates the jerkiness of the scan. The scan appears even more jerky, jumpy, shuddering, and erratic.
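A rough sketch of why holding I-frames worsens the jerkiness: the apparent speed drops, but so does the number of distinct pictures shown per second. The function name `held_scan` and the 24 frames-per-second figure (a 12-frame GOP lasting ½ second) are assumptions for illustration.

```python
def held_scan(gop_length, hold_periods, fps):
    """Return (speed-up factor, distinct pictures per second).

    Assumes one I-frame per GOP, each held for hold_periods frame periods.
    """
    speed = gop_length / hold_periods
    pictures_per_second = fps / hold_periods
    return speed, pictures_per_second


speed, pictures_per_second = held_scan(gop_length=12, hold_periods=3, fps=24.0)
```

With these assumed values the scan slows to 4 times normal speed, but only 8 distinct pictures are shown each second instead of 24.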
Accordingly, it is a challenge to present a scan of digital video that does not appear jerky, jumpy, shuddering, or erratic.