The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
The Moving Picture Experts Group (MPEG) specifies several standards for encoding video streams. The MPEG standards specify that an encoded video stream may contain multiple frames. An encoded video stream may be “interlaced” or “progressive.” If an encoded video stream is interlaced, then each frame in the video stream includes two fields. The “top” field of an interlaced frame represents the odd-numbered horizontal lines of pixels in the frame, while the “bottom” field of an interlaced frame represents the even-numbered horizontal lines of pixels in the frame. As used herein, a “picture” encodes either a frame (in the case of progressive video streams) or a field (in the case of interlaced video streams). A picture that encodes a frame is called a “frame picture.” A picture that encodes a single field is called a “field picture.”
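By way of illustration only, the relationship between an interlaced frame and its two fields can be sketched in Python as follows. The function names `split_fields` and `weave_fields` are hypothetical and are not taken from any MPEG standard. The sketch assumes that a frame is held as a list of pixel rows and that lines are counted from 1, as in the description above.

```python
def split_fields(frame_rows):
    """Split a frame (a list of pixel rows) into its top and bottom fields.

    Lines are counted from 1, so the top field holds lines 1, 3, 5, ...
    (list indices 0, 2, 4, ...) and the bottom field holds lines 2, 4, 6, ...
    """
    top_field = frame_rows[0::2]
    bottom_field = frame_rows[1::2]
    return top_field, bottom_field


def weave_fields(top_field, bottom_field):
    """Interleave a top field and a bottom field back into a full frame."""
    frame_rows = []
    for index, top_row in enumerate(top_field):
        frame_rows.append(top_row)
        # A frame with an odd number of lines has one more top row than bottom rows.
        if index < len(bottom_field):
            frame_rows.append(bottom_field[index])
    return frame_rows


frame = ["line1", "line2", "line3", "line4", "line5", "line6"]
top, bottom = split_fields(frame)
print(top)     # ['line1', 'line3', 'line5']
print(bottom)  # ['line2', 'line4', 'line6']
```

Under these assumptions, a field picture would encode only one of the two returned lists, while a frame picture would encode the full list of rows.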
In addition to being either a frame picture or a field picture, a given picture may be an intra-coded picture (an “I-picture”), a predictive-coded picture (a “P-picture”), or a bidirectionally-predictive-coded picture (a “B-picture”). I-pictures independently represent a complete frame or field within the video stream; data from no other picture in the video stream is needed in order to decode and present the frame or field that an I-picture represents. In contrast, P-pictures and B-pictures do not independently represent a complete frame or field within a video stream. P-pictures and B-pictures rely on data that is encoded by one or more other pictures in the video stream (in addition to the data that is encoded by those P-pictures and B-pictures themselves) in order to fully represent a complete frame or field within the video stream. More specifically, subcomponents (“blocks”) of P-pictures and B-pictures refer to other pictures in a video stream.
Each picture in an MPEG-encoded video stream is subdivided into “macroblocks.” Each “macroblock” is a set of 256 pixels that is 16 pixels high and 16 pixels wide. Each macroblock is further subdivided into “blocks.” A “block” is a set of pixels. The size of a block in pixels may vary depending on the particular MPEG standard being used to encode a video stream.
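For example, the number of macroblocks needed to cover a picture can be computed as sketched below. The function name `macroblock_grid` is hypothetical. The sketch assumes that a picture whose width or height is not a multiple of 16 is padded up to the next multiple of 16, so that a partial macroblock at an edge counts as a whole macroblock.

```python
MACROBLOCK_SIZE = 16  # pixels per side, as described above


def macroblock_grid(width, height):
    """Return (columns, rows, total) of macroblocks covering a picture.

    Assumption: dimensions that are not multiples of 16 are rounded up,
    so a partial macroblock at the right or bottom edge counts as one.
    """
    columns = -(-width // MACROBLOCK_SIZE)  # ceiling division
    rows = -(-height // MACROBLOCK_SIZE)
    return columns, rows, columns * rows


print(macroblock_grid(720, 480))    # (45, 30, 1350)
print(macroblock_grid(1920, 1080))  # (120, 68, 8160)
```

In the second call, 1080 is not a multiple of 16, so the sketch rounds the 67.5 rows up to 68.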
In an MPEG-encoded video stream, pictures occur in “decode order” (the order in which those pictures will be decoded) rather than in “presentation order” (the order in which the content that those pictures represent will be presented). A particular picture cannot be completely decoded until all of the other pictures to which the particular picture's blocks refer have been decoded. Therefore, at encoding time, the particular picture is placed later in the decode-ordered MPEG-encoded video stream than those other pictures. As a result, at the time that the particular picture is decoded, the other pictures to which the particular picture's blocks refer will already have been decoded.
I-pictures and P-pictures are called “reference pictures” because the blocks of other pictures can refer to them. According to some encoding standards, B-pictures are not reference pictures because the blocks of other pictures do not refer to B-pictures under those standards. Blocks in a P-picture may refer back to a preceding (referring to presentation order) reference picture in the video stream. Blocks in a B-picture may refer to a pair of other pictures in the video stream. Such a pair includes a preceding (referring to presentation order) reference picture in the video stream and a following (referring to presentation order) reference picture in the video stream. Blocks in an I-picture do not refer to any other pictures in a video stream.
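The difference between presentation order and decode order can be illustrated with a short Python sketch. The function name `to_decode_order` and the picture labels (a type letter followed by a presentation index, such as `"B1"`) are hypothetical. The sketch assumes the simple case in which each B-picture refers to the nearest preceding and the nearest following reference picture, and it is not intended to describe how any particular encoder orders its output.

```python
def to_decode_order(presentation):
    """Reorder pictures so that every reference picture precedes its dependents.

    Each picture is a label such as "I0", "B1", or "P3": a type letter
    followed by the picture's position in presentation order.
    """
    decode_order = []
    pending_b = []  # B-pictures waiting for their following reference picture
    for picture in presentation:
        if picture[0] == "B":
            pending_b.append(picture)
        else:
            # An I-picture or P-picture is emitted first, followed by the
            # B-pictures that precede it in presentation order.
            decode_order.append(picture)
            decode_order.extend(pending_b)
            pending_b = []
    # Trailing B-pictures with no following reference are kept last in this sketch.
    decode_order.extend(pending_b)
    return decode_order


presentation = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
decode = to_decode_order(presentation)
print(decode)  # ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']

# Sorting by the numeric suffix recovers presentation order.
print(sorted(decode, key=lambda p: int(p[1:])) == presentation)  # True
```

In the resulting decode order, `P3` precedes `B1` and `B2` even though it is presented after them, because blocks of `B1` and `B2` may refer to both `I0` and `P3`.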
The MPEG-2 standard imposes restrictions on which other pictures the blocks of a particular picture can refer to. The MPEG-2 standard requires that the picture to which a P-picture's blocks refer be the same picture for all of the P-picture's blocks that refer to another picture; according to the MPEG-2 standard, different blocks of the same P-picture are not permitted to refer to different pictures in the video stream. Similarly, the MPEG-2 standard requires that the pair of pictures to which a B-picture's blocks refer be the same pair of pictures for all of the B-picture's blocks that refer to a pair of pictures; according to the MPEG-2 standard, different blocks of the same B-picture are not permitted to refer to different pairs of pictures in the video stream. The VC-1 encoding standard also imposes the foregoing restrictions. In contrast, the MPEG-4 standard is not similarly restricted; different blocks of a given picture in an MPEG-4 encoded video stream may refer to different pictures (in the case of P-pictures) or different pairs of pictures (in the case of B-pictures) in the video stream.
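The foregoing restriction can be expressed as a simple check, sketched below in Python. The function name `obeys_single_reference_rule` and the representation of block references are hypothetical: each block is represented by `None` if the block refers to no other picture, by a picture identifier if the block belongs to a P-picture, or by a (preceding, following) tuple if the block belongs to a B-picture.

```python
def obeys_single_reference_rule(block_references):
    """Return True if all referring blocks name the same picture (or pair).

    Blocks that refer to no other picture (None) are ignored, since the
    restriction applies only to blocks that do refer to another picture.
    """
    distinct_targets = {ref for ref in block_references if ref is not None}
    return len(distinct_targets) <= 1


# P-picture whose referring blocks all point at picture 3: allowed under the restriction.
print(obeys_single_reference_rule([3, 3, None, 3]))    # True

# P-picture whose blocks point at pictures 3 and 0: not allowed under the restriction.
print(obeys_single_reference_rule([3, 0, 3]))          # False

# B-picture whose blocks use two different pairs: not allowed under the restriction.
print(obeys_single_reference_rule([(0, 3), (0, 6)]))   # False
```

Under this sketch, a picture encoded under the MPEG-2 or VC-1 restrictions described above would always pass the check, while a picture in an MPEG-4 encoded video stream might not.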
Additionally, the MPEG-2 standard specifies that only the two most recently decoded frames of reference pictures be retained in a frame cache so that blocks of other pictures can refer to those decoded frames. Whenever a new frame of a reference picture is encountered in an MPEG-2 encoded video stream, if there are already two decoded frames in the frame cache, then one of the decoded frames is evicted from the frame cache to make room for the new frame. This limits the set of other frames to which blocks in an MPEG-2 encoded stream can refer. The VC-1 encoding standard has the same limitation. In contrast, under the MPEG-4 standard, 16 decoded frames of reference pictures (or 32 decoded fields of reference pictures) may be retained in a frame cache so that blocks of other pictures can refer to those decoded frames. Thus, the set of other frames to which blocks can refer is much less limited under the MPEG-4 standard.
Additionally, under the MPEG-2 standard, whenever a frame needs to be evicted from the frame cache as discussed above, the least recently decoded frame is selected for eviction. In contrast, under the MPEG-4 standard, whenever a frame needs to be evicted from the frame cache, any specified one of the frames in the frame cache may be selected for eviction, regardless of how recently the specified frame was decoded.
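The two frame cache behaviors described above can be contrasted with the following Python sketch. The class name `ReferenceFrameCache`, its methods, and the `evict_id` parameter are hypothetical and serve only to illustrate the difference in capacity and eviction policy; they do not reproduce the buffer management syntax of any standard.

```python
from collections import OrderedDict


class ReferenceFrameCache:
    """Holds decoded reference frames, oldest first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._frames = OrderedDict()  # frame id -> decoded frame, in decode order

    def add(self, frame_id, decoded_frame, evict_id=None):
        """Store a decoded frame; return the id of any evicted frame, else None.

        If the cache is full and evict_id is None, the least recently decoded
        frame is evicted. If evict_id is given, that frame is evicted instead,
        regardless of how recently it was decoded.
        """
        evicted = None
        if len(self._frames) >= self.capacity:
            if evict_id is None:
                evicted, _ = self._frames.popitem(last=False)
            else:
                del self._frames[evict_id]
                evicted = evict_id
        self._frames[frame_id] = decoded_frame
        return evicted

    def ids(self):
        return list(self._frames)


# Two-frame cache with least-recently-decoded eviction.
small = ReferenceFrameCache(capacity=2)
small.add("I0", b"...")
small.add("P3", b"...")
print(small.add("P6", b"..."))  # I0
print(small.ids())              # ['P3', 'P6']

# Larger cache in which the stream names the frame to evict.
large = ReferenceFrameCache(capacity=3)
for frame_id in ("I0", "P3", "P6"):
    large.add(frame_id, b"...")
print(large.add("P9", b"...", evict_id="P6"))  # P6
print(large.ids())                             # ['I0', 'P3', 'P9']
```

A capacity of 3 is used in the second case only to keep the example short; under the description above, the capacity would be 16 frames. In the second case, the oldest frame `I0` remains available for reference even after more recently decoded frames have been evicted, which is not possible in the first case.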
Digital video recorder (DVR) functions include playback, random access, and “trick play” of content. Trick play functions include display pause, fast-forward, and rewind performed at various frame rates or display speeds. Despite the differences between MPEG-2 and other more advanced standards (e.g., VC-1 (SMPTE-421M) and AVC (MPEG-4 Part 10, or H.264)), commercially available DVRs often handle trick play functionality as though those DVRs had to operate under at least some of the constraints of the older MPEG-2 standard. As a result, commercially available DVRs provide their users with a trick play experience that is relatively unsophisticated and crude. Conventional approaches for performing trick play functions in a DVR typically either use a large amount of resources (including processor resources, memory, and/or disk space) or provide a poor viewing experience, characterized by imprecise repositioning inside the stream, a low number of frames per second, etc. There is a need for an approach to provide trick play functions in a DVR, with an advanced codec or a conventional codec, in a way that consumes a limited amount of extra resources beyond those required for regular playback, while simultaneously providing a high quality viewer experience.