Set-top boxes receive media information from a source (such as a head-end distribution site), process the media information, and present the processed media information on an output device (such as a conventional television unit). In addition to rendering a conventional stream of media information (e.g., from a conventional broadcast), some set-top boxes also include provisions for presenting static images. For example, some set-top boxes can receive a static image on an in-band or out-of-band channel and store the image in memory (e.g., RAM). A decoder can then retrieve the image from memory and decode it for presentation on the television unit.
A better understanding of image processing in set-top box environments can be gained through the following introductory information regarding exemplary image coding standards. Consider, for example, the MPEG-2 standard (where MPEG is an acronym for “Moving Picture Experts Group”), as fully described in the international standard ISO/IEC 13818-2. FIG. 1 presents salient features of the MPEG-2 standard. As shown there, the MPEG-2 standard defines image content in a hierarchy of units. The most encompassing such unit comprises a video sequence, demarcated by a sequence header and an end-of-sequence code. A sequence can include one or more groups of pictures (GOPs). FIG. 1 shows one exemplary GOP 102. The GOP 102, in turn, can comprise one or more pictures. FIG. 1 shows a series of pictures, including exemplary picture 104. The picture 104, in turn, is composed of a plurality of slices. FIG. 1 shows an exemplary slice 106 within picture 104. A slice can span the entire length of the picture 104, or the entire length of a picture can comprise multiple slices in series. The use of slices accommodates efficient error processing. Namely, if a decoder encounters a slice containing an error, the decoder can limit the effects of the error to that slice. (However, each slice has a header, which adds overhead to the encoded image, so there is a tradeoff between increasing the number of slices to ensure satisfactory error-processing performance and decreasing the number of slices so as not to unduly burden the decoder with too much overhead data.) Finally, the slice 106 can comprise multiple macroblocks. FIG. 1 shows one exemplary macroblock 108. The macroblock 108 includes a 2×2 array of 8×8 image blocks. The image content in the pictures is processed using a combination of discrete cosine transform (DCT) processing, quantization, and run-length encoding.
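To make the picture, slice, and macroblock relationship concrete, the following minimal sketch counts the macroblocks that tile a picture and estimates the overhead contributed by slice headers. It is illustrative only. The function names, the 720×480 picture size, and the per-slice header size are assumptions chosen for the example and are not taken from the description above. The 16×16 macroblock footprint follows from the 2×2 array of 8×8 blocks.

```python
# Illustrative sketch only; names and example values are hypothetical.

BLOCK_SIZE = 8                 # an image block is 8x8 samples
MB_SIZE = 2 * BLOCK_SIZE       # a macroblock is a 2x2 array of blocks: 16x16


def macroblock_grid(width: int, height: int) -> tuple[int, int]:
    """Return (columns, rows) of macroblocks needed to cover the picture.

    Dimensions that are not multiples of 16 are rounded up, because a
    partially covered macroblock still has to be coded.
    """
    cols = (width + MB_SIZE - 1) // MB_SIZE
    rows = (height + MB_SIZE - 1) // MB_SIZE
    return cols, rows


def slice_header_overhead_bits(num_slices: int, header_bits: int) -> int:
    """Total bits spent on slice headers for one picture.

    header_bits is an assumed per-slice header size supplied by the caller.
    """
    return num_slices * header_bits


if __name__ == "__main__":
    cols, rows = macroblock_grid(720, 480)        # example picture size
    print(cols, rows, cols * rows)
    # One slice per macroblock row versus two slices per row, with an
    # assumed 38-bit header per slice.
    print(slice_header_overhead_bits(rows, 38))
    print(slice_header_overhead_bits(2 * rows, 38))
```

Doubling the number of slices per row doubles the header overhead while halving the region affected by a single corrupted slice, which is the tradeoff noted above.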
The MPEG-2 standard includes different types of pictures, including an intra (I) picture, a predictive (P) picture, and a bi-directional (B) picture. I pictures correspond to image content with sufficient information to be decoded without reference to other neighboring pictures. P pictures contain information which allows a decoder to decode their content with reference to a previous picture in a stream of pictures. And B pictures contain information which allows a decoder to decode their content with reference to a previous picture or a subsequent picture. Thus, the B and P pictures generally comprise difference pictures, because they encode their content with reference to other pictures, by expressing how these B and P pictures differ from other pictures. FIG. 1 shows a stream of pictures in the GOP 102, including an exemplary I picture 110, an exemplary B picture 112, and an exemplary P picture 114.
Consider the exemplary case of the P picture 114. This picture can be coded with reference to a previous picture, such as the previous I picture 110 (constituting a reference picture). Namely, for a given macroblock under consideration, the encoder will attempt to determine whether the image information in this macroblock is related to counterpart image information in the reference picture. If so, the encoder can encode this macroblock by determining a motion vector (MV) which describes how the image information in the macroblock has moved relative to its counterpart information in the reference picture. The encoder can also encode the macroblock by determining difference information which describes how the image information in the macroblock has changed (in content) relative to its counterpart information in the reference picture. If the image information has not changed relative to the reference picture, the encoder can encode the macroblock using a zero motion vector and zero difference information. Alternatively, in some cases, the encoder can determine that a large portion (such as a slice) of the picture is unchanged. More specifically, if a slice contains two or more zero-motion-vector macroblocks, the encoder can encode the slice as containing “skipped macroblocks.” However, if the encoder cannot trace the image information in the macroblock under consideration to counterpart image information in the reference picture, then it will encode that macroblock as an intra (I) macroblock.
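The encoder-side decision described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of any particular encoder: it uses an exhaustive whole-pixel search with a sum-of-absolute-differences (SAD) cost, and the names `choose_mode`, `search`, and `intra_threshold` are hypothetical. Pictures are modeled as lists of rows of integer sample values, and the motion vector is taken to point from the current macroblock position to the matching position in the reference picture.

```python
# Simplified, hypothetical sketch of macroblock mode selection for a P picture.

def fetch(picture, x, y, size):
    """Copy a size x size region whose top-left corner is (x, y)."""
    return [row[x:x + size] for row in picture[y:y + size]]


def sad(a, b):
    """Sum of absolute differences between two equally sized regions."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def choose_mode(cur, ref, x, y, size=16, search=2, intra_threshold=1024):
    """Classify the macroblock at (x, y) of `cur` against reference `ref`.

    Returns (mode, mv, residual) where mode is one of:
      "unchanged" - zero motion vector and zero difference information
      "inter"     - motion vector plus difference information
      "intra"     - no acceptable counterpart found in the reference
    """
    target = fetch(cur, x, y, size)
    height, width = len(ref), len(ref[0])
    best_key, best_mv = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = x + dx, y + dy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate region falls outside the reference
            cost = sad(target, fetch(ref, rx, ry, size))
            key = (cost, abs(dx) + abs(dy))  # prefer shorter vectors on ties
            if best_key is None or key < best_key:
                best_key, best_mv = key, (dx, dy)
    cost = best_key[0]
    if cost > intra_threshold:
        return "intra", None, None
    if best_mv == (0, 0) and cost == 0:
        return "unchanged", (0, 0), None
    pred = fetch(ref, x + best_mv[0], y + best_mv[1], size)
    residual = [[t - p for t, p in zip(trow, prow)]
                for trow, prow in zip(target, pred)]
    return "inter", best_mv, residual
```

A run of macroblocks classified as "unchanged" corresponds to the region that the encoder can signal as skipped macroblocks rather than coding each one explicitly.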
Upon receipt of the encoded content, the decoder will decode a P picture in a manner depending on how its constituent macroblocks have been encoded. For example, the decoder will decode a macroblock with non-zero MV and non-zero difference information by modifying the image content in a previous picture. The decoder will decode a macroblock containing zero MV and zero difference information by simply repeating the image content taken from a previous picture. In a similar manner, the decoder will decode a macroblock within a skipped macroblock region by simply repeating the image content taken from a previous picture. Finally, the decoder will decode an intra macroblock using only image information contained in the macroblock itself, that is, without reference to any previous picture.
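The decoding cases just described can be illustrated with the following minimal sketch. It is an assumption-laden simplification rather than a conforming MPEG-2 decoder: the `Macroblock` record, its `mode` strings, and `decode_macroblock` are hypothetical names, motion vectors are whole-pixel offsets into the previous picture, and the difference information is assumed to have already been recovered as per-sample values (that is, after inverse quantization and inverse DCT).

```python
# Hypothetical sketch of per-macroblock reconstruction in a P picture.
from dataclasses import dataclass
from typing import List, Optional, Tuple

Region = List[List[int]]  # rows of sample values in the range 0..255


@dataclass
class Macroblock:
    mode: str                          # "inter", "skipped", or "intra"
    mv: Tuple[int, int] = (0, 0)       # (dx, dy) offset into the previous picture
    residual: Optional[Region] = None  # difference information; None means zero
    pixels: Optional[Region] = None    # self-contained content of an intra macroblock


def fetch(picture: Region, x: int, y: int, size: int) -> Region:
    """Copy a size x size region whose top-left corner is (x, y)."""
    return [row[x:x + size] for row in picture[y:y + size]]


def decode_macroblock(mb: Macroblock, prev: Region, x: int, y: int,
                      size: int = 16) -> Region:
    """Reconstruct the macroblock located at (x, y) of the current picture."""
    if mb.mode == "intra":
        # Uses only information carried by the macroblock itself.
        return [row[:] for row in mb.pixels]
    if mb.mode == "skipped":
        # Repeats the co-located content of the previous picture.
        return fetch(prev, x, y, size)
    # Inter macroblock: motion-compensated prediction plus difference.
    dx, dy = mb.mv
    pred = fetch(prev, x + dx, y + dy, size)
    if mb.residual is None:
        return pred  # zero difference: the prediction is the result
    return [[max(0, min(255, p + r)) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred, mb.residual)]
```

With a zero motion vector and no difference information, the inter branch returns the same result as the skipped branch, which mirrors the equivalence between the second and third decoding cases above.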
Against this background, a set-top box can receive MPEG pictures (or pictures coded according to some other standard) and present the pictures in conjunction with an application running on the set-top box. More specifically, an application can rely on the pictures as image resources, where different states in the execution of the application can call on different image resources. The set-top box functionality devoted to presenting video components is referred to as the video layer. The set-top box can also “overlay” graphics information (such as a cursor, various controls, etc.) on top of the video information. The set-top box functionality devoted to presenting the graphics information is referred to as the graphics layer, also known as the on-screen display (OSD) layer.
As one design challenge, however, bandwidth is typically a scarce commodity in set-top box environments. For example, an application which references a series of application pages containing different combinations of video components can simply request a new I picture when the user invokes each new application page. However, as appreciated by the present inventors, over-use of I pictures to transmit image information can overwhelm the limited bandwidth resources of the set-top box. This approach can also overwhelm the memory resources of the set-top box (and/or possibly other resources of the set-top box).
As another design challenge, many set-top boxes provide various constraints regarding the kinds of image content that can be received and processed. For example, some kinds of set-top boxes only accept I pictures and certain types of difference pictures. Thus, an effort to reduce bandwidth must also work within the technical constraints of specific set-top box environments.
There therefore exists a general need in the art to present image and video content in different set-top box environments in a resource-efficient manner.