Digital video technology is used in a growing number of applications such as cable television, direct broadcast satellite or other direct-to-home satellite services, terrestrial digital television services including high-definition television, and the like. Digital representations of video signals often require a very large number of bits. As such, a number of systems and methods are currently being developed to accommodate transmission and storage of still images and video sequences using various types of compression technology implemented in both hardware and software.
The availability of economically feasible and increasingly more powerful microprocessors allows integration of natural and synthetic audio and video sequences. Information in the form of audio and video sequences may be integrated to present real-time and non-real-time information in a single sequence. Providing audio and video sequences of acceptable quality at minimum cost requires the greatest possible efficiency in the decoding mechanism, so that the least amount of memory and processing resources is needed.
Decoding efficiency can be expressed as the ratio of resources used to generate a frame to total resources in use. For memory, this is the amount of storage holding data for the displayed portions of sprites in proportion to the total storage required to hold all sprite data. For CPUs, this is the number of machine cycles used to transform the data for the displayed portions of sprites in proportion to the total number of cycles used to transform all sprite data.
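These ratios can be computed directly. The sketch below uses hypothetical byte and cycle counts (not drawn from any particular codec) purely to illustrate the definition:

```python
def decoding_efficiency(used, total):
    """Ratio of resources spent on displayed sprite portions to total resources."""
    return used / total

# Hypothetical memory figures: 1,048,576 bytes hold all sprite data,
# of which 262,144 bytes hold the portions actually displayed.
memory_efficiency = decoding_efficiency(262_144, 1_048_576)

# Hypothetical CPU figures: 5,000,000 cycles transform all sprite data,
# of which 1,250,000 cycles transform the displayed portions.
cpu_efficiency = decoding_efficiency(1_250_000, 5_000_000)

print(memory_efficiency, cpu_efficiency)  # 0.25 0.25
```

An efficiency of 1.0 would mean the decoder stores and transforms only what is displayed; lower values indicate resources spent on sprite data that never reaches the screen.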
An audio/visual (AV) object may be used to represent a physical (real) or virtual article or scene. AV objects may be defined in terms of other AV objects which are referred to as sub-objects. An AV object which is not a composite or a compound AV object is referred to as a primitive. A sprite or basis object is an AV object created within a block of pixels that can be manipulated as a unit using geometrical transformations. Rather than re-transmitting and re-displaying the sprite object, new transformation parameters are provided to generate subsequent video frames. This results in a significant reduction in the amount of data necessary to represent such frames.
A small sprite object may represent a character in a video game whereas a large sprite object may represent an image which is larger than an individual frame and may span a number of frames. For example, a still image of a video layer of a scene, such as the background of a room, may be represented by a large sprite object (basis object). A particular video sequence in which a camera pans across the room would have a number of frames to depict motion of the camera. Rather than transmitting a still image for each frame, only the transformation parameters are required to manipulate a portion of the sprite object which is reused multiple times as the video frames are generated.
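A pan of this kind can be sketched as storing one large background sprite and generating each frame from per-frame transformation parameters alone. The sprite contents, dimensions, and motion values below are invented for illustration, and the transformation is simplified to a pure translation (a window crop):

```python
# A large background sprite is stored once; each frame is produced by
# applying per-frame parameters to it rather than retransmitting an image.
# All dimensions and motion values are hypothetical.

SPRITE_W, SPRITE_H = 1024, 240   # sprite wider than any single frame
FRAME_W, FRAME_H = 320, 240      # output frame size

# One (dx, dy) translation per frame stands in for the transformation
# parameters; here the camera pans right in steps of 64 samples.
pan_parameters = [(dx, 0) for dx in range(0, 704, 64)]

def render_frame(sprite, dx, dy):
    """Extract the visible window of the sprite for one frame."""
    return [row[dx:dx + FRAME_W] for row in sprite[dy:dy + FRAME_H]]

# Synthetic sprite: sample value is a simple function of position.
sprite = [[(x + y) % 256 for x in range(SPRITE_W)] for y in range(SPRITE_H)]
frames = [render_frame(sprite, dx, dy) for dx, dy in pan_parameters]
print(len(frames), len(frames[0]), len(frames[0][0]))  # 11 240 320
```

Only the eleven (dx, dy) pairs need to be transmitted for these frames; the sprite itself is sent once and reused.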
Transmission of a sprite image requires either that the entire sprite be encoded and transmitted prior to its use in the video sequence, or that the sprite be transmitted piece by piece as additional portions of the image are required for display. The image at the decoder is then transformed to its correct representation at each instance of time prior to its display. The larger the sprite image, the larger the required decoder memory and the greater the CPU time necessary to transform the image to its correct representative view at each time instance (frame).
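The piece-by-piece alternative can be sketched as a decoder that accumulates sprite regions as they arrive; its memory grows monotonically because nothing is ever discarded. The data structures and sample values below are hypothetical:

```python
class SpriteDecoder:
    """Toy decoder that accumulates sprite pieces as they arrive.

    Each piece is a (row, start_col, samples) run of sprite data.
    Storage only grows: once received, a sample is never released.
    """

    def __init__(self):
        self.store = {}  # (row, col) -> sample value

    def receive_piece(self, row, start_col, samples):
        for i, value in enumerate(samples):
            self.store[(row, start_col + i)] = value

    def memory_in_use(self):
        return len(self.store)

dec = SpriteDecoder()
dec.receive_piece(0, 0, [10, 11, 12])  # portion needed for early frames
dec.receive_piece(0, 3, [13, 14])      # further portion arrives on demand
print(dec.memory_in_use())  # 5
```

Even with on-demand delivery, the decoder's resident storage at any time equals everything received so far, which motivates the memory concern discussed next.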
Prior art implementations do not specify a mechanism for signaling the decoder that portions of the sprite, which may have been necessary at some point in the video sequence, are no longer needed. The entire sprite is held in decoder memory until the entire sprite is no longer needed. This leads to far larger decoder memory and computational requirements than necessary for many video sequences utilizing sprite technology.
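The cost of this behavior can be illustrated with a toy model of the panning sequence: each frame displays only a 320-column window of a wider sprite, but because nothing signals the decoder to discard columns behind the pan, every column ever received stays resident. All numbers are hypothetical:

```python
# Toy model of decoder memory over a camera pan.
SPRITE_COLS, WINDOW = 1024, 320
pan_positions = range(0, 704, 64)  # hypothetical per-frame offsets

# Prior-art behavior: every sprite column ever received stays resident.
resident = set()
for dx in pan_positions:
    resident.update(range(dx, dx + WINDOW))
prior_art_peak = len(resident)

# If columns no longer needed could be released, only the current
# display window would need to remain resident at any instant.
with_release_peak = WINDOW

print(prior_art_peak, with_release_peak)  # 960 320
```

In this model the prior-art decoder ends up holding three times as much sprite data as is ever displayed at once, which is the inefficiency the passage above describes.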