Conventionally, the creation and animation of computer-generated graphic images employ special purpose hardware and software. The display of a realistic-looking changing scene, such as a scene of walking figures, requires the computation of changing lighting and visibility factors as the figures pass through shadows, or pass in front of and behind each other. The textured surfaces forming the image, such as the figures' clothing, must be rendered in a realistic fashion, and these renderings are dependent upon the changing lighting and visibility factors. Algorithms and techniques have been developed to optimize the creation and processing of graphics images, and special purpose hardware and software devices and systems have been developed to efficiently and effectively utilize these algorithms and techniques.
The algorithms and techniques used for graphics image processing are, in general, picture-element (pixel) based. A composite pixel value, such as an RGB composite, is associated with each pixel forming the image. A pixel's composite value includes sufficient information to display the pixel without regard to other parameters, or other pixel values. An RGB composite, for example, contains an indication of the amount of Red, Green, and Blue color components that are to be contained in the displayed pixel. In some systems, a fourth parameter, A, is included in an ARGB composite, where A contains an indication of the translucency of the colors forming the pixel.
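As an illustration of a composite pixel value, the following minimal sketch packs the A, R, G, and B components into a single integer and recovers them again. The 8-bit component width and the placement of A in the most significant byte are assumptions made for the example; neither is specified above, and the function names are hypothetical.

```python
def pack_argb(a, r, g, b):
    """Pack four components into one 32-bit ARGB composite value.

    Assumption: each component is 8 bits wide and A occupies the
    most significant byte. Other layouts are equally possible.
    """
    for component in (a, r, g, b):
        if not 0 <= component <= 255:
            raise ValueError("each component must fit in 8 bits")
    return (a << 24) | (r << 16) | (g << 8) | b


def unpack_argb(value):
    """Recover the (A, R, G, B) components from a packed composite."""
    return (
        (value >> 24) & 0xFF,
        (value >> 16) & 0xFF,
        (value >> 8) & 0xFF,
        value & 0xFF,
    )
```

The composite carries everything needed to display the pixel, so `unpack_argb(pack_argb(a, r, g, b))` returns the original four components without reference to any neighboring pixel.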
Video image processing, for example, the display of a motion picture recorded on a video disc, is, in general, block based, using discrete blocks of data to represent each component's values (such as the luminance component, or the chrominance component) within a frame or portion of a frame. Each frame of an image is encoded in a format optimized for transmission or storage, such as an MPEG (Moving Picture Experts Group) stream. These frames can be further split into fields by separating even and odd lines. In the discussion that follows, the term frame refers to either a full frame or a field. Sequential scenes in a motion picture are efficiently encoded by encoding a reference frame, and then merely encoding the differences from one frame to the next. Additional efficiencies are gained by encoding the differences from one frame to the next as a set of movements of macroblocks of image information in the reference frame. That is, each frame is partitioned into macroblocks, and each subsequent frame is encoded as a relocation of an arbitrary group of pixels that is the size of a macroblock, with interpolation to allow for relocation at a sub-pixel resolution. The relocated macroblock is termed a predicted macroblock. The change of a macroblock's location from the reference frame to the subsequent frame is termed a motion vector, because the change is typically caused by the motion of an object in the changing scenes. Further efficiencies can be provided by encoding a reference frame and a future, or predictor, frame, and encoding intermediate frames as movements of macroblocks from either the reference frame or the predictor frame, or a combination (average) of these two frames, or from within the same frame. For ease of understanding, the terms reference frame and reference macroblock are used herein to mean any frame or macroblock to which motion vectors are applied to create other frames or macroblocks.
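The relocation of a macroblock-sized group of pixels with sub-pixel interpolation can be sketched as follows. This is an illustrative example only: it assumes a single component plane stored as a list of rows, motion vectors expressed in half-sample units, and rounded averaging of neighboring samples for the half-sample positions. The function name and parameters are hypothetical and are not taken from any particular decoder.

```python
def predict_block(ref, x, y, mv_x2, mv_y2, size=16):
    """Form a predicted block from one plane of a reference frame.

    ref          : 2D list of integer samples, indexed ref[row][col]
    x, y         : top-left corner of the block being predicted
    mv_x2, mv_y2 : motion vector in half-sample units (assumption)
    size         : block width and height in samples
    """
    # Source position in half-sample units, split into a whole-sample
    # offset and a half-sample flag (0 or 1) for each axis.
    src_x2 = 2 * x + mv_x2
    src_y2 = 2 * y + mv_y2
    ix, fx = src_x2 >> 1, src_x2 & 1
    iy, fy = src_y2 >> 1, src_y2 & 1

    # This sketch does no edge padding, so reject out-of-frame sources.
    if (ix < 0 or iy < 0
            or iy + size - 1 + fy >= len(ref)
            or ix + size - 1 + fx >= len(ref[0])):
        raise ValueError("motion vector points outside the reference plane")

    block = []
    for row in range(size):
        out = []
        for col in range(size):
            r0, c0 = iy + row, ix + col
            # Average the (up to four) neighboring samples with rounding.
            # When fx and fy are both 0 this reduces to ref[r0][c0].
            total = (ref[r0][c0] + ref[r0][c0 + fx]
                     + ref[r0 + fy][c0] + ref[r0 + fy][c0 + fx])
            out.append((total + 2) >> 2)
        block.append(out)
    return block
```

A bidirectional prediction, as described for intermediate frames, could be obtained by calling this once against each of the two frames and averaging the two resulting blocks sample by sample.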
In addition to the motion vectors, each subsequent frame includes a set of error terms that describe the difference between the predicted macroblock and the actual image being encoded. A number of transformations are applied to the error terms to minimize the time and bandwidth required to communicate the error terms. These transformations are well known to one of ordinary skill in the art and are not presented herein. The encoding of MPEG frames via these transformations is not necessarily lossless, and therefore the received error terms are an approximation. The reconstruction of a frame image by applying motion vectors and error terms to reference frames is termed motion compensation. It is estimated that motion compensation accounts for more than 30% of the processing of MPEG streams.
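The final step of motion compensation, applying the error terms to a predicted macroblock, can be sketched as a sample-by-sample addition. The example assumes that the error terms have already been recovered from their transformed representation, and that samples are 8-bit values saturated to the range 0 to 255; both are assumptions made for illustration, and the function name is hypothetical.

```python
def reconstruct_block(predicted, errors, max_value=255):
    """Add decoded error terms to a predicted block, sample by sample.

    predicted : 2D list of predicted samples
    errors    : 2D list of signed error terms, same shape as predicted
    max_value : largest representable sample (assumption: 8-bit samples)
    """
    return [
        # Saturate so that approximate error terms cannot push a
        # sample outside the displayable range.
        [min(max(p + e, 0), max_value) for p, e in zip(p_row, e_row)]
        for p_row, e_row in zip(predicted, errors)
    ]
```

Because the received error terms are only an approximation, the reconstructed block is likewise an approximation of the originally encoded image.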
Video images are typically encoded using luminance (Y) information (the brightness), and chrominance (U,V) information (the redness and blueness). The human visual system is more sensitive to a change of brightness than to a change of color. Therefore, the encoding of video images includes the encoding of luminance changes at a higher rate or higher resolution than chrominance changes. The luminance information and chrominance information are each encoded separately and distinctly, to allow for this difference in rate or resolution, and also to optimize the motion vector and error terms encoding, because the luminance of a scene may change without a corresponding change in color, and vice versa. The three components Y, U, and V are treated as separate image planes and encoded separately. To account for the different spatial resolution of the pixel components, each macroblock encoding contains a different number of samples for each component as defined by the video standard being used. Each of these separate image planes is partitioned into macroblocks. The MPEG stream contains macroblock encodings of each of the three separate components Y, U, and V, rather than a composite of the Y, U, and V components forming each pixel or macroblock. For the group of components Y, U, and V, there is one set of motion vectors describing a common source of prediction.
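As a worked example of the differing sample counts, the sketch below computes the number of Y, U, and V samples in one macroblock for three common chroma formats. The 16-by-16 luminance macroblock and the format names are assumptions chosen for illustration; the actual values depend on the video standard being used, and the function name is hypothetical.

```python
def macroblock_samples(chroma_format="4:2:0", luma_size=16):
    """Return the sample count per component plane for one macroblock.

    Assumption: the macroblock covers luma_size x luma_size luminance
    samples, and chrominance is subsampled by the (horizontal, vertical)
    factors listed below.
    """
    subsampling = {
        "4:2:0": (2, 2),  # chrominance halved in both directions
        "4:2:2": (2, 1),  # chrominance halved horizontally only
        "4:4:4": (1, 1),  # chrominance at full resolution
    }
    sx, sy = subsampling[chroma_format]
    luma = luma_size * luma_size
    chroma = (luma_size // sx) * (luma_size // sy)
    return {"Y": luma, "U": chroma, "V": chroma}
```

Under these assumptions a 4:2:0 macroblock carries 256 luminance samples but only 64 samples in each chrominance plane, which is why the three planes are stored and processed as separate blocks rather than as per-pixel composites.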
Increasingly, the same video processing system is being called upon to perform both video and graphics processing. Computers of today are expected to be able to display motion pictures; and televisions of tomorrow will be expected to provide animated graphic imaging. Because the encoding of video images utilizes discrete blocks of luminance and chrominance component data, and the encoding of graphic images utilizes pixel-based composite values, systems that support both video and graphics image processing conventionally employ separate processing techniques and devices for each. The need for separate processing also incurs additional secondary requirements for memory and interface devices coupled to the separate processing devices. As the demands for image processing increase, and the functionality expected of a video processing system increases, each of these separate processing techniques and devices can be expected to become increasingly complex and expensive.
Therefore, a need exists for integrating the processing of both video and graphics data, and in particular the integration of video motion compensation and graphics processing, to optimize the use of available resources, and to minimize the cost of video processing systems.