1. Field of the Invention
The present invention relates to computer graphics technology.
2. Related Art
Among the many functions that can be performed on personal and workstation computers, the rendering of images has become one of the most highly valued applications. The ever advancing demand for increasingly sophisticated image rendering capabilities has pulled the development of both hardware and software technologies towards meeting this end. Indeed, computer graphics applications have facilitated the introduction of multiprocessors into the designs of personal and workstation computers. Today, many personal and workstation computers include, in addition to a central processing unit, one or more “graphics controllers” dedicated to processing graphics data and rendering images.
To increase rendering speed, computer graphics processes have been decomposed into standard functions performed in sequential stages of a “graphics pipeline”. At least one “graphics processing unit” (“GPU”) operates on each stage. As each stage completes its specific function, it passes its results along to the next stage in the graphics pipeline and, meanwhile, receives the output of the prior stage, which relates to the next frame in the sequence. In this manner, the rendering speed of the overall process is increased until it is limited only by the processing speed of the slowest stage. Stages can be implemented using hardware, software, or a combination thereof.
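The effect of pipelining on rendering speed can be illustrated with a minimal sketch. The function name and the stage times below are hypothetical, and the model assumes that every frame takes the same time in a given stage and that a stage begins the next frame as soon as it is free.

```python
def pipeline_times(stage_times, n_frames):
    """Return (sequential_time, pipelined_time) for rendering n_frames frames.

    stage_times: time each stage needs for one frame (hypothetical values).
    """
    # Without pipelining, a frame passes through every stage before the
    # next frame is started.
    sequential = n_frames * sum(stage_times)
    # With pipelining, the first frame fills the pipeline; after that, one
    # frame is completed per period of the slowest stage.
    pipelined = sum(stage_times) + (n_frames - 1) * max(stage_times)
    return sequential, pipelined


# Three illustrative stages; the middle one is the slowest.
sequential, pipelined = pipeline_times([2.0, 5.0, 3.0], n_frames=100)
print(sequential, pipelined)
```

For long sequences the pipelined time per frame approaches the time of the slowest stage, which is why that stage sets the overall rendering speed.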
A computer graphics pipeline typically includes, in sequential order, a geometry stage and a rasterizer stage. An application passes graphics data to a computer graphics pipeline. For example, an application may determine the image to be rendered and model the three-dimensional curvilinear form of each object in the image as a three-dimensional assembly of interconnected two-dimensional polygons (called “primitives”) that approximates the shape of the object. Each polygon is defined by a set of coordinates and an outwardly pointing vector normal to the plane of the polygon.
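As an illustration of how a primitive can carry both its coordinates and its normal vector, the following sketch computes the unit normal of a triangle from its vertices. The function name is hypothetical, and the sketch assumes that vertices are listed counter-clockwise when viewed from outside the object, so that the right-hand rule yields an outwardly pointing normal.

```python
import math


def triangle_normal(p0, p1, p2):
    """Unit normal of a triangle given three (x, y, z) vertices.

    Assumes counter-clockwise vertex order as seen from outside the object.
    """
    # Two edge vectors that share the vertex p0.
    ux, uy, uz = (p1[i] - p0[i] for i in range(3))
    vx, vy, vz = (p2[i] - p0[i] for i in range(3))
    # The cross product u x v is perpendicular to the plane of the triangle.
    nx = uy * vz - uz * vy
    ny = uz * vx - ux * vz
    nz = ux * vy - uy * vx
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        raise ValueError("degenerate triangle: vertices are collinear")
    return (nx / length, ny / length, nz / length)


# A triangle lying in the z = 0 plane.
print(triangle_normal((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
```

Reversing the vertex order reverses the direction of the normal, which is why a consistent ordering convention is needed.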
The geometry stage acts on the graphics data it receives from the application. The geometry stage often is further decomposed into more functional stages, each of which can have an associated processor to perform operations. For example, these stages can include, but are not limited to, a model and view transform stage, a light and shading stage, a projection stage, a clipping stage, a screen mapping stage, and others. The rasterizer stage uses the results of the geometry stage(s) to control the assignment of colors to pixels as the image is rendered.
As computer graphics has matured as a technology, standards have been created to coordinate paths of development, to ensure compatibility among systems, and to reduce the amount of investment capital necessary to further the state of the art. These standards allow designers a fair degree of leeway in choosing between hardware and software technologies to perform specific functions. For a given hardware architecture, much of the current effort in developing computer graphics centers on optimizing the processing power of that architecture.
The use of multiple GPUs in computer graphics hardware not only enables stages in a graphics pipeline to be processed simultaneously, but also allows for additional graphics pipelines for parallel processing. With parallel processing, graphics pipelines can be assigned to different images. Using this architecture, the different images can be combined, by “compositing”, to be presented for final viewing.
A “compositor” is a component that performs compositing and is often implemented in hardware. Within a compositor is a device capable of receiving input data and outputting all or part of the data as an image. The portion of the data presented for viewing is designated as the “display area”.
Communications to a compositor can occur through a variety of means. To facilitate the use of high performance digital displays, a “Digital Visual Interface” (“DVI”) standard has been developed to establish a protocol for communications between central processing units and peripheral graphics chips. DVI is an open industry standard designed to enable high performance digital displays while still supporting legacy analog technology. DVI uses both “Transition Minimized Differential Signaling” (“TMDS”) data links and “Inter-Integrated Circuit” (“I2C”) busses. TMDS data links use a technique that produces a transition controlled, DC balanced series of characters from an input sequence of data bytes. Bits in a long string of 1s or 0s are selectively inverted in order to keep the DC voltage level of the signal centered around a threshold that determines whether the received data bit is a 1 voltage level or a 0 voltage level. I2C busses provide two-wire communication links between integrated circuits.
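The two ideas behind this encoding, controlling transitions and balancing the DC level, can be sketched as follows. The sketch is modeled on the two-stage scheme for data bytes in the DVI specification, but the function names are hypothetical, control characters are omitted, and the details should be checked against the specification before being relied upon.

```python
def tmds_encode(byte, cnt):
    """Encode one data byte (0..255) as a 10-bit character, bit 0 first.

    cnt is the running disparity (ones minus zeros transmitted so far).
    Returns (bits, new_cnt).
    """
    d = [(byte >> i) & 1 for i in range(8)]
    # Stage 1: limit transitions by chaining XOR or XNOR across the bits.
    use_xnor = sum(d) > 4 or (sum(d) == 4 and d[0] == 0)
    q_m = [d[0]]
    for i in range(1, 8):
        x = q_m[i - 1] ^ d[i]
        q_m.append(1 - x if use_xnor else x)
    q_m.append(0 if use_xnor else 1)  # bit 8 records which operation was used
    # Stage 2: selectively invert the data bits to keep the disparity near 0.
    n1 = sum(q_m[:8])
    n0 = 8 - n1
    inverted = [1 - b for b in q_m[:8]]
    if cnt == 0 or n1 == n0:
        if q_m[8] == 0:
            out = inverted + [0, 1]
            cnt += n0 - n1
        else:
            out = q_m[:8] + [1, 0]
            cnt += n1 - n0
    elif (cnt > 0 and n1 > n0) or (cnt < 0 and n0 > n1):
        out = inverted + [q_m[8], 1]  # bit 9 = 1 marks inverted data bits
        cnt += 2 * q_m[8] + (n0 - n1)
    else:
        out = q_m[:8] + [q_m[8], 0]
        cnt += -2 * (1 - q_m[8]) + (n1 - n0)
    return out, cnt


def tmds_decode(bits):
    """Recover the data byte from a 10-bit character produced above."""
    data = [1 - b for b in bits[:8]] if bits[9] else list(bits[:8])
    d = [data[0]]
    for i in range(1, 8):
        x = data[i] ^ data[i - 1]
        d.append(x if bits[8] else 1 - x)
    return sum(bit << i for i, bit in enumerate(d))


# Encode a run of identical bytes and confirm that it decodes correctly.
cnt = 0
for value in [0x00, 0x00, 0xFF, 0xFF]:
    bits, cnt = tmds_encode(value, cnt)
    print(bits, cnt, tmds_decode(bits) == value)
```

In this sketch the running disparity is carried from one character to the next, so a long run of identical bytes alternates between inverted and non-inverted characters rather than drifting away from the threshold.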
Compositing can be accomplished through several different methods. Where frames are presented in a dynamic sequence, “temporal compositing” can be performed by using each graphics pipeline to process a succeeding frame. Alternatively, “spatial compositing” can be performed by using each graphics pipeline to render a portion of each overall frame and combining the output of each graphics pipeline spatially with respect to the location of the rendered portion within the overall frame. In temporal compositing, where the computer graphics hardware has “n” graphics pipelines, each graphics pipeline processes every nth frame in a sequence of frames. Each graphics pipeline renders all of the objects and the background in a single frame. Often the outputs of the graphics pipelines are multiplexed together further to increase the speed at which a sequence of frames is rendered.
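The assignment of frames in temporal compositing can be sketched as a round-robin schedule. The function names and the string stand-ins for rendered frames are hypothetical; the sketch shows only how every nth frame is assigned to a pipeline and how the outputs are multiplexed back into display order.

```python
def assign_frames(n_pipelines, n_frames):
    """Map each pipeline index to the frame numbers it renders."""
    # Pipeline p renders frames p, p + n, p + 2n, ...
    return {p: list(range(p, n_frames, n_pipelines)) for p in range(n_pipelines)}


def multiplex(outputs_by_pipeline, n_frames):
    """Interleave per-pipeline outputs back into frame order."""
    n = len(outputs_by_pipeline)
    # Frame f was rendered by pipeline f % n as that pipeline's (f // n)th frame.
    return [outputs_by_pipeline[f % n][f // n] for f in range(n_frames)]


schedule = assign_frames(n_pipelines=3, n_frames=8)
# Stand-in for rendering: each pipeline labels the frames it produced.
rendered = {p: [f"frame{f}" for f in frames] for p, frames in schedule.items()}
print(schedule)
print(multiplex(rendered, n_frames=8))
```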
However, for a given number of graphics pipelines, optimal temporal compositing depends on the relationship between the rendering speed of a given graphics pipeline and the rate at which image outputs can be combined. Adding features to an image to improve its quality can also increase the “complexity” of the data to be rendered and reduce the speed at which a frame is rendered by a graphics pipeline. This, in turn, can lower the rate at which image outputs are composited.
Another problem posed by a composition process in the time domain arises when the rendered images reside in an interactive environment. In an interactive environment, a user viewing a sequence of frames of images is permitted to supply a feedback signal to the system. This feedback signal can change the images that are rendered. In a time domain composition system, there can be a noticeable delay between the time at which the user provides the feedback signal and the time at which the system responds to it. The user supplies the feedback signal at a particular frame to one of the graphics pipelines in the system. Because the other graphics pipelines are already in the process of rendering their pre-feedback frames, the system typically imposes a time delay to allow the other graphics pipelines to complete their rendering of these frames before acting on the feedback signal.
In contrast, in spatial compositing, where the computer graphics hardware has n graphics pipelines, each graphics pipeline renders one of n subsets of the pixels of each frame. Each subset is combined, by compositing, to be presented for final viewing. By reducing the amount of graphics data that each graphics pipeline must act on, spatial compositing can increase the rate at which an overall frame is rendered.
In spatial compositing, a “compositing window” is located within all or a part of the display area. The compositing window is divided, or decomposed, into non-overlapping portions called “tiles”. Each tile receives the output of an assigned graphics pipeline to effect spatial compositing. The shape and size of the compositing window and the shape, size, and position of each of the tiles can be defined by parameters that characterize the two-dimensional contours of the compositing window and tiles. Parameters can include, but are not limited to, coordinate points for corners, centers, or focal points; lengths of radii; interior angles; and degrees of curvature.
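A minimal sketch of this decomposition follows, assuming a rectangular compositing window divided into vertical strips. The function names are hypothetical, each tile is characterized by the parameters (x, y, width, height), and "rendering" is reduced to filling a tile with the index of its assigned pipeline.

```python
def split_into_columns(width, height, n):
    """Divide a width x height window into n non-overlapping vertical tiles."""
    # Integer edges so that the tiles cover every column exactly once.
    edges = [i * width // n for i in range(n + 1)]
    return [(edges[i], 0, edges[i + 1] - edges[i], height) for i in range(n)]


def render_tile(pipeline_id, tile):
    """Stand-in for a graphics pipeline: fill the tile with the pipeline index."""
    _, _, w, h = tile
    return [[pipeline_id] * w for _ in range(h)]


def composite(width, height, tiles, outputs):
    """Place each pipeline's output at its tile position in the window."""
    frame = [[None] * width for _ in range(height)]
    for (x, y, w, h), pixels in zip(tiles, outputs):
        for row in range(h):
            frame[y + row][x:x + w] = pixels[row]
    return frame


tiles = split_into_columns(width=8, height=2, n=3)
outputs = [render_tile(p, tile) for p, tile in enumerate(tiles)]
for row in composite(8, 2, tiles, outputs):
    print(row)
```

Tiles of other shapes would need other parameters, such as radii or interior angles, but the compositing step would still place each pipeline's output at the position of its tile.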
Whereas with temporal compositing, heavy loading of a graphics pipeline processor reduces the rate at which frames are rendered, with spatial compositing this rate is limited to that of the slowest graphics pipeline. Therefore, optimization depends on the ability of the system to balance the processing load among the different graphics pipelines. The processing load typically increases with both the size of a given tile and the rendering complexity of the objects within this tile. Thus, often an application will vary the sizes of the different tiles within the compositing window in order to balance the processing load among the graphics pipelines for the rendering of a given frame.
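One way an application might vary tile sizes is sketched below. The sketch assumes a hypothetical per-column cost estimate that grows with the complexity of the objects drawn in that column, and it places the edges of vertical tiles so that each tile carries roughly the same total cost. It is an illustration under these assumptions, not a method prescribed by any standard.

```python
from bisect import bisect_left
from itertools import accumulate


def balanced_column_edges(column_cost, n):
    """Return n + 1 column edges so that each strip has a similar total cost.

    column_cost: non-negative estimated rendering cost per pixel column
    (hypothetical values supplied by the application).
    """
    prefix = [0] + list(accumulate(column_cost))
    total = prefix[-1]
    edges = [0]
    for k in range(1, n):
        # First column boundary at which the cumulative cost reaches k/n
        # of the total.
        edge = bisect_left(prefix, total * k / n)
        edges.append(max(edge, edges[-1]))
    edges.append(len(column_cost))
    return edges


# Two complex columns in the middle of an otherwise simple scene.
cost = [1, 1, 1, 1, 9, 9, 1, 1]
edges = balanced_column_edges(cost, n=2)
loads = [sum(cost[a:b]) for a, b in zip(edges, edges[1:])]
print(edges, loads)
```

With equal-width tiles the same scene would give one pipeline far more work than the other, and that pipeline would determine the frame rate. A very uneven cost profile can produce an empty tile in this sketch, which a practical system would need to handle.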
However, the cost of this flexibility is that it can be necessary to communicate the number, sizes, and positions of tiles being used for that given frame. This can add substantially to the overhead information that must be communicated for spatially composited images. This situation compounds an already difficult problem as advancements in memory capacities and processor speeds have outstripped improvements in interconnect bus throughputs. To minimize the extent to which data links become bottlenecks, what is needed is an efficient technique to identify individual compositors within a compositor tree and to detect the structure of the compositor tree so that an application can determine a desired tiling configuration that exploits the structure of the compositor tree.