1. Field of the Invention
The present invention relates generally to the field of graphics processing and more specifically to a system and method for temporal load balancing across graphics processing units (GPUs).
2. Description of the Related Art
A typical computing system includes a central processing unit (CPU), an input device, a system memory, one or more graphics processing units (GPUs), and one or more display devices. A variety of software application programs may run on the computing system. The CPU usually executes the overall structure of the software application program and configures the GPUs to perform specific tasks in the graphics pipeline. Some computing systems include both an integrated GPU (IGPU) and a higher-performance discrete GPU (DGPU). Such a computing system may support a hybrid performance mode in which the IGPU is configured to supplement the performance of the DGPU, thereby increasing the efficiency of the graphics pipeline.
In one approach to implementing a hybrid performance mode, the IGPU runs one image frame ahead of the DGPU, rendering only depth of field values (ignoring all color information) to establish the closest surfaces to the viewer. While rendering, the IGPU maintains the minimum Z-value, which corresponds to the closest depth of field value, for each pixel in the image frame using a two-dimensional array known as a Z-buffer. The DGPU then renders the image frame with full shading (including color information) using the pre-computed Z-values to avoid rendering certain pixel fragments (i.e., the fragment of each pixel intersected by an object) in the image that would otherwise be occluded by closer geometries in the image being rendered. Ignoring the color information allows the IGPU to efficiently pre-compute Z-values, while starting with the pre-computed Z-values allows the DGPU to avoid unnecessary shading operations.
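The two-pass scheme described above can be illustrated with a minimal software sketch. It is an illustration only, not the implementation used by any particular GPU or driver: the fragment tuple layout `(x, y, z, color)` and the names `depth_prepass`, `shaded_pass`, and `shade` are assumptions introduced here, and smaller Z is taken to mean closer to the viewer.

```python
import math

def depth_prepass(fragments, width, height):
    """Depth-only pass (the IGPU's role in the text): keep the minimum Z per pixel.

    Color is ignored entirely; only the closest depth per pixel is recorded.
    """
    zbuf = [[math.inf] * width for _ in range(height)]
    for x, y, z, _color in fragments:
        if z < zbuf[y][x]:
            zbuf[y][x] = z
    return zbuf

def shaded_pass(fragments, zbuf, shade):
    """Full-shading pass (the DGPU's role): skip fragments behind the stored Z."""
    height, width = len(zbuf), len(zbuf[0])
    image = [[None] * width for _ in range(height)]
    shaded_count = 0
    for x, y, z, color in fragments:
        if z > zbuf[y][x]:
            continue  # occluded by closer geometry, so no shading work is done
        image[y][x] = shade(color)  # hypothetical shading callback
        shaded_count += 1
    return image, shaded_count

# Hypothetical frame: two fragments compete for pixel (0, 0), one covers (1, 0).
fragments = [(0, 0, 5.0, "red"), (0, 0, 2.0, "green"), (1, 0, 3.0, "blue")]
zbuf = depth_prepass(fragments, width=2, height=1)
image, shaded_count = shaded_pass(fragments, zbuf, shade=str.upper)
```

In this hypothetical frame, the fragment at depth 5.0 is rejected by the comparison against the pre-computed Z-buffer, so only two of the three fragments reach the shading callback.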
One drawback to this approach is that, since the DGPU uses the pre-computed Z-values that are generated by the IGPU, the IGPU must first be finished with a particular frame before the DGPU can begin processing that frame. In some computing systems, the IGPU is substantially less powerful than the DGPU. Consequently, the IGPU may take longer to pre-compute Z-values for a given frame than the DGPU takes to finish rendering the previous frame. In such situations, the IGPU may become a bottleneck in the graphics pipeline, thereby hindering overall system performance.
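The dependency described above can be made concrete with a simple timing model. The sketch below is a simplified illustration under stated assumptions, not a measurement: the per-frame times are hypothetical constants, the function name `simulate` is introduced here, and the IGPU is assumed to start its next frame only once the DGPU has started the preceding one, so that it stays at most one frame ahead.

```python
def simulate(igpu_ms, dgpu_ms, frames):
    """Model a two-stage pipeline in which the DGPU needs the IGPU's Z-values.

    Returns (time at which the last frame finishes, total time the DGPU sat idle).
    All durations are hypothetical per-frame constants in milliseconds.
    """
    igpu_done = 0.0
    dgpu_done = 0.0
    dgpu_start_prev = 0.0
    dgpu_idle = 0.0
    for _ in range(frames):
        # The IGPU runs at most one frame ahead of the DGPU.
        igpu_start = max(igpu_done, dgpu_start_prev)
        igpu_done = igpu_start + igpu_ms
        # The DGPU cannot begin a frame until its Z-values are ready.
        dgpu_start = max(igpu_done, dgpu_done)
        dgpu_idle += dgpu_start - dgpu_done
        dgpu_done = dgpu_start + dgpu_ms
        dgpu_start_prev = dgpu_start
    return dgpu_done, dgpu_idle

slow_igpu = simulate(igpu_ms=10, dgpu_ms=6, frames=3)  # IGPU slower than DGPU
fast_igpu = simulate(igpu_ms=4, dgpu_ms=6, frames=3)   # IGPU faster than DGPU
```

With these hypothetical numbers, the slower IGPU sets the frame interval at 10 ms and the DGPU idles for 4 ms of every frame after the first. When the IGPU is the faster of the two, the DGPU waits only once, while the pipeline fills.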
As the foregoing illustrates, what is needed in the art is a more reliable technique for using the IGPU to pre-compute Z-values for the DGPU that can be used to avoid the bottleneck problem set forth above.