1. Field of the Invention
The present invention is generally directed to computing operations performed in computer systems.
2. Background Art
A graphics processing unit (GPU) is a complex integrated circuit that is specially designed to perform graphics processing tasks. A GPU may, for example, execute graphics processing tasks required by an end-user application, such as a video game application. In such an example, there are several layers of software between the end-user application and the GPU. The end-user application communicates with an application programming interface (API). An API allows the end-user application to output graphics data and commands in a standardized format, rather than in a format that is dependent on the GPU. Several types of APIs are commercially available, including DirectX® developed by Microsoft Corp. and OpenGL® developed by Silicon Graphics, Inc. The API communicates with a driver. The driver translates standard code received from the API into a native format of instructions understood by the GPU. The driver is typically written by the manufacturer of the GPU. The GPU then executes the instructions from the driver.
Many GPUs use a technique known as pipelining to execute the instructions. Pipelining enables a GPU to work on different steps of an instruction at the same time, and thereby take advantage of parallelism that exists among the steps needed to execute the instruction. As a result, a GPU can execute more instructions in a shorter period of time. The video data output by the graphics pipeline depend on state packages, i.e., context-specific constants (such as texture handles, shader constants, and transform matrices) that are stored locally by the graphics pipeline. Because the context-specific constants are maintained locally, the graphics pipeline can access them quickly.
The number of state packages maintained by the graphics pipeline depends on the API to which the GPU is coupled. The state packages associated with conventional APIs can be stored in a relatively small number of registers, such as eight registers. Unlike conventional APIs, newer APIs, such as DirectX® 10, require a relatively large number of frequent context switches with respect to certain aspects of the pipeline. The number of state packages associated with these frequent context switches cannot be supported by the relatively small number of registers maintained by conventional graphics pipelines.
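The capacity limit described above can be illustrated with a toy model. The following is a minimal sketch, not a description of any actual hardware: the function count_stall_cycles, its parameters capacity and depth, and the retirement policy are all assumptions introduced for illustration. The model assumes that a state package occupies one register slot for as long as any in-flight draw uses it, and that the pipeline must wait for its oldest draw to retire whenever a new state package arrives and no slot is free.

```python
from collections import Counter, deque


def count_stall_cycles(state_ids, capacity, depth):
    """Count stall cycles in a hypothetical pipeline model.

    state_ids: state package id used by each draw, in submission order.
    capacity:  number of state register slots (assumed, e.g. eight).
    depth:     maximum number of draws in flight at once (assumed).
    """
    if capacity < 1 or depth < 1:
        raise ValueError("capacity and depth must be at least 1")

    in_flight = deque()   # state ids of draws in the pipeline, oldest first
    resident = Counter()  # state id -> number of in-flight draws using it
    stalls = 0

    def retire_oldest():
        old = in_flight.popleft()
        resident[old] -= 1
        if resident[old] == 0:
            del resident[old]  # slot becomes free once no draw uses it

    for state in state_ids:
        # Normal retirement: the pipeline holds at most `depth` draws.
        if len(in_flight) == depth:
            retire_oldest()
        # A state package that is not resident needs a free slot.
        # If none is free, wait (stall) until the oldest draw retires.
        while state not in resident and len(resident) == capacity:
            retire_oldest()
            stalls += 1
        in_flight.append(state)
        resident[state] += 1
    return stalls


# Twelve draws, each with its own state package, eight slots.
deep = count_stall_cycles(list(range(12)), capacity=8, depth=16)
shallow = count_stall_cycles(list(range(12)), capacity=8, depth=8)
print(deep, shallow)  # prints "4 0"
```

Under these assumptions, a workload that changes state on every draw stalls only when more draws can be in flight than there are slots, which is the situation frequent context switches would create for a small register set.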
An obvious solution for handling the larger number of state packages associated with newer APIs is to simply increase the number of state packages supported by the graphics pipeline. However, this solution would significantly increase die area because additional registers would be required to handle the additional state packages. In addition, this solution could create timing issues because the graphics pipeline would stall if the number of state packages exceeded the storage capacity of the pipeline. Another obvious solution would be to attempt to compensate for the increased number of state packages using software. For example, the driver or the end-user application could attempt to re-order work sent to the GPU to reduce the number of state changes, that is, to increase the amount of work sent per state change. This solution, however, has at least two drawbacks. First, it works only for some workloads, because others inherently have too many state changes. Second, it significantly increases the workload of the CPU, which must search and sort input transactions.
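The software re-ordering approach described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any particular driver: DrawCall, state_id, count_state_changes, and reorder_by_state are hypothetical names, and the sketch assumes that the submitted work items may be freely re-ordered, which is not true of every workload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrawCall:
    """Hypothetical unit of work: a state package id plus a label."""
    state_id: int
    name: str


def count_state_changes(calls):
    """Count how often the state package differs from the previous call.

    The first call counts as one change, because its state must be loaded.
    """
    changes = 0
    previous = None
    for call in calls:
        if call.state_id != previous:
            changes += 1
            previous = call.state_id
    return changes


def reorder_by_state(calls):
    """Group calls that share a state package.

    sorted() is stable, so calls within a group keep their original order.
    This sort is the extra CPU work the approach requires.
    """
    return sorted(calls, key=lambda call: call.state_id)


submitted = [
    DrawCall(1, "a"), DrawCall(2, "b"), DrawCall(1, "c"),
    DrawCall(3, "d"), DrawCall(2, "e"), DrawCall(1, "f"),
]
reordered = reorder_by_state(submitted)

print(count_state_changes(submitted))  # prints 6
print(count_state_changes(reordered))  # prints 3
```

Both drawbacks are visible in the sketch. If every call in submitted used a distinct state package, re-ordering would not reduce the count at all, and the sort itself runs on the CPU for every batch of submitted work.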
Given the foregoing, what is needed is a system, and applications thereof, that efficiently handle extra contexts for shader constants.