Computing devices often utilize a graphics processing unit (GPU) to accelerate the rendering of graphics data for display. Such computing devices may include, e.g., computer workstations, mobile phones (e.g., so-called smartphones), embedded systems, personal computers, tablet computers, and video game consoles. GPUs typically implement a graphics processing pipeline that includes a plurality of processing stages which operate together to execute graphics processing commands. Traditionally, GPUs included a fixed-function graphics processing pipeline where each processing stage in the pipeline was implemented with fixed-function hardware (e.g., hardware that is hard-wired to perform a certain set of specialized functions and is not capable of executing a user-downloadable program).
More recently, graphics processing pipelines have shifted to a programmable architecture where one or more processing stages in the pipeline are programmable processing stages and are implemented with one or more programmable shader units. Each of the programmable shader units may be configured to execute a shader program. A user application may specify the shader program to be executed by the programmable processing stages in a programmable graphics pipeline, thereby providing a high degree of flexibility in the use of modern-day GPUs.
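As a loose illustration of the distinction drawn above, the following Python sketch models a pipeline with one fixed-function stage and one programmable stage. All names in it (fixed_function_viewport, ProgrammableStage, run_pipeline, scale_shader) and the 100x100 viewport are hypothetical and chosen only for illustration; they do not correspond to any graphics API or GPU hardware interface, and real shader programs are typically written in a shading language and executed on the GPU's shader units rather than as host-side functions.

```python
from typing import Callable, List, Tuple

Vertex = Tuple[float, float]
ShaderProgram = Callable[[Vertex], Vertex]


def fixed_function_viewport(v: Vertex) -> Vertex:
    """Fixed-function stage: its behavior is hard-wired and cannot be replaced.

    Maps normalized coordinates in [-1, 1] to a hypothetical 100x100 viewport.
    """
    x, y = v
    return ((x + 1.0) * 50.0, (y + 1.0) * 50.0)


class ProgrammableStage:
    """Programmable stage: runs whatever shader program the application binds."""

    def __init__(self) -> None:
        # Until the application binds a program, vertices pass through unchanged.
        self._program: ShaderProgram = lambda v: v

    def bind(self, program: ShaderProgram) -> None:
        """Replace the program executed by this stage."""
        self._program = program

    def run(self, vertices: List[Vertex]) -> List[Vertex]:
        """Apply the currently bound program to every vertex."""
        return [self._program(v) for v in vertices]


def run_pipeline(stage: ProgrammableStage, vertices: List[Vertex]) -> List[Vertex]:
    """Run the programmable stage first, then the fixed-function stage."""
    shaded = stage.run(vertices)
    return [fixed_function_viewport(v) for v in shaded]


def scale_shader(v: Vertex) -> Vertex:
    """Example application-supplied program: halve both coordinates."""
    return (v[0] * 0.5, v[1] * 0.5)
```

Binding scale_shader to the ProgrammableStage changes what that stage computes without altering fixed_function_viewport, which mirrors the flexibility that a user-specified shader program provides.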
As graphics processing technology develops, graphics processing pipelines are becoming more sophisticated, and an increasing number of different types of programmable processing stages are being added to the standard graphics processing pipelines that are specified by the major graphics application programming interfaces (APIs). Implementing these different types of programmable processing stages with the limited resources in a GPU can present significant challenges.