1. Field of the Invention
The present invention is generally directed to computing operations performed in computer systems. More particularly, the present invention is directed to a processor, such as a graphics processing unit (GPU), that performs computing operations, and to applications thereof.
2. Background Art
A GPU is a complex integrated circuit that is specially designed to perform data-parallel computing tasks, such as graphics-processing tasks. A GPU may, for example, execute graphics-processing tasks required by an end-user application, such as a video-game application. The GPU may be a discrete (i.e., separate) device and/or package or may be included in the same device and/or package as another processor (e.g., a CPU). For example, GPUs are frequently integrated into routing or bridge devices, such as Northbridge devices.
Several layers of software exist between an end-user application and a GPU. The end-user application communicates with an application-programming interface (API). The API allows the end-user application to output graphics data and commands in a standardized format, rather than in a format that is dependent on the GPU. The API communicates with a driver. The driver translates the standardized commands received from the API into instructions in the native format understood by the GPU. The driver is typically written by the manufacturer of the GPU. The GPU then executes the instructions received from the driver.
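By way of illustration only, the layering described above (application, API, driver, GPU) may be sketched as follows. This is a minimal sketch under stated assumptions, not an actual API or driver: the class names `ToyAPI`, `ToyDriver`, and `ToyGPU`, the command names, and the opcode values are all hypothetical and are chosen only to show where the translation from a standardized format to a native format takes place.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGPU:
    """Hypothetical stand-in for hardware: records the instructions it receives."""
    executed: list = field(default_factory=list)

    def execute(self, native_instructions):
        self.executed.extend(native_instructions)


class ToyDriver:
    """Hypothetical vendor-specific layer: maps standardized commands to native opcodes."""
    # Assumed opcode table; real native formats are defined by the GPU manufacturer.
    OPCODES = {"clear": 0x01, "draw_triangles": 0x02}

    def __init__(self, gpu):
        self.gpu = gpu

    def submit(self, api_commands):
        # Translate each (name, args) command into a (opcode, args) instruction.
        native = [(self.OPCODES[name], args) for name, args in api_commands]
        self.gpu.execute(native)


class ToyAPI:
    """Hypothetical standardized interface that the application programs against."""

    def __init__(self, driver):
        self.driver = driver
        self.pending = []

    def clear(self, color):
        self.pending.append(("clear", (color,)))

    def draw_triangles(self, vertex_count):
        self.pending.append(("draw_triangles", (vertex_count,)))

    def flush(self):
        # Hand the queued standardized commands to the driver, then reset the queue.
        self.driver.submit(self.pending)
        self.pending = []


def run_application():
    """End-user application: it calls only the API and never touches opcodes."""
    gpu = ToyGPU()
    api = ToyAPI(ToyDriver(gpu))
    api.clear("black")
    api.draw_triangles(3)
    api.flush()
    return gpu.executed
```

In this sketch, only `ToyDriver` has knowledge of the opcode table, so the application code in `run_application` would remain unchanged if a different GPU with a different native format were substituted.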
To complete a graphics-processing task, a GPU typically executes a plurality of shader programs (“shaders”), including a vertex shader, a geometry shader, and a pixel shader. In the past, a GPU may have included a plurality of engines, wherein each engine was configured to implement one of the shaders. More recently, APIs have moved to a unified shader model in which a single processing engine (“shader core”) of a GPU implements each of the shader programs. Unfortunately, conventional GPUs may not be configured to efficiently implement a unified shader model from a hardware perspective.
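By way of illustration only, the structural difference between the two arrangements described above may be sketched as follows. This is a simplified software analogy under stated assumptions and does not represent any particular hardware: the names `DedicatedEngine` and `ShaderCore` are hypothetical, the three shader functions are placeholders that merely tag the data passing through them, and the stages are assumed to run in vertex, geometry, pixel order.

```python
class DedicatedEngine:
    """Earlier arrangement: an engine configured to implement exactly one shader."""

    def __init__(self, name, shader):
        self.name = name
        self.shader = shader

    def run(self, data):
        return self.shader(data)


class ShaderCore:
    """Unified arrangement: a single engine that is handed the shader program to run."""
    name = "shader_core"

    def run(self, shader, data):
        return shader(data)


# Placeholder shader programs; each only records that it ran (no real shading math).
def vertex_shader(data):
    return data + ["vertex"]


def geometry_shader(data):
    return data + ["geometry"]


def pixel_shader(data):
    return data + ["pixel"]


SHADERS = [vertex_shader, geometry_shader, pixel_shader]


def run_dedicated():
    """Run the task on three engines, one per shader; return (result, engines used)."""
    engines = [DedicatedEngine(s.__name__ + "_engine", s) for s in SHADERS]
    data, used = [], []
    for engine in engines:
        data = engine.run(data)
        used.append(engine.name)
    return data, used


def run_unified():
    """Run the same task on one shader core; return (result, engines used)."""
    core = ShaderCore()
    data, used = [], []
    for shader in SHADERS:
        data = core.run(shader, data)
        used.append(core.name)
    return data, used
```

Both functions produce the same result for the task; they differ only in which engine implements each shader, which is the distinction the unified shader model draws.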
What is needed, therefore, is a GPU that efficiently implements the unified shader model from a hardware perspective.