A graphics processing unit (GPU) is a processor chip dedicated to performing the calculations necessary to render graphics objects on a computer display. The GPU may be a dedicated device, may comprise several devices, or may be integrated into a larger device (e.g., a north bridge device or a CPU). A common workflow inside a GPU involves updating the values of constants in a memory array and then performing a draw operation using the constants as data. A GPU whose memory array contains a given set of constants may be considered to be in a particular “state”.
In graphics processing chips, it is common to set up the state of the chip, perform a draw operation, and then make only a small number of changes to the state before the next draw operation. Most of the state settings, e.g., values of constants in memory, remain the same from one draw operation to the next.
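The update-then-draw pattern described above can be illustrated with a minimal sketch. The 16-entry constant array, the helper names (`apply_updates`, `draw`, `changed_entries`), and the particular values are hypothetical and chosen only for illustration; they do not describe any actual GPU.

```python
from typing import Dict, List, Tuple


def apply_updates(state: List[float], updates: Dict[int, float]) -> List[float]:
    """Return a new state in which the listed constants are overwritten."""
    new_state = list(state)
    for index, value in updates.items():
        new_state[index] = value
    return new_state


def draw(state: List[float]) -> Tuple[float, ...]:
    """Stand-in for a draw operation: it only snapshots the constants it uses as data."""
    return tuple(state)


def changed_entries(old: List[float], new: List[float]) -> int:
    """Count the constants that differ between two states."""
    return sum(1 for a, b in zip(old, new) if a != b)


# Initial set-up of the state (hypothetical values), followed by a first draw.
state_a = apply_updates([0.0] * 16, {i: float(i) for i in range(16)})
frame_a = draw(state_a)

# Only two constants change before the next draw; the rest carry over unchanged.
state_b = apply_updates(state_a, {3: 99.0, 7: -1.0})
frame_b = draw(state_b)

print(changed_entries(state_a, state_b), "of", len(state_a), "constants changed")
```

In this sketch each draw sees a complete state, although the second state differs from the first in only two entries.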
A typical GPU has a long latency in its graphics pipeline. A draw operation must wait many processor clock cycles for data to be fetched from memory. It is inefficient to leave the processor idle during this time. A better use of processor resources is to have several draw operations in progress at the same time, each operating on its own state setting. Potentially dozens, or even hundreds, of draw operations, each needing its own state setting, might be running in a GPU at any given time.
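A rough feel for why dozens or hundreds of draw operations may be in flight can be had from a toy model. The model below is an assumption made for illustration, not a description of any real pipeline: each draw alternates between a fixed number of compute cycles and a fixed memory wait, and the processor switches among draws at no cost. The cycle counts are hypothetical.

```python
def draws_needed(memory_latency_cycles: int, compute_cycles_per_draw: int) -> int:
    """Smallest number of in-flight draws that keeps the processor busy in the toy model.

    While one draw waits on memory, the other draws must together supply at least
    `memory_latency_cycles` of work, hence ceil(latency / compute) + 1.
    """
    # Integer ceiling division avoids floating-point rounding.
    return -(-memory_latency_cycles // compute_cycles_per_draw) + 1


def utilization(draws_in_flight: int, memory_latency_cycles: int,
                compute_cycles_per_draw: int) -> float:
    """Fraction of cycles spent on useful work with the given number of draws."""
    period = compute_cycles_per_draw + memory_latency_cycles
    return min(1.0, draws_in_flight * compute_cycles_per_draw / period)


# Hypothetical figures: a 400-cycle memory fetch and 10 cycles of work per draw.
LATENCY, COMPUTE = 400, 10
print("draws needed:", draws_needed(LATENCY, COMPUTE))
print("utilization with one draw:", utilization(1, LATENCY, COMPUTE))
```

Under these assumed figures a single draw leaves the processor idle for most cycles, which is the inefficiency noted above.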
One way to enable simultaneous processing of multiple draw operations is to provide multiple copies of all state registers. That way, each draw operation can operate on its own copy of the chip state without waiting for earlier operations to finish. This solution is expensive in terms of chip area, however. The die size increases quickly as more and more copies of the memory are required. Updating the data within all the copies is also time-consuming.
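The cost of the full-copy approach can be sketched as follows. The register count, the numbers of copies, and the helper names are hypothetical, and counting one write per copy is a simplifying assumption about how an update would reach every copy.

```python
from typing import List

NUM_REGISTERS = 256  # hypothetical number of state registers per copy


def make_copies(num_registers: int, num_copies: int) -> List[List[int]]:
    """Allocate one complete, independent copy of the state per in-flight draw."""
    return [[0] * num_registers for _ in range(num_copies)]


def storage_words(copies: List[List[int]]) -> int:
    """Total storage consumed by all copies, in register-sized words."""
    return sum(len(copy) for copy in copies)


def write_to_all(copies: List[List[int]], index: int, value: int) -> int:
    """Write one constant into every copy and return the number of writes performed."""
    for copy in copies:
        copy[index] = value
    return len(copies)


for num_copies in (1, 8, 64):
    copies = make_copies(NUM_REGISTERS, num_copies)
    writes = write_to_all(copies, 5, 42)
    print(num_copies, "copies:", storage_words(copies), "words,", writes, "writes per update")
```

In this sketch both the storage and the number of writes per update grow in proportion to the number of copies, which corresponds to the area and update-time costs described above.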
What are needed are systems and methods for efficiently managing incremental state updates in a processor.