Graphics processing typically involves coordination of two processors, a central processing unit (CPU) and a graphics processing unit (GPU). The GPU is a specialized electronic circuit designed to accelerate the creation of images in a frame buffer intended for output to a display. GPUs are used in embedded systems, mobile phones, personal computers, workstations, and game consoles. A GPU is typically designed to be efficient at manipulating computer graphics. GPUs often have a highly parallel processing architecture that makes them more effective than a general-purpose CPU for algorithms in which large blocks of data are processed in parallel.
The CPU may send commands to the GPU to implement a particular graphics processing task, e.g. render a particular texture that has changed with respect to a previous frame in an image. These commands are commonly referred to as “draw calls”, and there may be hundreds or thousands of draw calls in any particular frame.
In conventional setups, to implement each draw call the CPU has to perform certain setup work before GPU programs, known as shaders, can run. This typically includes setting up resources for the shaders to use in the form of buffers of data as well as uniform constants that may change between draw calls but are uniform for any particular draw call. Such resources may include texture bitmaps, pointers to texture bitmaps, samplers, and constants such as collections of floating point or integer values, and the like. These resources may be stored in a table, sometimes called a resource table. A graphics application program interface (API) implemented by the CPU may assign slots in a ring buffer for allocation of resources from the resource table to shaders that run on the GPU. A software component run by the CPU, sometimes referred to as a constant update engine (CUE), allocates the slots in the buffer and maintains the data for use by the shader in the resource table. This process is complicated and incurs considerable overhead.
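The slot allocation described above can be sketched in C++ as follows. This is only an illustrative model under stated assumptions, not the behavior of any particular graphics API or constant update engine: the names `ResourceTable`, `SlotRing`, `allocate`, and `retire_oldest` are hypothetical, and the table fields are simplified stand-ins for texture pointers, samplers, and constants.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical resource table: the per-draw-call data a shader reads.
struct ResourceTable {
    std::vector<std::uint64_t> texture_pointers;  // addresses of texture bitmaps
    std::vector<std::uint32_t> samplers;          // sampler state words
    std::vector<float> constants;                 // uniform for one draw call
};

// Hypothetical ring of slots managed on the CPU side.
class SlotRing {
public:
    explicit SlotRing(std::size_t slot_count) : slots_(slot_count) {
        assert(slot_count > 0);
    }

    // Claims the next slot and copies the table into it.
    // Returns false when every slot is still in use by an unfinished draw call.
    bool allocate(const ResourceTable& table, std::size_t& slot_out) {
        if (in_flight_ == slots_.size()) {
            return false;  // ring is full; the CPU would have to wait
        }
        slot_out = head_;
        slots_[head_] = table;  // CPU-side copy of the table data
        head_ = (head_ + 1) % slots_.size();
        ++in_flight_;
        return true;
    }

    // Called once the GPU has finished the oldest outstanding draw call.
    void retire_oldest() {
        assert(in_flight_ > 0);
        --in_flight_;
    }

    const ResourceTable& at(std::size_t slot) const { return slots_.at(slot); }
    std::size_t in_flight() const { return in_flight_; }

private:
    std::vector<ResourceTable> slots_;
    std::size_t head_ = 0;       // next slot to hand out
    std::size_t in_flight_ = 0;  // slots not yet retired
};
```

In this sketch every call to `allocate` is bookkeeping performed by the CPU, which gives a sense of where the overhead mentioned above comes from when a frame contains hundreds or thousands of draw calls.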
Whenever even a single entry in the resource table changes (e.g., one texture changes with respect to a previous draw call for the frame), the whole resource table is copied by the CPU. Each draw call needs its own resource tables. If the resource table is wholly the same between draw calls (i.e., nothing has changed for the draw), then the data can be reused. However, since the data is explicitly laid out as an entire table of data, if only one value needs to be changed in that data, then a new table must be copied with that change. Moreover, the CUE cannot simply change the value in the previous table because draw calls are not issued one at a time, but are rather batched together and kicked off at the same time. All simultaneously kicked draw calls must therefore have their own set of data, unless the developer inserts specific synchronization points, which incur their own time penalty.
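The copy-on-change behavior described above can be illustrated with a minimal C++ sketch. It assumes a simplified flat table of integer entries; the names `FlatTable`, `DrawCall`, `TableStore`, and `record` are hypothetical and do not correspond to any real API. An identical table is reused, while a table that differs in a single entry is copied in full, because the earlier table may still be needed by a draw call in the same batch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical flat resource table: one entry per resource or constant.
using FlatTable = std::vector<std::uint32_t>;

// A recorded draw call only remembers which stored table it reads.
struct DrawCall {
    std::size_t table_index;
};

class TableStore {
public:
    // Records a draw call that needs the table `wanted`.
    // An unchanged table is reused; any difference forces a full copy,
    // since earlier batched draw calls still reference the old table.
    DrawCall record(const FlatTable& wanted) {
        if (tables_.empty() || tables_.back() != wanted) {
            tables_.push_back(wanted);  // whole table copied
            entries_copied_ += wanted.size();
        }
        return DrawCall{tables_.size() - 1};
    }

    const FlatTable& table(std::size_t index) const {
        assert(index < tables_.size());
        return tables_[index];
    }

    // Total number of entries the CPU has copied so far.
    std::size_t entries_copied() const { return entries_copied_; }

private:
    std::vector<FlatTable> tables_;
    std::size_t entries_copied_ = 0;
};
```

With a four-entry table, recording the same table twice copies four entries in total, whereas changing one entry before the next draw call copies four more, even though only one value actually differs.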
Moreover, each draw call may have a different data layout. Consequently, the CPU must be able to handle a different data layout for each draw call. This can take up a significant percentage of the CPU's time. Recently, systems have been developed with increasingly powerful GPUs. In some cases the raw processing power of the GPU can exceed that of the CPU. However, utilization of this power is often limited by the CPU-GPU interaction.
It is within this context that aspects of the present disclosure arise.