This disclosure relates generally to scheduling commands for a graphics processor. More particularly, but not by way of limitation, this disclosure relates to out-of-order command scheduling for a graphics processor based on command dependency.
Computers, mobile devices, and other computing systems typically have at least one programmable processor, such as a central processing unit (CPU), and may include other programmable processors specialized for performing certain processes or functions (e.g., graphics processing). Examples of a programmable processor specialized to perform graphics processing operations include a graphics processing unit (GPU), a digital signal processor (DSP), a field programmable gate array (FPGA), and/or a CPU emulating a GPU. GPUs, in particular, comprise multiple execution cores (also referred to as shader cores) designed to execute commands on parallel data streams, making them more effective than general-purpose processors for operations that process large blocks of data in parallel. For instance, a CPU functions as a host and hands off specialized parallel tasks to a GPU. Specifically, a CPU can execute an application stored in system memory that includes graphics data associated with a video frame. Rather than processing the graphics data itself, the CPU forwards the graphics data to the GPU for processing, thereby freeing the CPU to perform other tasks concurrently with the GPU's processing of the graphics data.
User space applications typically utilize a graphics application program interface (API) to access (e.g., indirect or near-direct access) a GPU for the purposes of improving graphics and compute operations. To access the GPU, a user space application issues API calls that generate a series of commands for the GPU to execute. For example, the graphics API causes a CPU to encode commands within a command buffer that is eventually submitted to the GPU for execution. The order in which the CPU submits the commands generally determines the order in which the GPU executes the commands (e.g., first-in-first-out (FIFO)). However, because GPUs are intrinsically parallel, the order in which the CPU submits commands to the GPU may not be the most efficient order for the GPU to execute them. In some situations, the submission order could cause “pipeline bubbles” (i.e., idle periods in the GPU's execution pipeline) that increase processing latency and underutilize the GPU's parallel architecture.
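By way of illustration only, the effect of submission order on GPU utilization can be sketched with a small simulation. The sketch below is not drawn from this disclosure: the names `Command` and `makespan`, the tick-based durations, and the two-lane GPU model are all hypothetical, and the sketch assumes that each command is submitted after the commands on which it depends. It compares strict in-order (FIFO) issue against a simple greedy scheduler that issues any command whose dependencies have completed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    """Hypothetical GPU command used only for this illustration."""
    name: str
    duration: int      # execution time in arbitrary ticks
    deps: tuple = ()   # names of commands that must finish first


def makespan(commands, lanes, out_of_order):
    """Return the tick at which the last command finishes.

    Assumes every dependency appears earlier in `commands` than the
    command that needs it, and that command names are unique.
    """
    lane_free = [0] * lanes    # tick at which each execution lane goes idle
    finish = {}                # command name -> finish tick
    pending = list(commands)   # remaining commands, in submission order
    prev_start = 0             # start tick of the previously issued command

    def earliest_start(cmd):
        deps_done = max((finish[d] for d in cmd.deps), default=0)
        return max(min(lane_free), deps_done)

    while pending:
        if out_of_order:
            # Any command whose dependencies are already scheduled may issue;
            # pick the one that can start soonest (ties: submission order).
            ready = [c for c in pending if all(d in finish for d in c.deps)]
            cmd = min(ready, key=earliest_start)
            start = earliest_start(cmd)
        else:
            # FIFO: the head of the queue blocks everything behind it.
            cmd = pending[0]
            start = max(earliest_start(cmd), prev_start)
        lane = lane_free.index(min(lane_free))
        lane_free[lane] = finish[cmd.name] = start + cmd.duration
        pending.remove(cmd)
        prev_start = start
    return max(finish.values())


# Two independent chains (A -> B and C -> D) submitted one chain at a time.
submitted = [
    Command("A", 4),
    Command("B", 2, deps=("A",)),
    Command("C", 4),
    Command("D", 2, deps=("C",)),
]

print(makespan(submitted, lanes=2, out_of_order=False))  # 10
print(makespan(submitted, lanes=2, out_of_order=True))   # 6
```

In this toy model, FIFO issue leaves the second lane idle while command B waits on command A, which is one form of pipeline bubble. The dependency-aware scheduler starts command C immediately on the idle lane, so the same four commands complete in 6 ticks rather than 10.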