A driver for a processing unit is a component that takes a command stream from an application as input, for example by way of an operating system component, and generates another command stream as output. The output command stream may be submitted to an operating system in order to be sent to the hardware for processing. For example, a graphics driver for a graphics processing unit (GPU) takes a graphics command stream from an application as input and generates another command stream as output, which enables the GPU to perform the needed processing. This output command stream may be submitted to an operating system and other driver components in order to be sent to the GPU for execution.
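The flow described above can be illustrated with a minimal sketch. It is not an actual driver: the `ApiCommand` and `GpuCommand` types, the `OPCODES` table, and the `translate` and `submit` functions are hypothetical names chosen for illustration, and a plain list stands in for the operating system path to the hardware.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ApiCommand:
    """Hypothetical application-level command, e.g. a draw call."""
    name: str
    args: tuple = ()


@dataclass(frozen=True)
class GpuCommand:
    """Hypothetical hardware-level command packet."""
    opcode: int
    payload: tuple = ()


# Assumed opcode table; real encodings are hardware specific.
OPCODES = {"set_state": 0x01, "draw": 0x02, "clear": 0x03}


def translate(api_stream: Iterable[ApiCommand]) -> List[GpuCommand]:
    """Driver step: convert the input stream, in order, into an output stream."""
    return [GpuCommand(OPCODES[cmd.name], cmd.args) for cmd in api_stream]


def submit(gpu_stream: List[GpuCommand], hardware_queue: List[GpuCommand]) -> None:
    """Stand-in for handing the output stream to the operating system."""
    hardware_queue.extend(gpu_stream)


app_stream = [
    ApiCommand("clear"),
    ApiCommand("set_state", ("blend", True)),
    ApiCommand("draw", (0, 3)),
]
hardware_queue: List[GpuCommand] = []
submit(translate(app_stream), hardware_queue)
print([hex(c.opcode) for c in hardware_queue])  # ['0x3', '0x1', '0x2']
```

In this sketch, translation and submission are separate steps, which mirrors the distinction drawn above between generating the output command stream and sending it on to the hardware.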
Submission of the command stream, for example the processing of the command stream by the operating system and other driver components, may account for a significant percentage of the cost of delivering the command stream from the application to the processing unit.
Generally, modern computing systems include many individual processing unit cores. At any given moment, the usage of these cores may not be well balanced. That is, some of the cores may be heavily loaded while others are not. Moving work from a heavily loaded core to a more lightly loaded core may therefore improve efficiency.
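As a rough illustration of this imbalance, the sketch below models each core as a queue of pending work items and moves one item from the most loaded core to the least loaded one. The `rebalance_once` function and the queue contents are hypothetical, and queue length is assumed to be an adequate proxy for load, which a real scheduler would not take for granted.

```python
from typing import Dict, List


def rebalance_once(queues: Dict[int, List[str]]) -> Dict[int, List[str]]:
    """Move one work item from the most loaded core to the least loaded core."""
    busiest = max(queues, key=lambda core: len(queues[core]))
    idlest = min(queues, key=lambda core: len(queues[core]))
    # Only move work if doing so actually narrows the gap between the two cores.
    if len(queues[busiest]) - len(queues[idlest]) > 1:
        queues[idlest].append(queues[busiest].pop())
    return queues


# Core 0 is heavily loaded, core 1 is idle, core 2 is lightly loaded.
core_queues = {0: ["a", "b", "c", "d"], 1: [], 2: ["e"]}
rebalance_once(core_queues)
print({core: len(items) for core, items in core_queues.items()})  # {0: 3, 1: 1, 2: 1}
```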
In addition, in an application that is bottlenecked by the central processing unit (CPU), the driver thread may be the factor limiting performance. A driver thread is the sequence of CPU instructions whose result is a command sequence for the GPU. For example, the conversion from an Application Programming Interface (API) command to a GPU command is a sequential process that may be bottlenecked by the CPU. Alleviating this bottleneck may therefore increase performance.
Present solutions to this problem focus on reducing the bottleneck at the top of the driver stack. Therefore, a need exists to alleviate this bottleneck at other places within the driver stack, so that the time spent in the operating system components is no longer visible.