Modern computers (and related devices) typically produce graphical output using a sequence of tasks known as a graphics pipeline. These tasks start with a mathematical representation of an image to be produced and finish with pixel data suitable for display on a video screen or other output device. The tasks that perform this translation (i.e., the tasks included in a graphics pipeline) may be performed entirely by the host processor or processors included in a host computer system. Another common arrangement is to split the graphics pipeline so that the host processor performs only an initial subset of the pipeline tasks. The remaining tasks are then performed by a specialized graphics processor. Splitting the graphics pipeline often results in increased graphics throughput (due to the specialized abilities of the graphics processor). Splitting also generally results in increased throughput for the host processor (due to the decreased demands placed on the host processor).
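The split described above can be sketched as two stages, one run by the host processor and one run by the graphics processor. This is a minimal illustration only: the function names `host_stage` and `graphics_stage`, the choice of a scale-and-offset transform as the "initial" task, and the choice of pixel snapping and clipping as the "remaining" task are all assumptions made for the example, not a description of any particular pipeline.

```python
from typing import List, Tuple

Vertex = Tuple[float, float]
Pixel = Tuple[int, int]


def host_stage(vertices: List[Vertex], scale: float,
               offset: Vertex) -> List[Vertex]:
    """Hypothetical initial pipeline task, run on the host processor:
    map model coordinates to screen coordinates."""
    return [(x * scale + offset[0], y * scale + offset[1])
            for x, y in vertices]


def graphics_stage(screen_vertices: List[Vertex], width: int,
                   height: int) -> List[Pixel]:
    """Hypothetical remaining pipeline task, run on the graphics
    processor: snap to the pixel grid and discard off-screen points."""
    pixels = []
    for x, y in screen_vertices:
        px, py = int(round(x)), int(round(y))
        if 0 <= px < width and 0 <= py < height:
            pixels.append((px, py))
    return pixels


# The output of the host stage becomes the input of the graphics stage.
model = [(0.0, 0.0), (1.0, 1.0), (3.0, 3.0)]
pixels = graphics_stage(host_stage(model, 10.0, (5.0, 5.0)), 16, 16)
```

In this sketch the third point maps to (35, 35) and is clipped, so only two pixels survive.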
In architectures where graphics processors are used, the initial subset of pipeline tasks is typically performed as part of user-mode (non-privileged) execution of the host processor. This means that these tasks may be included within a user process or application. It also means that these tasks may be replicated across a series of processes. Effectively, the graphics pipeline is modified so that a group of initial pipeline segments is multiplexed to feed a single graphics processor.
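The multiplexing arrangement can be illustrated with several producers feeding one consumer. The following is a sketch under stated assumptions: threads stand in for separate user processes, a `queue.Queue` stands in for the path to the graphics processor, and the names `initial_segment`, `graphics_processor`, and `run_multiplexed` are hypothetical.

```python
import queue
import threading
from typing import List, Tuple

Command = Tuple[int, int]  # (application id, command sequence number)


def initial_segment(app_id: int, n_commands: int,
                    out_queue: "queue.Queue[Command]") -> None:
    """Stand-in for one application's copy of the initial pipeline tasks."""
    for seq in range(n_commands):
        out_queue.put((app_id, seq))


def graphics_processor(in_queue: "queue.Queue[Command]",
                       n_expected: int) -> List[Command]:
    """Stand-in for the single graphics processor draining the shared input."""
    return [in_queue.get() for _ in range(n_expected)]


def run_multiplexed(n_apps: int, n_commands: int) -> List[Command]:
    shared: "queue.Queue[Command]" = queue.Queue()
    workers = [
        threading.Thread(target=initial_segment,
                         args=(app_id, n_commands, shared))
        for app_id in range(n_apps)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return graphics_processor(shared, n_apps * n_commands)
```

The interleaving between applications may differ from run to run, but each application's own commands arrive in the order they were issued, because the queue is first-in, first-out.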
Use of a graphics processor also means that the output of the initial pipeline segment, or segments, must be transferred to become the input of the graphics processor. In an ideal architecture, this transfer would be accomplished at little or no cost. Unfortunately, in traditional architectures, access to the graphics processor cannot be accomplished as part of user-mode execution of the host processor. Instead, a user process or application that desires to send information to the graphics processor must do so as part of a system call. The system call invokes the operating system of the host processor, and the operating system performs the transfer on behalf of the user process. The context switch from user mode to privileged mode is time-consuming and decreases the efficiency of the graphics process.
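The traditional transfer path can be approximated in a few lines. This sketch assumes an ordinary operating-system pipe as a stand-in for the channel to the graphics processor; the command strings and the function names `submit_via_system_call` and `demo_transfer` are hypothetical. The point it illustrates is that every submission goes through `os.write`, which invokes the operating system, so each command incurs a switch from user mode to privileged mode.

```python
import os
from typing import Tuple


def submit_via_system_call(fd: int, command: bytes) -> int:
    """Hand one command to the operating system for transfer.

    os.write wraps the operating system's write call, so each
    invocation crosses from user mode into privileged mode.
    """
    return os.write(fd, command)


def demo_transfer() -> Tuple[int, bytes]:
    # Assumption: a pipe stands in for the graphics processor's input.
    read_fd, write_fd = os.pipe()
    try:
        sent = 0
        for command in (b"DRAW", b"FILL", b"SWAP"):
            sent += submit_via_system_call(write_fd, command)
        received = os.read(read_fd, sent)
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return sent, received
```

Three commands here mean three separate entries into the operating system, which is the per-transfer cost the paragraph above describes.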
In addition to being time-consuming, the use of a system call also tends to serialize the operation of the host and graphics processors. This follows because the use of a system call forces the operating system to act as a sort of arbitrator between the host and graphics processors. If the graphics processor finishes its current tasks, it is forced to wait until the operating system decides to transfer more work to it. If the operating system is attending to other duties, available work may have to wait to be transferred. Thus, the host and graphics processors exhibit an unnecessary degree of interdependence, and potential parallelism remains unexploited.
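The effect of serialization can be made concrete with a simple timing model. This is an illustrative sketch, not a measurement: the function names, the fixed per-batch operating-system delay, and the time units are all assumptions. It compares a fully serialized schedule, in which each batch is prepared, transferred, and rendered before the next begins, with an idealized schedule in which the host prepares the next batch while the graphics processor renders the current one.

```python
from typing import Sequence


def serialized_time(host_times: Sequence[int], gpu_times: Sequence[int],
                    os_delay: int) -> int:
    """Total time when no overlap occurs: for every batch the host
    prepares, the operating system transfers (hypothetical fixed delay),
    and the graphics processor renders, strictly one after another."""
    return sum(h + os_delay + g for h, g in zip(host_times, gpu_times))


def overlapped_time(host_times: Sequence[int],
                    gpu_times: Sequence[int]) -> int:
    """Total time for an idealized two-stage pipeline with free transfers:
    the graphics processor starts a batch as soon as the host has
    finished it and the previous batch has been rendered."""
    host_done = 0
    gpu_done = 0
    for h, g in zip(host_times, gpu_times):
        host_done += h
        gpu_done = max(host_done, gpu_done) + g
    return gpu_done


host = [2, 2, 2]   # hypothetical host preparation time per batch
gpu = [3, 3, 3]    # hypothetical rendering time per batch
slow = serialized_time(host, gpu, os_delay=1)
fast = overlapped_time(host, gpu)
```

With these placeholder numbers the serialized schedule takes 18 time units and the overlapped schedule takes 11; the difference represents the parallelism that goes unused when the operating system mediates every transfer.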