This disclosure relates generally to the operation of computer systems. More particularly, but not by way of limitation, this disclosure relates to a technique for increasing the speed of a graphics processing unit's (GPU's) context switch operation.

The parallel nature of GPUs can allow data-parallel computations to be carried out at rates that are orders of magnitude greater than those offered by a traditional central processing unit (CPU). However, while CPUs may be interrupted to handle higher priority tasks quickly (i.e., with low latency), no such mechanism currently exists for GPUs. That is, GPUs typically execute one task at a time and do not switch between tasks. To switch a GPU from one (lower priority) task to another (higher priority) task, the GPU must be permitted to complete its current computation or to “flush” its pipeline. One of ordinary skill in the art will understand that the granularity at which such a switch can occur (the “task granularity”) may be tied to a system's GPU architecture. In general, immediate-mode GPU architectures provide a finer level of granularity than do tiled-mode GPU architectures. The time required to effect a GPU task switch can be significant, especially in mobile devices with limited computational power (e.g., portable music devices, mobile telephones, electronic watches, digital cameras). For example, GPU task switch times on these types of devices may range from microseconds to milliseconds.