In parallel computing environments, multiple processors are used to execute parallel processes. Data parallel computation involves assigning a portion of a data set as input to each of multiple parallel processes so that the portions may be processed in parallel. Often, data parallel computation is offloaded to specialized hardware or devices such as a General-Purpose Graphics Processing Unit (GPGPU).
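The partitioning described above can be illustrated with a minimal CPU-only sketch. It assumes a simple element-wise operation (squaring) and uses Python's standard `concurrent.futures` thread pool as a stand-in for the parallel processes. The names `split_into_portions`, `process_portion`, and `data_parallel_map` are hypothetical and chosen only for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_portions(data, num_portions):
    """Split data into num_portions contiguous slices of near-equal size."""
    size, extra = divmod(len(data), num_portions)
    portions = []
    start = 0
    for i in range(num_portions):
        # The first `extra` portions each take one additional element.
        end = start + size + (1 if i < extra else 0)
        portions.append(data[start:end])
        start = end
    return portions


def process_portion(portion):
    """Hypothetical per-portion work: square every element."""
    return [x * x for x in portion]


def data_parallel_map(data, num_workers=4):
    """Assign one portion to each worker, then reassemble results in order."""
    portions = split_into_portions(data, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # pool.map preserves the order of the input portions.
        results = list(pool.map(process_portion, portions))
    return [y for part in results for y in part]
```

Each worker receives the same operation and a different slice of the input, which is the defining property of data parallelism. In CPython, threads show the structure rather than deliver a speedup for this kind of work; a process pool or an accelerator device would supply the actual parallel hardware.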
One way this offloading may occur is through the DirectX Application Programming Interfaces (APIs), specifically DirectCompute. The user authors a program in a higher-level language. The program is then compiled into a routine often called a data parallel kernel or “shader”. The kernel is then loaded onto the device for execution using the DirectX APIs.
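The author, compile, load, and execute sequence can be mimicked on the CPU with a short sketch. This is an analogy only: it does not call DirectX or DirectCompute, and the names `KERNEL_SOURCE`, `scale_kernel`, `compile_kernel`, and `dispatch` are hypothetical. It assumes a kernel is a function invoked once per thread index, and it uses Python's built-in `compile` and `exec` to stand in for the shader compiler and for loading the kernel onto a device.

```python
# Step 1: the user authors the kernel as source text.
KERNEL_SOURCE = """
def scale_kernel(thread_id, src, dst, factor):
    # One invocation handles exactly one element, selected by its thread id.
    dst[thread_id] = src[thread_id] * factor
"""


def compile_kernel(source, entry_point):
    """Step 2: compile the source and return the named entry point."""
    code = compile(source, "<kernel>", "exec")
    namespace = {}
    exec(code, namespace)
    return namespace[entry_point]


def dispatch(kernel, thread_count, *args):
    """Steps 3 and 4: run the kernel once per thread index.

    A real device would run these invocations in parallel; this
    stand-in runs them one after another.
    """
    for thread_id in range(thread_count):
        kernel(thread_id, *args)


kernel = compile_kernel(KERNEL_SOURCE, "scale_kernel")
src = [1.0, 2.0, 3.0, 4.0]
dst = [0.0] * len(src)
dispatch(kernel, len(src), src, dst, 2.0)
```

The sketch keeps the kernel free of any loop over the data: the dispatch step decides how many invocations occur, which is what allows a device to spread them across its processors.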