Parallel programming models may support task-parallelism, data-parallelism, or both in order to solve computational problems. Task-parallelism may allow a computational problem to be divided into multiple tasks, which may be executed sequentially, concurrently, and/or in parallel on one or more processor cores. Data-parallelism may allow the same set of operations to be performed in parallel on different sets of data by distributing the data to different processing elements and causing each processing element to perform the same set of operations on its assigned set of data.
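The distinction between the two forms of parallelism may be illustrated with a minimal sketch, which is not taken from any particular programming model. The function names (scale, split, data_parallel, task_parallel), the worker count, and the choice of a thread pool are assumptions made only for illustration. In the data-parallel case, every worker applies the same operation to its own assigned chunk of the data; in the task-parallel case, two different tasks are submitted over the same input.

```python
from concurrent.futures import ThreadPoolExecutor


def scale(chunk, factor=2):
    """The single operation that every worker applies to its assigned chunk."""
    return [x * factor for x in chunk]


def split(data, parts):
    """Divide data into `parts` contiguous chunks of nearly equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def data_parallel(data, workers=4):
    """Data-parallelism: same operation, different sets of data."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in chunk order, so the output order is preserved.
        results = list(pool.map(scale, split(data, workers)))
    return [y for chunk in results for y in chunk]


def task_parallel(data):
    """Task-parallelism: different tasks that may run concurrently."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        total = pool.submit(sum, data)    # task 1: sum of the elements
        largest = pool.submit(max, data)  # task 2: largest element
        return total.result(), largest.result()
```

The thread pool here shows only the structure of the two approaches. Whether the work actually runs in parallel depends on the runtime and the hardware; in standard CPython, for example, threads executing pure Python code are generally serialized by the global interpreter lock.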
Multi-core processors may be used to support task-parallelism, where each core is configured to execute a particular task. In some cases, one or more of the cores in a multi-core processor may be a single instruction, multiple data (SIMD) processor or a single program, multiple data (SPMD) processor that includes multiple processing elements to support data-parallelism. In such cases, tasks that support data-parallelism may be executed either sequentially or in parallel on the multi-core processor.
Several different types of processors may support task-parallelism and/or data-parallelism, including a multi-core central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), and a Cell Broadband Engine (Cell/B.E.) processor. Although GPUs were traditionally designed to support the rendering of three-dimensional (3D) graphics to a display, the programmable shader architecture included in many modern GPUs can be used to efficiently support both the task-parallelism and the data-parallelism found in general-purpose, non-graphics-specific programs that are programmed using a parallel programming model. Using the parallel architecture of a GPU to execute non-graphics-specific programs may be referred to as general-purpose computing on graphics processing units (GPGPU).