Closely coupled processors or hardware resources will become widely available in the near future. Examples of such closely coupled processors (or hardware resources) may include additional processors, threads in a particular processor, additional cores in a central processing unit, additional processors mounted on the same substrate or board, and/or such devices provided within computers connected by a network fabric into a cluster, a grid, or a collection of resources.
Certain computations (e.g., those that use parallel processing or parallel programming) may benefit from the availability of such hardware resources. For example, a complex simulation may run faster if the simulation is divided into portions and the portions are run simultaneously on a number of processing devices in a parallel fashion. Parallel computing arrangements may include a controller that determines how an application should be divided and which application portions go to which parallel processors. For example, a host computer that is running a simulation may act as the controller for a number of parallel processors. The parallel processors may receive instructions and/or data from the controller and may return results to the controller.
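By way of illustration only, the controller arrangement described above may be sketched as follows. The sketch assumes a pool of threads standing in for the parallel processors; the names split_into_portions, simulate_portion, and run_controller are hypothetical and do not come from any particular system. A real arrangement might instead use separate processes or networked machines.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_portions(data, num_portions):
    """Controller step: divide the work into contiguous, near-equal portions."""
    size, extra = divmod(len(data), num_portions)
    portions, start = [], 0
    for i in range(num_portions):
        # The first `extra` portions each take one additional element.
        end = start + size + (1 if i < extra else 0)
        portions.append(data[start:end])
        start = end
    return portions


def simulate_portion(portion):
    """Worker step: hypothetical stand-in for one portion of a simulation."""
    return sum(x * x for x in portion)


def run_controller(data, num_workers=4):
    """Send each portion to a worker and combine the returned results."""
    portions = split_into_portions(data, num_workers)
    # Assumption: threads play the role of the parallel processors here.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_results = list(pool.map(simulate_portion, portions))
    return sum(partial_results)
```

For example, run_controller(list(range(10)), num_workers=3) would divide ten values into three portions, evaluate each portion on a worker, and sum the partial results returned to the controller.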
Some serial programs include profiler infrastructures that collect statistics and other information about the execution of the serial program. The statistics may be used to detect performance problems associated with the serial program. In contrast, it may be difficult to locate where a performance bottleneck or algorithm deficiency occurs in a parallel program because there can be many more dimensions of data to collect.
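As one possible illustration of the serial case, the following sketch collects execution statistics for a small serial program using the cProfile and pstats modules of the Python standard library. The functions slow_step, serial_program, and profile_serial_program are hypothetical stand-ins and are not taken from any particular profiler infrastructure.

```python
import cProfile
import io
import pstats


def slow_step(n):
    """Hypothetical hot spot in a serial program."""
    return sum(i * i for i in range(n))


def serial_program():
    """Hypothetical serial program that calls the hot spot repeatedly."""
    return [slow_step(20_000) for _ in range(5)]


def profile_serial_program():
    """Run the serial program under a profiler and return a text report."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = serial_program()
    profiler.disable()

    # Write the statistics to a string, ordered by cumulative time.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()
```

In such a report, a single list of per-function call counts and times may be enough to locate a performance problem. A parallel program would additionally require such statistics to be collected and correlated for each processing device.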