Computer applications having concurrent threads and executed on multiple processors present great promise for increased performance but also present great challenges to developers. The growth of raw sequential processing power has flattened as processor manufacturers have reached roadblocks in providing significant increases in processor clock frequency. Processors continue to evolve, but the current focus for improving processor power is to provide multiple processor cores on a single die to increase processor throughput. Sequential applications that previously benefited from increased clock speed obtain significantly less benefit as the number of processor cores increases. In order to take advantage of multi-core systems, concurrent (or parallel) applications are written to include concurrent threads distributed over the cores. Parallelizing applications, however, is challenging in that many common tools, techniques, programming languages, and frameworks, and even the developers themselves, are adapted to creating sequential programs.
To write effective parallel code, a developer often identifies opportunities for the expression of parallelism and then maps the execution of the code to the multi-core hardware. These tasks can be time-consuming, difficult, and error-prone because there are so many independent factors to track. Current tools enable a developer to determine a percentage of processor use as a function of time. These tools are intended for sequential applications, however, in that they provide no meaningful insight into opportunities to express parallelism and no information on how processor cores are utilized. Understanding the behavior of parallel applications, and their interactions with other processes that share the processing resources of a computing device, remains a challenge with current developer tools.