1. Field of the Invention
Embodiments of the present invention generally relate to binary code instrumentation and program parallelization and, more specifically, to a method and apparatus for automatic parallelization of certain program regions using a collection of analysis techniques.
2. Description of the Related Art
Over the last decade, a new standard for expressing parallelism has emerged. The OpenMP committee (www.openmp.org) has created special annotations for expressing the notion that a certain region or regions of a computer program may be executed in parallel. These annotations also provide a means to describe the ways in which program memory is used, so that parallel threads of execution can avoid interfering with one another.
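By way of illustration only, such an annotation might appear as in the following C sketch. The function name `sum_of_squares` and its parameters are hypothetical and do not come from any particular application. The `parallel for` directive states that the loop iterations may be executed in parallel, and the `reduction` clause describes how the variable `sum` is used, so that threads do not interfere with one another when updating it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: sum of the squares of n values.
   The pragma is an OpenMP annotation; a compiler without OpenMP
   support ignores it and runs the loop sequentially. */
double sum_of_squares(const double *x, size_t n)
{
    double sum = 0.0;

    assert(x != NULL || n == 0);   /* precondition on the input array */

    /* parallel for: the iterations may be divided among threads.
       reduction(+:sum): each thread accumulates into a private copy
       of sum, and the copies are added together after the loop. */
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < (long)n; i++) {
        sum += x[i] * x[i];
    }
    return sum;
}
```

Writing even this small annotation correctly presupposes knowing that `sum` is the only location shared between iterations and that it is used purely as an accumulator.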
Because modern compilers support this standard, modern computer systems are multiprocessor systems, and sequential computer applications contain non-trivial amounts of implicit parallelism, one would expect these annotations to be in widespread use. Unfortunately, this is not yet the case, because the annotations can currently be written only by someone possessing a fair amount of expertise in the semantics of parallel execution, together with a fairly intimate knowledge of the application source.
Today's parallelizing compilers are generally built using techniques of static analysis and, in particular, abstract interpretation. The goal of abstract interpretation is to prove something about all possible program runs. However, it is not currently possible to prove many interesting properties about today's applications because of their vast complexity. In fact, most parallelizing compilers cannot decide whether any interprocedural region is parallel, and therefore fail to parallelize important program loops that may in fact be parallel.
For the purpose of parallelization, it is less important to prove that a property holds for all possible program runs than to prove that it holds for all “interesting” program runs. This contrasts with, say, a safety analysis, whose goal is to verify that a property holds for all program runs, so that every possible run is “interesting”. A reasonable assumption about the predictability of memory access patterns can help to discover and exploit presumably parallel regions.
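The distinction can be made concrete with a small C sketch. It is offered only as an illustration under stated assumptions and is not a description of the method of the invention. All names (`scatter_add`, `observed_writes_disjoint`, `idx`, `m`) are hypothetical. Whether the loop in `scatter_add` is parallel depends on the run-time contents of `idx`, which a static analysis generally cannot know. A check performed on an observed run can nevertheless establish that, for that run, no two iterations wrote the same location.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical loop: iterations are independent only if idx holds
   no repeated value, which is not visible at compile time. */
void scatter_add(double *a, const size_t *idx, const double *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a[idx[i]] += b[i];
    }
}

/* Returns true if, in this observed run, no two of the n iterations
   write the same element of an array of m elements. This says nothing
   about other runs; it only supports a presumption of parallelism. */
bool observed_writes_disjoint(const size_t *idx, size_t n, size_t m)
{
    if (n == 0) {
        return true;                 /* no iterations, no conflicts */
    }

    bool *seen = calloc(m, sizeof *seen);
    if (seen == NULL) {
        return false;                /* conservative answer on failure */
    }

    bool disjoint = true;
    for (size_t i = 0; i < n && disjoint; i++) {
        assert(idx[i] < m);          /* index must lie inside the array */
        if (seen[idx[i]]) {
            disjoint = false;        /* two iterations hit one element */
        }
        seen[idx[i]] = true;
    }

    free(seen);
    return disjoint;
}
```

If the observed index patterns are predictable from run to run, a result of `true` suggests that the loop is a candidate for parallel execution, whereas a result of `false` shows that at least one run requires the sequential order.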
Therefore, there is a need in the art for a method and apparatus to determine where such presumably parallel regions occur and to optimize applications to exploit such regions.