The architectures of High Performance Computer (“HPC”) systems support increasing levels of parallelism, in part because of advances in processor technology. An HPC system may have thousands of nodes, with each node having 32, 64, or even more processors (e.g., cores). In addition, each processor may have hardware support for a large number of threads. The nodes may also have accelerators, such as GPUs and SIMD units, that provide support for multithreading and vectorization.
Current computer programs are typically developed to use a single level of parallelism. As a result, these computer programs cannot take full advantage of the increasing numbers of cores and threads. To use more of the available computing resources, these computer programs will need to be converted by adding levels of parallelism. Because of the complexities of the architectures of such HPC systems and because of the increasing complexity of computer programs, it can be a challenge to convert existing computer programs, or even to develop new ones, so that they take advantage of the high level of parallelism. Although significant advances in compiler technology have been made in support of increased parallelism, compilers still depend in large part on programmers to provide directives that guide the compilers in determining which portions of a program can be parallelized. Similarly, because of these increased complexities in the architectures and computer programs, programmers can find it challenging to generate code that takes advantage of such parallelism or even to determine which compiler directives would be effective at guiding a compiler. An incorrect compiler directive, or an incorrect decision made by a compiler, may result in a compiled program that behaves incorrectly, and such errors can be very difficult to detect and correct. Moreover, it can be difficult even to determine whether such complex computer programs are behaving correctly.
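By way of illustration only, the following is a minimal sketch of what programmer-supplied directives and multiple levels of parallelism may look like. It assumes a C compiler with OpenMP support; OpenMP is not named above and is used here merely as one example of a directive-based approach. The function names scale_rows and prefix_sum are hypothetical. The first function combines thread-level and vector-level parallelism; the second contains a loop-carried dependence, so a parallelizing directive on its loop would be an example of an incorrect directive.

```c
#include <stddef.h>

/* Two levels of parallelism expressed with directives (assumed: OpenMP).
   Outer loop: rows are divided among threads.
   Inner loop: the elements of each row are vectorized (SIMD).
   Every element is updated independently, so both directives are safe. */
void scale_rows(double *a, int rows, int cols, double factor)
{
    #pragma omp parallel for
    for (int i = 0; i < rows; i++) {
        #pragma omp simd
        for (int j = 0; j < cols; j++) {
            a[(size_t)i * (size_t)cols + (size_t)j] *= factor;
        }
    }
}

/* In-place running (prefix) sum. Each iteration reads the value written
   by the previous iteration, which is a loop-carried dependence. Placing
   "#pragma omp parallel for" on this loop would be an incorrect directive:
   the program would still compile, but its results could be wrong.
   The loop is therefore left serial in this sketch. */
void prefix_sum(double *a, int n)
{
    for (int i = 1; i < n; i++) {
        a[i] += a[i - 1];
    }
}
```

When such code is compiled without OpenMP support enabled, the directives are ignored and the loops execute serially with the same results, which is one reason an incorrect directive may go unnoticed until the program is run in parallel.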