SPMD (Single Program Multiple Data) is a popular parallel programming paradigm. Typically, SPMD-style programs or the like have a barrier synchronization primitive that can be used to partition the program into a sequence of parallel phases. When a thread reaches a barrier statement, it cannot proceed until all other threads have arrived at the barrier statement. Barriers are textually aligned if all threads must reach the same textual barrier statement before they can proceed. A barrier synchronization error occurs, for example, if a thread bypasses a barrier, leaving the remaining threads stalled.
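The barrier synchronization error described above can be illustrated with a minimal sketch. It uses Python's standard threading.Barrier as a stand-in for an SPMD barrier primitive; the function name run_phase, its parameters, and the use of a timeout to detect stalled threads are assumptions made for illustration and are not taken from any of the cited systems.

```python
import threading

def run_phase(num_threads, skipper, timeout=0.5):
    """Run one barrier-delimited phase (hypothetical helper).

    The thread whose id equals `skipper` bypasses the barrier.
    Returns the sorted ids of the threads left stalled at the barrier.
    """
    barrier = threading.Barrier(num_threads)
    stalled = []
    lock = threading.Lock()

    def worker(tid):
        if tid == skipper:
            return  # bypasses the barrier: this is the synchronization error
        try:
            # Without a timeout the remaining threads would wait forever.
            barrier.wait(timeout=timeout)
        except threading.BrokenBarrierError:
            with lock:
                stalled.append(tid)

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(stalled)

if __name__ == "__main__":
    print(run_phase(4, skipper=None))  # every thread reaches the barrier
    print(run_phase(4, skipper=0))     # thread 0 bypasses the barrier
```

In a real SPMD program there is usually no timeout, so the stalled threads would simply hang; the timeout here only makes the error observable.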
Popular parallel programming models such as MPI (The Message Passing Interface (MPI) standard. http://www-unix.mcs.anl.gov/mpi/) and OpenMP (OpenMP C/C++ Manual. http://www.openmp.org/specs/) place few or, in the case of MPI, no constraints on the placement of barrier statements in the program, so barriers may be textually unaligned. Textually unaligned barriers make it more difficult for programmers to understand the synchronization structure of the program, that is, its synchronization phases, and thus make it easier to write programs with synchronization errors. Textually unaligned barriers also hinder concurrency analysis (Evelyn Duesterwald and Mary Lou Soffa. Concurrency analysis in the presence of procedures using a data-flow framework. In Proceedings of the Symposium on Testing, Analysis, and Verification, pages 36-48, 1991; Stephen P. Masticola and Barbara G. Ryder. Nonconcurrency analysis. In Proceedings of the Fourth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming, pages 129-138, San Diego, Calif., May 1993; Tor E. Jeremiassen and Susan J. Eggers. Static analysis of barrier synchronization in explicitly parallel systems. In Proceedings of the IFIP WG 10.3 Working Conference on Parallel Architectures and Compilation Techniques, PACT '94, pages 171-180, Montréal, Québec, August 1994. North-Holland Publishing Company; Arvind Krishnamurthy and Katherine Yelick. Analyses and optimizations for shared address space programs. J. Parallel Distrib. Comput., 38(2):130-144, 1996; Yuan Lin. Static nonconcurrency analysis of OpenMP programs. In First International Workshop on OpenMP, 2005) because understanding which barrier statements form a common synchronization point is a prerequisite to analyzing the ordering constraints imposed by the program. Some concurrency analyses therefore require barriers to be named or textually aligned (Arvind Krishnamurthy and Katherine Yelick. Analyses and optimizations for shared address space programs. J. Parallel Distrib. Comput., 38(2):130-144, 1996; Tor E. Jeremiassen and Susan J. Eggers. Static analysis of barrier synchronization in explicitly parallel systems. In Proceedings of the IFIP WG 10.3 Working Conference on Parallel Architectures and Compilation Techniques, PACT '94, pages 171-180, Montréal, Québec, August 1994. North-Holland Publishing Company; Yuan Lin. Static nonconcurrency analysis of OpenMP programs. In First International Workshop on OpenMP, 2005).
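As a rough illustration of textually unaligned barriers, the following sketch has even-numbered and odd-numbered threads synchronize through two different barrier statements, labeled A and B, which together form one common synchronization point. It uses Python's threading.Barrier as an analogy only; it is not MPI or OpenMP code, and the function name run_unaligned and the site labels are hypothetical.

```python
import threading

def run_unaligned(num_threads):
    """Each thread records which textual barrier statement it reached."""
    barrier = threading.Barrier(num_threads)
    sites = [None] * num_threads  # one slot per thread, so no lock is needed

    def worker(tid):
        if tid % 2 == 0:
            sites[tid] = "A"
            barrier.wait(timeout=5)  # barrier statement A
        else:
            sites[tid] = "B"
            barrier.wait(timeout=5)  # barrier statement B

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sites

if __name__ == "__main__":
    # Statements A and B match each other at run time, although nothing
    # in the program text says so.
    print(run_unaligned(4))
```

The program is correct, but a reader or an analysis has to discover that statements A and B belong to the same synchronization point before any ordering constraints can be derived.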
Previous work on verifying barrier synchronization by Aiken and Gay (Alexander Aiken and David Gay. Barrier inference. In Proceedings of the 25th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 342-354, 1998) uses a barrier inference rule system that detects a class of synchronization errors. However, it requires user annotations to handle procedures, and the analysis does not explicitly compute the matching function among barriers. Aiken and Gay address verification of textually unaligned barriers and the related problem of determining multi-valued expressions with a set of inference rules implemented for Split-C (A. Krishnamurthy, D. Culler, A. Dusseau, S. Goldstein, S. Lumetta, T. von Eicken, and K. Yelick. Parallel Programming in Split-C. In Supercomputing '93 Proceedings, pages 262-273, November 1993), but their rule system cannot automatically handle procedures and relies on user annotations to describe the effect of procedures.
There have been other approaches to verifying synchronization in parallel programs using model checking (Stephen F. Siegel, Anastasia Mironova, George S. Avrunin, and Lori A. Clarke. Using model checking with symbolic execution to verify parallel numerical programs. In Proceedings of the 2006 International Symposium on Software Testing and Analysis, pages 157-168, 2006; Stephen F. Siegel and George S. Avrunin. Modeling wildcard-free MPI programs for verification. In Proceedings of the Tenth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 95-106, 2005). Techniques based on model checking do not share the assumption of structural correctness, but they are more expensive, which results in scalability problems. There have also been some efforts on static checking of shared memory programs. One such example is Calvin (Cormac Flanagan, Stephen N. Freund, Shaz Qadeer, and Sanjit A. Seshia. Modular verification of multithreaded programs. Theor. Comput. Sci., 338(1-3):153-183, 2005), which is based on automatic theorem proving.
Other related work includes barrier optimization approaches that optimize the usage of barriers (Alain Darte and Robert Schreiber. A linear-time algorithm for optimal barrier placement. In Proceedings of the Tenth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 26-35, 2005; Chau-Wen Tseng. Compiler optimizations for eliminating barrier synchronization. In Proceedings of the Fifth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming, pages 144-155, Santa Barbara, Calif., July 1995; Michael O'Boyle and Elena Stohr. Compile time barrier synchronization minimization. IEEE Trans. Parallel Distrib. Syst., 13(6):529-543, 2002) by eliminating unnecessary barriers or optimizing the placement of barriers. Some research identifies communication patterns, such as send/receive pairs, in MPI programs (Shuyi Shao, Alex K. Jones, and Rami Melhem. A compiler based communication analysis approach for multiprocessor systems, 2006).
The multi-valued expression problem was first addressed by the inference rule system of Aiken and Gay. Aiken and Gay suggest introducing a "single" qualifier, as was done in the Titanium language (P. Hilfinger, D. Bonachea, K. Datta, D. Gay, S. Graham, B. Liblit, G. Pike, J. Su, and K. Yelick. Titanium language reference manual. Technical Report UCB/EECS-2005-15, U. C. Berkeley, 2005), to explicitly describe expressions that are single-valued.
There has been a body of work on concurrency analysis of parallel programs, including SPMD programs (David Callahan, Ken Kennedy, and Jaspal Subhlok. Analysis of event synchronization in a parallel programming tool. In Proceedings of the Second ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming, pages 21-30, Seattle, Wash., March 1990; Evelyn Duesterwald and Mary Lou Soffa. Concurrency analysis in the presence of procedures using a data-flow framework. In Proceedings of the Symposium on Testing, Analysis, and Verification, pages 36-48, 1991; Stephen P. Masticola and Barbara G. Ryder. Nonconcurrency analysis. In Proceedings of the Fourth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming, pages 129-138, San Diego, Calif., May 1993). Concurrency analysis uses the synchronization constructs in the program to determine which portions of the program may execute in parallel. Some concurrency analyses focus on analyzing the barriers in the program to establish concurrency information (Tor E. Jeremiassen and Susan J. Eggers. Static analysis of barrier synchronization in explicitly parallel systems. In Proceedings of the IFIP WG 10.3 Working Conference on Parallel Architectures and Compilation Techniques, PACT '94, pages 171-180, Montréal, Québec, August 1994. North-Holland Publishing Company; Arvind Krishnamurthy and Katherine Yelick. Analyses and optimizations for shared address space programs. J. Parallel Distrib. Comput., 38(2):130-144, 1996; Yuan Lin. Static nonconcurrency analysis of OpenMP programs. In First International Workshop on OpenMP, 2005). However, these approaches do not verify the correctness of barrier synchronization.