1. Field of the Invention
This invention relates to privatizing arrays, and particularly to a method of partially copying first and last private arrays for a parallelized loop based on array data flow.
2. Description of Background
Array data flow analysis is a known method for privatizing arrays. Such analysis is generally discussed in publications entitled “Automatic Array Privatization and Demand-driven Symbolic Analysis”, Peng Tu, 1995, Thesis, University of Illinois at Urbana-Champaign; “Efficient Interprocedural Array Data-Flow Analysis for Automatic Program Parallelization”, Junjie Gu and Zhiyuan Li, IEEE Transactions on Software Engineering vol 26 number 3, pages 244-261, March 2000; and “Automatic Parallelization of Recursive Procedures”, Manish Gupta and Sayak Mukhopadhyay and Navin Sinha, 1999, Proceedings of International Conference on Parallel Architectures and Compilation Techniques; each of which is incorporated by reference.
To prove that it is safe to privatize an array, every use of an array element must have a dominating definition in the same loop iteration. That is, every path from the top of the loop to the use of the array element must pass the definition site of the same array element. However, the approaches taken for first and last privatization in these methods are inefficient. In particular, first privatization of arrays is discouraged due to overhead problems.
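The condition above can be illustrated dynamically. The following is a minimal C sketch, not the static data-flow analysis of the cited publications: it records, for one executed iteration, which elements are read before they are defined. It checks only the path actually taken, whereas the safety condition must hold on every path. The names `iter_trace`, `trace_reset`, `trace_write`, `trace_read`, and `trace_exposed_count`, and the bound `MAX_ELEMS`, are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_ELEMS 64  /* illustrative bound on the array extent (assumption) */

/* Per-iteration record of which elements have been defined so far
   and which were read before any definition in the same iteration. */
typedef struct {
    bool defined[MAX_ELEMS];
    bool exposed[MAX_ELEMS];
} iter_trace;

/* Call at the top of each loop iteration. */
static void trace_reset(iter_trace *t) {
    memset(t, 0, sizeof *t);
}

/* Record a definition of element idx. */
static void trace_write(iter_trace *t, int idx) {
    assert(idx >= 0 && idx < MAX_ELEMS);
    t->defined[idx] = true;
}

/* Record a use of element idx; it is exposed if no definition preceded it. */
static void trace_read(iter_trace *t, int idx) {
    assert(idx >= 0 && idx < MAX_ELEMS);
    if (!t->defined[idx]) {
        t->exposed[idx] = true;
    }
}

/* Number of elements read before being defined in this iteration. */
static int trace_exposed_count(const iter_trace *t) {
    int n = 0;
    for (int i = 0; i < MAX_ELEMS; i++) {
        n += t->exposed[i] ? 1 : 0;
    }
    return n;
}
```

An iteration whose trace reports zero exposed elements on every path could use an uninitialized private copy; any exposed element needs its value brought in from outside the iteration.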
The following example shows a loop that requires first and last privatization:
      a = 5
      do 40 j = 1, n
        do 20 i = 2, m
          a(i) = b(i) + c(i)
 20     continue
        do 30 i = 2, m
          a(i) = a(i-1) + 4
 30     continue
 40   continue
      print *, a(2:m)
Elements 2 to m of array a are defined in the first inner loop, while the second inner loop accesses elements 1 to m of the same array. The reference to the first element of a, made when i = 2 in the second inner loop, has no definition inside the loop and is therefore exposed to the initial definition that sets a to 5. To privatize a, the initial values of a must be copied to each of the private versions of a. This is known as first privatization in OpenMP terms.
Similarly, elements 2 to m of a are used after the loop. To privatize a correctly, the final iteration of the loop must copy its private version of a back to the shared version. This is known as last privatization in OpenMP terms.
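The two whole-array copies described above can be sketched in C. This is a minimal sketch under stated assumptions, not a definitive implementation: the array a is assumed to have extent m, index 0 is left unused so that indices match the 1-based Fortran example, and the outer iterations are run in reverse order to stand in for arbitrary scheduling across processors. The function names `loop_body`, `run_serial`, and `run_privatized` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Body of one outer (j) iteration, applied to the array passed as a. */
static void loop_body(double *a, const double *b, const double *c, int m) {
    for (int i = 2; i <= m; i++) {
        a[i] = b[i] + c[i];
    }
    for (int i = 2; i <= m; i++) {
        a[i] = a[i - 1] + 4.0;
    }
}

/* Serial reference: every outer iteration works on the shared array. */
static void run_serial(double *a, const double *b, const double *c,
                       int m, int n) {
    for (int j = 1; j <= n; j++) {
        loop_body(a, b, c, m);
    }
}

/* Privatized version with whole-array copy-in and copy-out.
   Returns 0 on success, -1 on allocation failure. */
static int run_privatized(double *a, const double *b, const double *c,
                          int m, int n) {
    assert(m >= 1 && n >= 1);
    size_t bytes = (size_t)(m + 1) * sizeof(double);
    double *last = malloc(bytes);
    if (!last) {
        return -1;
    }
    for (int j = n; j >= 1; j--) {       /* reverse order: any order must work */
        double *priv = malloc(bytes);
        if (!priv) {
            free(last);
            return -1;
        }
        memcpy(priv, a, bytes);          /* first privatization: copy in */
        loop_body(priv, b, c, m);
        if (j == n) {
            memcpy(last, priv, bytes);   /* remember the final iteration's values */
        }
        free(priv);
    }
    memcpy(a, last, bytes);              /* last privatization: copy out */
    free(last);
    return 0;
}
```

The copy-out is deferred until all iterations have finished so that no iteration copies in values already overwritten by the final one. With a initialized to 5, both versions leave a(i) = 5 + 4(i-1) for i = 2 to m.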
In both cases, the existing algorithms copy the entire contents of the array in or out. This is undesirable because copying large arrays is expensive. Since array sizes are often symbolic, it is difficult for a compiler to know at compile time whether the gains of parallelization will outweigh the cost of copying the array. The first privatization case is further complicated because, if the loop is parallelized, multiple processors will be copying from the shared global array, which could result in cache conflicts. For these reasons, first privatization of arrays is typically not done.
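To make the copying cost concrete for the example loop, the following C sketch restricts the copies to the elements the text identifies: the first element of a on the way in, and elements 2 to m on the way out. It is an illustration under assumptions, not a statement of any particular algorithm: a is assumed to have extent m, index 0 is unused to keep 1-based indexing, the iterations run sequentially, and a single private buffer is reused in the way one thread would reuse its private copy. The names `copy_stats`, `example_body`, `run_partial`, and `full_copy_elements` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    long copied_in;   /* elements copied from shared to private storage */
    long copied_out;  /* elements copied from private back to shared storage */
} copy_stats;

/* Body of one outer iteration of the example loop. */
static void example_body(double *a, const double *b, const double *c, int m) {
    for (int i = 2; i <= m; i++) {
        a[i] = b[i] + c[i];
    }
    for (int i = 2; i <= m; i++) {
        a[i] = a[i - 1] + 4.0;
    }
}

/* Elements moved when every iteration copies the whole array in
   and the final iteration copies the whole array out. */
static long full_copy_elements(int m, int n) {
    return (long)n * m + m;
}

/* Copy in only a(1) and copy out only a(2:m), counting elements moved.
   Returns 0 on success, -1 on allocation failure. */
static int run_partial(double *a, const double *b, const double *c,
                       int m, int n, copy_stats *st) {
    assert(m >= 1 && n >= 1);
    st->copied_in = 0;
    st->copied_out = 0;
    double *priv = malloc((size_t)(m + 1) * sizeof(double));
    if (!priv) {
        return -1;
    }
    for (int j = 1; j <= n; j++) {
        priv[1] = a[1];                  /* the only element read before being defined */
        st->copied_in += 1;
        example_body(priv, b, c, m);
        if (j == n) {                    /* final iteration supplies the values used later */
            for (int i = 2; i <= m; i++) {
                a[i] = priv[i];
            }
            st->copied_out += m - 1;
        }
    }
    free(priv);
    return 0;
}
```

For m = 6 and n = 3, the whole-array scheme moves 24 elements, while the restricted copies move 3 in and 5 out. The gap widens as m grows, since the copy-in cost per iteration no longer depends on the array size.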
These types of opportunities are typically seen in biological algorithms that set initial conditions on the boundary of an array and then iteratively fill in the remaining values.