The present invention relates to a distribution processing for a multi processor system, and more particularly to a method of rearranging data for the distribution processing and a compiler that performs the data rearrangement.
As a technique for speeding up data redistribution in a distributed-memory type multi processor system, as mentioned on pp. 170-171 in Daniel J. Palermo, Eugene W. Hodges IV, and Prithviraj Banerjee. xe2x80x9cDynamic Data Partitioning for Distributed-Memory Muticomputersxe2x80x9d, Journal of Parallel and Distributed Computing 38, pp. 158-175 (1996), there has existed a technique of allowing a starting-up overhead in the data redistribution to be reduced by aggregation of the data redistribution operation. Here, the aggregation means the following: A plurality of arrays that are to undergo the data redistribution or a non-consecutive portion in one array that is to undergo the data redistribution are copied to a consecutive address region in one buffer at the time of executing the program, thereby performing the data redistribution toward the buffer only a single time.
Also, as a technique for reducing an inter-cache conflict that is referred to as xe2x80x9cfalse sharingxe2x80x9d in a shared-memory type multi processor system, as described in Jennifer M. Anderson, Saman P. Amarasinghe, and Monica S. Lam. xe2x80x9cData and Communication Transformations for Multi processorsxe2x80x9d, PPOPP"" 95, pp. 166-178 (1995), there has existed the following technique: Of dimensions of an array, a dimension used for specifying array elements the addresses of which are consecutive is partitioned into dimensions the number of which is equal to that of the processors, thus partitioning the dimension into two dimensions, i.e., a dimension including only the subscripts that a single processor accesses and a dimension designating the processor number. Moreover, the array in the latter dimension is expanded so that the latter dimension becomes a dimension where an increase in the addresses is the most gradual one, thereby reducing an opportunity that data in the caches spreads across two processors.
In the first conventional technique described above, since the data is copied to the buffer at the time of executing the program, there exists a problem that it takes an extra time to make the copy.
Also, in the first conventional technique described above, there is a possibility that sizes of the arrays are varied before and after the data redistribution. Accordingly, it is required to ensure and open the array region every time the data redistribution is performed, and thus there exists a problem that it takes an extra time therefor.
Further, in the first conventional technique, the array format changes by the array region allotment and release and before and after the redistribution. It requires the copy of subarray which assigned to the same processor before and after the redistribution, which copy is not necessary in normal process, and it also requires extra time.
In the second prior art, in case that a program has two different loops and respective loop access the array differently, the data conversion is made to reduce the false sharing only for one loop. It is still a problem that the false sharing may highly occur for the other loop.
It is an object of the present invention to eliminate a processing of copying data to a buffer and thus to shorten an execution time of a program, the data being sent and received among the processors when a distribution processing is executed in a multi processor system having distributed-memories.
Also, it is another object of the present invention to eliminate a processing of ensuring and opening an array region at the time of redistributing data to each of the processors, and thus to shorten the execution time of a program.
Also, it is still another object of the present invention to eliminate copy of a subarray allotted to one and the same processor before and after the data redistribution, and thereby to shorten the execution time of a program.
Also, it is a further object of the present invention to reduce the inter-cache conflict toward all the loops when a distribution processing is executed in a multi processor system having a shared-memory, and thereby to shorten the execution time of a program.
The present invention provides a data rearranging method for accomplishing the above-described objects.
In a data rearranging method for performing a distribution processing of an array in a multi processor system including a plurality of processors each of which has an independent memory, there are included the following steps of collecting data distribution formats that indicate schemes of allocating the respective array elements to the memories at the respective points in a program, and subdividing the array into sets that include subarrays, each of the subarrays belonging to only one processor in any one of the data distribution formats obtained at the above-described format collecting step, and increasing the number of dimensions of the original array and rearranging sequence of array elements so that the array elements within each of the above-described subarrays possess consecutive addresses.
Also, in a data rearranging method for performing a distribution processing of an array in a multi processor system including a plurality of processors that share a single memory, there are included the following steps of collecting data assignment formats that, for the array to be distribution-processed by the plurality of processors, indicate schemes of computation-assigning the respective array elements to the plurality of processors at the respective points in a program, and subdividing the array into a set of subarrays, each of the subarrays being assigned to only one processor in any one of the data assignment formats obtained at the above-described format collecting step, and increasing the number of dimensions of the original array and rearranging sequence of array elements so that the array elements within each of the above-described subarrays possess consecutive addresses.