Parallel processing of an object program is an important function of a multiprocessor, since it allows each processor to be used effectively and improves operating performance. A known approach generates a parallelized object program by dividing a loop into a plurality of subloops in units of repeated statement groups (hereinafter referred to as iterations). Some conventional loop parallelization methods parallelize only loops that have no data dependency (hereinafter referred to as DOALL parallelization) (see Non-Patent Literature 1, for example). Here, a data dependency refers to a situation in which a certain statement B cannot be executed until another statement A completes. When a variable a causes the data dependency (for example, the statement A substitutes a certain value into the variable a, and the statement B refers to the variable a), the loop is said to have “a data dependency on the variable a”. DOALL parallelization is aimed at a loop that does not contain a data dependency, as shown in FIG. 9A, for example.
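FIG. 9A is not reproduced in this text, so the following is only a minimal sketch of the kind of loop that DOALL parallelization targets. The function name doall_scale, the arrays x and y, and the reversed flag (which emulates one possible non-sequential execution order) are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical dependency-free loop: iteration i touches only x[i] and y[i],
   so no iteration reads or writes data used by another iteration. */
void doall_scale(const int *x, int *y, size_t n, int b, int reversed)
{
    for (size_t k = 0; k < n; k++) {
        /* reversed != 0 emulates a schedule that differs from sequential order */
        size_t i = reversed ? n - 1 - k : k;
        y[i] = x[i] * b;
    }
}
```

Because the iterations share no data, calling doall_scale with reversed set to 0 or to 1 leaves the same values in y.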
When the loop has a data dependency and the above statements A and B belong to different iterations, the iteration to which the statement B belongs is dependent on the iteration to which the statement A belongs. When the loop has a data dependency between two or more iterations, parallelizing the loop is difficult because DOALL parallelization cannot simply be applied.
Data dependencies between iterations are classified into three types: a data dependency that occurs when a value is substituted into a variable in an iteration and the substituted value is then referred to in a subsequent iteration (hereinafter referred to as loop-carried dependency); a data dependency that occurs when a value of a variable is referred to in an iteration and, in a subsequent iteration, another value is substituted into the variable that has been referred to (hereinafter referred to as pre-reference dependency); and a data dependency that occurs when a value is substituted into a variable in an iteration and another value is then substituted into the same variable in a subsequent iteration (hereinafter referred to as output dependency).
The following explains specific examples of data dependencies between iterations, and the problems that occur in parallelizing them, with reference to FIGS. 9A-9D. A loop shown in FIG. 9B has a loop-carried dependency on a variable a. In the case where the loop is executed sequentially, the iteration using i=k+1 refers to the value of the variable a that results from executing the iteration using i=k. In contrast, in the case where the loop is parallelized by being divided in units of iterations, the iterations may be executed in an order different from the order of sequential execution. Accordingly, the value of the variable a is not necessarily x[k] when y[k+1]=a*b is executed. Therefore, there is a problem that the value of y[k+1] is not the same as the result of sequential execution of the loop.
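The effect can be illustrated with a small sketch. FIG. 9B is not reproduced here, so the loop body below is an assumption reconstructed from the statement y[k+1]=a*b and from the remark that a holds x[k] at that point. The function name carried_loop, the initial value a0, and the reversed flag are hypothetical.

```c
#include <stddef.h>

/* Hypothetical loop with a loop-carried dependency on the scalar a:
   each iteration reads the value of a written by the previous iteration. */
void carried_loop(const int *x, int *y, size_t n, int a0, int b, int reversed)
{
    int a = a0;
    for (size_t k = 0; k < n; k++) {
        /* reversed != 0 emulates iterations running out of sequential order */
        size_t i = reversed ? n - 1 - k : k;
        y[i] = a * b;   /* reads a */
        a = x[i];       /* writes a for whichever iteration runs next */
    }
}
```

With x = {1, 2, 3}, a0 = 0, and b = 2, the sequential order yields y = {0, 2, 4}, whereas the reversed order yields y = {4, 6, 0}, so the result depends on the execution order.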
An array c shown in FIG. 9C is an example of a variable with a pre-reference dependency. In this example, it is assumed that values are substituted into c[2]-c[101] before the loop is executed. When the loop is divided in units of iterations to be parallelized, the iterations may be executed in an order different from the order of sequential execution. An iteration using i=k+1 might therefore be executed prior to an iteration using i=k. In this case, the result of the iteration using i=k+1 has already been substituted into c[k+1] by the time the iteration using i=k is executed, so there is a problem that the value of z[k] is not the same as the result of sequential execution of the loop.
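A minimal sketch of this situation follows. FIG. 9C is not reproduced here; the only details taken from the description are that the iteration using i reads c[i+1] into z[i] and writes c[i]. The value written (z[i] + 100), the zero-based indexing, the function name prereference_loop, and the reversed flag are assumptions for illustration.

```c
#include <stddef.h>

/* Hypothetical loop with a pre-reference dependency on the array c:
   iteration i reads c[i+1], and iteration i+1 later overwrites c[i+1].
   c must hold n + 1 elements; z must hold n elements. */
void prereference_loop(int *c, int *z, size_t n, int reversed)
{
    for (size_t k = 0; k < n; k++) {
        /* reversed != 0 emulates iterations running out of sequential order */
        size_t i = reversed ? n - 1 - k : k;
        z[i] = c[i + 1];     /* reference */
        c[i] = z[i] + 100;   /* substitution (the value written is arbitrary) */
    }
}
```

Starting from c = {1, 2, 3, 4} with n = 3, the sequential order gives z = {2, 3, 4}, while the reversed order gives z = {204, 104, 4} because each c[i+1] has been overwritten before it is read.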
An array g shown in FIG. 9D is an example of a variable with an output dependency. When the loop is divided in units of iterations to be parallelized, the iterations may be executed in an order different from the order of sequential execution. An iteration using i=k+1 might therefore be executed prior to an iteration using i=k. In this case, the iteration using i=k overwrites the value that the iteration using i=k+1 has already substituted into g[k+1], so there is a problem that the final value of g[k+1] is not the same as the result of sequential execution of the loop.
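The following sketch illustrates the problem. FIG. 9D is not reproduced here; the only detail taken from the description is that the iterations using i=k and i=k+1 both write g[k+1]. The values written, the zero-based indexing, the function name output_loop, and the reversed flag are assumptions for illustration.

```c
#include <stddef.h>

/* Hypothetical loop with an output dependency on the array g:
   iteration i writes g[i] and g[i+1], so g[i+1] is written by both
   iteration i and iteration i+1. g must hold n + 1 elements. */
void output_loop(int *g, size_t n, int reversed)
{
    for (size_t k = 0; k < n; k++) {
        /* reversed != 0 emulates iterations running out of sequential order */
        size_t i = reversed ? n - 1 - k : k;
        g[i] = (int)i;   /* in sequential order this write to g[i] is the last one */
        g[i + 1] = -1;   /* placeholder, overwritten by iteration i+1 when run in order */
    }
}
```

With n = 3, the sequential order leaves g = {0, 1, 2, -1}, whereas the reversed order leaves g = {0, -1, -1, -1}, since the placeholder written by an earlier-numbered iteration lands last.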
Patent Literature 1 discloses a conventional method for parallelizing a loop containing a data dependency. FIG. 10 shows a compiler device disclosed in Patent Literature 1. A compiler device 800 receives, as input, a source program 700 that is written for sequential execution in FORTRAN, for example, and outputs an object program 900 for parallel execution. A loop parallelizing unit 820 included in the compiler device 800 includes a DOALL parallelizing unit 821 and a DOACROSS parallelizing unit 822. The DOALL parallelizing unit 821 parallelizes a loop that can be parallelized using DOALL parallelization. The DOACROSS parallelizing unit 822 parallelizes a loop that can be parallelized using pipeline parallelization (hereinafter referred to as DOACROSS parallelization) by inserting, into a multiloop, a statement for executing synchronization among processors.
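The code generated by the compiler device of Patent Literature 1 is not shown in this text. As a rough illustration of the general idea of synchronizing iterations so that the dependent part runs in order while the rest overlaps, the following sketch uses the OpenMP ordered construct on a single loop. The use of OpenMP, the function names doacross_sketch and independent_work, and the loop body are all assumptions and are not taken from Patent Literature 1.

```c
/* Hypothetical stand-in for per-iteration work that does not touch a. */
static int independent_work(int v)
{
    return v * v;
}

/* Sketch of pipeline-style execution: the independent work may overlap
   across threads, while the statements that read and write a are
   synchronized so that they run in iteration order.
   Returns the final value of a. */
int doacross_sketch(const int *x, int *y, int n, int a0, int b)
{
    int a = a0;
    #pragma omp parallel for ordered schedule(static, 1)
    for (int i = 0; i < n; i++) {
        int t = independent_work(x[i]);   /* no dependency between iterations */
        #pragma omp ordered
        {
            y[i] = a * b;   /* reads a written by iteration i-1 */
            a = t;          /* writes a for iteration i+1 */
        }
    }
    return a;
}
```

When compiled with OpenMP enabled (for example, with the -fopenmp option of GCC), the ordered region acts as the synchronization point between iterations. Without OpenMP the pragmas are ignored and the loop simply runs sequentially, producing the same values.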
Another conventional example is disclosed in Patent Literature 2. According to the compiler device disclosed in Patent Literature 2, when a loop has a pre-reference dependency, the value of the variable with the pre-reference dependency is stored in a temporary variable beforehand, so that another value is not substituted into the variable before its value is used. The pre-reference dependency is then resolved by replacing the variable with the temporary variable wherever the variable is referred to in a statement. As a result, the loop can be parallelized.
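The transformation can be sketched as follows. The actual code of Patent Literature 2 is not shown in this text, so this is only an illustration of the general idea under stated assumptions: the loop reads c[i+1] into z[i] and writes c[i], the whole array is copied into a temporary array tmp before the loop, and the function name prereference_with_temp and the reversed flag (emulating a non-sequential execution order) are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Sketch: resolve a pre-reference dependency on c by reading from a copy.
   c must hold n + 1 elements; z must hold n elements.
   Returns 0 on success, -1 if the temporary array cannot be allocated. */
int prereference_with_temp(int *c, int *z, size_t n, int reversed)
{
    int *tmp = malloc((n + 1) * sizeof *tmp);
    if (tmp == NULL) {
        return -1;
    }
    /* Save the values of c before any iteration can overwrite them. */
    memcpy(tmp, c, (n + 1) * sizeof *tmp);

    for (size_t k = 0; k < n; k++) {
        /* reversed != 0 emulates iterations running out of sequential order */
        size_t i = reversed ? n - 1 - k : k;
        z[i] = tmp[i + 1];   /* the reference now uses the temporary copy */
        c[i] = z[i] + 100;   /* the substitution into c no longer affects reads */
    }

    free(tmp);
    return 0;
}
```

Starting from c = {1, 2, 3, 4} with n = 3, both execution orders give z = {2, 3, 4}, because no iteration reads a location that another iteration writes. The cost of this approach is the extra memory and the copy performed before the loop.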