U.S. Pat. No. 5,852,734 describes a parallel compiler for increasing the speed of program execution by decomposing a loop onto a plurality of processors and executing the resulting parts in parallel. First, a loop that is to be executed in parallel is located in a source program. This loop is then analyzed for data dependence, and the result of the analysis is used for calculating data dependence vectors. Then all areas of the index space executed in the loop are decomposed and assigned to a number of processors. Further, it is determined whether data needs to be transferred between processors. Based on the array index space, communication vectors are calculated. The data dependence vectors and communication vectors are ANDed to calculate communication dependence vectors. The manner of communication of operands and of loop execution is then determined based on the values of the communication dependence vectors.
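Two of the steps summarized above, decomposing the loop index space across processors and ANDing two vectors, can be illustrated with a minimal sketch. The sketch is not taken from U.S. Pat. No. 5,852,734: the function names, the contiguous block decomposition, and the representation of each vector as a list of booleans are all assumptions made purely for illustration.

```python
def decompose(n_iters, n_procs):
    """Split the index space range(n_iters) into contiguous blocks,
    one per processor (hypothetical block decomposition)."""
    if n_procs <= 0:
        raise ValueError("n_procs must be positive")
    base, extra = divmod(n_iters, n_procs)
    blocks, start = [], 0
    for p in range(n_procs):
        # The first `extra` processors take one additional iteration.
        size = base + (1 if p < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def communication_dependence(data_dep, comm):
    """Element-wise AND of a data dependence vector and a communication
    vector, both modeled here (by assumption) as lists of booleans."""
    if len(data_dep) != len(comm):
        raise ValueError("vectors must have the same length")
    return [d and c for d, c in zip(data_dep, comm)]


if __name__ == "__main__":
    print(decompose(10, 4))
    print(communication_dependence([True, False, True], [True, True, False]))
```

In this toy model, an entry of the resulting vector is set only where a data dependence and a need for inter-processor communication coincide.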
A problem with the arrangement of U.S. Pat. No. 5,852,734, and indeed with other known parallelization systems, is that they are very difficult to debug, which is a significant practical barrier. The difficulty arises because known systems are generally non-deterministic in nature: some parts of the parallelized algorithm depend on other parts finishing first. For various reasons, such as random processor errors, the timing of parts of the program can be disrupted, which can cause the parallelized algorithm to fail. Because timing errors can be due to random events, tracking down this type of problem is extremely difficult, and system reliability therefore cannot be guaranteed.
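The timing dependence described above can be illustrated with a small sketch in which the order of execution is passed in explicitly, so that both outcomes can be reproduced on demand. The task names and the shared dictionary are hypothetical; in a real parallel system the order would be decided by the scheduler and the hardware rather than by the caller.

```python
def run(schedule):
    """Execute two interdependent tasks in the given order and return
    the value computed by the consumer."""
    state = {"x": 0, "y": None}

    def producer():
        # Writes the operand that the consumer relies on.
        state["x"] = 41

    def consumer():
        # Implicitly assumes the producer has already finished.
        state["y"] = state["x"] + 1

    tasks = {"producer": producer, "consumer": consumer}
    for name in schedule:
        tasks[name]()
    return state["y"]


if __name__ == "__main__":
    print(run(["producer", "consumer"]))  # intended order
    print(run(["consumer", "producer"]))  # disrupted timing
```

The same two tasks yield different results depending only on which finishes first. When the order is set by random timing events rather than by an explicit argument, the failing case may occur only rarely, which is why such faults are hard to track down.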
Furthermore, other known systems assume a single shared memory space. New architectures offer multiple memory spaces to increase execution performance. It is therefore essential that any high-performance parallelization system that claims to work on multiple hardware platforms support multiple memory spaces.