1. Field of the Invention
The present invention relates to a parallel process execution method, a multiprocessor computer, a parallel process execution program, and a storage medium storing that program. More particularly, the present invention relates to a parallel process execution method that executes parallel processes and other processes in a time-shared manner, a multiprocessor computer to perform that method, a parallel process execution program to make a computer perform that method, and a storage medium storing that program.
2. Description of the Related Art
A computer composed of a plurality of processors (multiprocessor computer) can execute a single program with a plurality of processors in a parallel fashion. Programs that can be executed in parallel will hereafter be called parallel programs. To execute a parallel program, a plurality of parallel processes, which can run concurrently with each other, are produced from it. The parallel processes thus produced are assigned to different processors for parallel execution. The processors accomplish a series of processing tasks encoded in the parallel program, exchanging data with each other. Here we use the term “process” to refer to a unit task containing at least one thread. By the term “processors,” we mean processing elements which are known as, for example, central processing units (CPUs) or micro processing units (MPUs). In the rest of this section, we use the term “CPU” to refer to such processing elements for reasons of expediency.
Parallel processes have checkpoints at which each process communicates, or synchronizes, with other related processes by exchanging data. Consider two CPUs executing parallel processes that need each other's data. These two CPUs first proceed on their own until they reach a checkpoint and become ready to exchange their data. If one process has reached the checkpoint earlier than the other, the CPU of that process has to wait until the peer process also reaches the checkpoint. This situation of the former CPU is referred to as the synchronization wait state.
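By way of illustration only, the synchronization wait described above can be sketched as follows. The function name `sync_wait_times` and the use of abstract time units are assumptions made for this example; they do not appear in any cited publication. The sketch assumes that all processes leave the checkpoint as soon as the last one arrives.

```python
def sync_wait_times(arrival_times):
    """Return the synchronization wait of each process at one checkpoint.

    arrival_times: time at which each parallel process reaches the
    checkpoint (hypothetical, abstract time units).
    Assumption: every process is released when the last one arrives.
    """
    release_time = max(arrival_times)
    return [release_time - t for t in arrival_times]


# Two CPUs: the first reaches the checkpoint at t=3, its peer at t=5.
waits = sync_wait_times([3, 5])
print(waits)  # the earlier process waits, the later one does not
```

In this sketch, only the process that arrives early incurs a wait, which is the period that time sharing attempts to fill with other work.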
Besides the synchronization wait, data processing on a CPU involves other types of wait times such as those related to input/output (I/O) operations. If those CPU wait times can be used to execute other processes, it will contribute positively to the efficiency of the entire system. This idea is actually implemented in the following way: each CPU is configured to operate in time sharing mode, and when the current process on a CPU has entered a wait state, some other process is assigned to that CPU. This additional process may be a parallel process or a non-parallel process that is supposed to run on a single CPU.
Suppose that one parallel process has reached a checkpoint for data exchange with a peer process, and that its CPU is now running some other process during the synchronization wait period that follows. In this situation, the CPU may still be engaged in that other process when the peer parallel process reaches the checkpoint, in which case the peer CPU has to wait for synchronization in turn. Such synchronization wait times reduce the efficiency of the computer system.
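The situation above may be sketched as follows, purely as an illustration. The function `peer_wait` and its parameters are hypothetical, and the sketch assumes that the first CPU cannot return to its parallel process until the other process it picked up gives up the CPU.

```python
def peer_wait(arrive_first, arrive_peer, other_process_until):
    """Wait imposed on the later-arriving peer process.

    arrive_first: time the first process reached the checkpoint; its CPU
        then switched to some other process (assumption).
    arrive_peer: time the peer process reached the checkpoint.
    other_process_until: time at which the first CPU returns to its
        parallel process (hypothetical parameter).
    """
    # The first process can take part in the exchange only once its CPU
    # is back from the other process.
    first_ready = max(arrive_first, other_process_until)
    return max(0, first_ready - arrive_peer)


# The other process holds the first CPU until t=8; the peer arrived at t=5.
print(peer_wait(arrive_first=3, arrive_peer=5, other_process_until=8))
# The other process ends at t=4, before the peer arrives: no extra wait.
print(peer_wait(arrive_first=3, arrive_peer=5, other_process_until=4))
```

Without time sharing, the peer in the first call would not have waited at all, since the first process was already at the checkpoint.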
To address the above issue, a process scheduling method is disclosed in the Unexamined Japanese Patent Publication No. 10-74150 (1998), in which a time-sharing computer system is configured to cause all constituent CPUs to start and stop parallel processes and other processes simultaneously at predetermined unit intervals (each called a “phase”). That is, the CPUs execute a plurality of parallel processes produced from a certain parallel program, starting and stopping them all at the same time. While synchronization wait times may arise in the course of parallel processes, the disclosed method keeps their length equal to that in the case without time sharing. As a result, it minimizes the amount of waiting time for synchronization between parallel processes constituting a parallel program, thus preventing the system's efficiency from decreasing.
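A minimal sketch of such phase-based simultaneous scheduling is given below for illustration. It is not the method of the cited publication; the function `gang_schedule`, the round-robin assignment of one program per phase, and the label format are all assumptions made for this example.

```python
def gang_schedule(programs, num_cpus, num_phases):
    """Build a phase-by-CPU table in which all CPUs switch together.

    programs: names of the programs to schedule (hypothetical).
    Assumption: exactly one program is assigned to each fixed-length
    phase, in round-robin order, and every CPU runs one process of that
    program for the whole phase.
    """
    table = []
    for phase in range(num_phases):
        program = programs[phase % len(programs)]
        # Every CPU runs a process of the same program in this phase, so
        # the processes start and stop at the same time.
        table.append([f"{program}[{cpu}]" for cpu in range(num_cpus)])
    return table


for phase, row in enumerate(gang_schedule(["A", "B"], num_cpus=2, num_phases=3)):
    print(phase, row)
```

Because the processes of one program always run in the same phase, a process that reaches a checkpoint finds its peers running on the other CPUs rather than displaced by unrelated work.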
Meanwhile, some kinds of computer processes require that their turnaround time (the time from start to completion) be guaranteed. One example of such processes is weather data analysis. This process has to analyze a huge amount of weather data in a limited time frame, i.e., a predetermined time before the scheduled delivery of new weather forecast information.
The process scheduling method proposed in the above-mentioned publication, however, is unable to provide guaranteed turnaround time for each parallel program because of its fixed-length phase. For example, a parallel program may need 50 percent of the computation power of a multiprocessor computer to meet its required turnaround time. The above-described process scheduling method can allocate only one phase (e.g., 10 percent) of time resources to that parallel program and thus is unable to guarantee the turnaround time.
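The arithmetic behind this example can be sketched as follows. The function `projected_turnaround` and the 60-minute deadline are assumptions chosen for illustration; the sketch further assumes that execution time scales inversely with the share of time resources allocated.

```python
def projected_turnaround(deadline, required_percent, allocated_percent):
    """Estimate turnaround when less than the required share is allocated.

    deadline: required turnaround time (hypothetical, any time unit).
    required_percent: share of computation power needed to meet it.
    allocated_percent: share actually allocated (e.g., one phase).
    Assumption: run time is inversely proportional to the allocated share.
    """
    return deadline * required_percent / allocated_percent


# A program needing 50 percent of the machine but given a 10 percent phase.
deadline_minutes = 60  # assumed value, for illustration only
estimate = projected_turnaround(deadline_minutes, 50, 10)
print(estimate, estimate <= deadline_minutes)
```

Under these assumptions the program would take five times its required turnaround time, which is why a single fixed-length phase cannot provide the guarantee.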