Typical parallel processing systems include a plurality of processors, or nodes, which are coupled together via a communications switch. One example of a parallel processing system is the Scalable Power Parallel Systems 9076-SP1 offered by International Business Machines Corporation. The SP1 provides the flexibility of running large parallel jobs, as well as serial jobs that utilize the processors as if they were standard workstations. To facilitate this diversity and allow the system to build on progress made in the workstation area, each processor includes a full operating system environment. In one instance, this environment is UNIX based and is referred to as AIX. AIX uses daemon processes (i.e., background processes), which are scheduled periodically, to provide operating system services for the user.
On a single processor workstation, the cost of scheduling these daemons is the time spent swapping to the daemon process plus the time that the daemon runs. The effect of this workload is, for example, a small (5%) decrease in available user cycles. For a single processor application, the effect is linear. That is, an increase in daemon activity results in a directly proportional decrease in user cycles. The effect on parallel jobs, on the other hand, is highly non-linear. This non-linear effect is caused by the fact that the daemons are not synchronized between the processors. Thus, at any given time, daemon activity is likely to be taking place on one or more processors, thereby causing the user application on those processors to sleep. Since the processes of a parallel job typically must wait for one another, a delay on any one processor can stall the job as a whole. The effect is a degradation in system performance.
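The contrast between the linear and non-linear cases can be illustrated with a simple probability model. This is a sketch under stated assumptions, not a measurement of any actual system: each processor is assumed to run daemons during a fraction p of its time steps, independently of the other processors, and the parallel job is assumed to make progress in a step only when no processor is running a daemon. The function names and the model itself are hypothetical and are introduced solely for illustration.

```python
def serial_user_fraction(p: float) -> float:
    """Fraction of cycles left to the user on a single processor.

    Linear in p: each extra unit of daemon activity removes the same
    amount of user time.
    """
    return 1.0 - p


def parallel_user_fraction(p: float, n: int) -> float:
    """Fraction of steps in which all n processors are free of daemons.

    Assumption (illustrative only): daemon activity is independent and
    unsynchronized across processors, and the job advances only when
    every processor is available at the same time.
    """
    return (1.0 - p) ** n


if __name__ == "__main__":
    p = 0.05  # the 5% example figure used in the text
    for n in (1, 4, 16, 64):
        print(n, round(parallel_user_fraction(p, n), 3))
```

Under these assumptions, a 5% daemon load leaves 95% of the cycles to a serial job, but leaves a fully coupled parallel job about 44% of its steps on 16 processors and under 4% on 64 processors. If the daemons instead ran at the same time on every processor, the lost fraction in this model would remain p regardless of the number of processors.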
One mechanism for synchronizing nodes within a system is described in, for instance, U.S. Pat. No. 4,914,657, entitled "Operations Controller For A Fault Tolerant Multiple Node Processing System," issued on Apr. 3, 1990 and assigned to Allied-Signal, Inc. The synchronizer described in the above-referenced patent establishes and maintains synchronization between all of the operations controllers in the system. The multi-computer architecture uses loose synchronization, which is accomplished by synchronous rounds of message transmission by each node in the system. In this method, each synchronizer detects and time stamps each time dependent message received by its own node. These time dependent messages are transmitted by every node in the system at predetermined intervals and are received by all of the other nodes in the system. As a result of the wrap-around interconnection, a node will receive its own time dependent messages along with the time dependent messages sent by the other nodes. The time stamp on a node's own time dependent message is compared with the time stamps on all of the other time dependent messages in order to maintain synchronization among the nodes.
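The time stamp comparison described above can be sketched as follows. This is an illustrative reading only and is not taken from the referenced patent: the use of the median of the differences as the correction, and the names `clock_correction`, `own_node` and `stamps`, are assumptions made for the example.

```python
from statistics import median
from typing import Mapping


def clock_correction(own_node: str, stamps: Mapping[str, float]) -> float:
    """Return a hypothetical adjustment for a node's local timing.

    `stamps` maps each sender's identifier to the local time stamp at
    which that sender's time dependent message was received during one
    round. It includes the node's own message, received by way of the
    wrap-around interconnection.
    """
    own = stamps[own_node]
    # Differences between every other node's stamp and the node's own.
    diffs = [t - own for node, t in stamps.items() if node != own_node]
    if not diffs:
        return 0.0  # nothing to compare against
    # Assumption: the median is used so one outlying node has little effect.
    return median(diffs)


if __name__ == "__main__":
    round_stamps = {"A": 100, "B": 102, "C": 103, "D": 99}
    print(clock_correction("A", round_stamps))
```

In this sketch, a positive result means that the other nodes' messages generally arrived later than the node's own, so the node would delay its next transmission. Whatever correction rule is used, every node must send and receive a message in each round for the comparison to be possible.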
Thus, communication between the nodes is necessary. This communication results in processing overhead and degrades system performance. Therefore, a need still exists for a synchronization technique which does not degrade system performance and does not require explicit communication between the nodes.