The present invention relates to a process management method and system for a memory shared multi-processor computer in which a plurality of processor elements share a main memory and each processor element extracts a process from a run queue, which holds executable processes, and executes it.
A computer system in which a plurality of processor elements are interconnected and operate in parallel is referred to as a multi-processor system. Among such systems, a system in which the processor elements share a memory (information storage means) through a shared bus is referred to as a memory shared type or a TCMP (Tightly Coupled Multi-Processor) system (see Computer Architecture: Design and Performance, Barry Wilkinson, 1991, Prentice Hall, Chapters 6 to 7).
Hereinafter, the memory shared multi-processor system is simply referred to as a multi-processor system.
A process is a logical unit for the execution of a calculation based on a program. A program counter, a stack, external variables, file descriptors and signals are managed independently for each process.
A thread is a unit of execution of the process by the processor. Each thread belongs to exactly one process, and a process includes one or more threads.
Each thread has its own program counter, stack and thread management table, and shares the program, the external variables, the file descriptors and other information with the other threads in the process. Whether the signals are handled per process or per thread depends on the implementation.
Usually, in the multi-processor system, multi-processing or multi-thread control is conducted so that a plurality of processes or threads are executed in parallel on the processors.
In a multi-processor computer using multi-process control, each processor executes the processes assigned to it by switching among them, in parallel with the other processors.
For the management of the process to be executed by each processor, a data structure referred to as a run queue is used.
The run queue manages executable processes. A process which is sleeping while waiting for an event and a process which is being executed by a processor are not executable processes.
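The run queue described above may be pictured with the following minimal sketch. It is an illustration under stated assumptions, not an implementation taken from any particular operating system: the names State, Process and RunQueue are hypothetical, and first-in first-out ordering is assumed only for simplicity.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    RUNNABLE = auto()   # executable, waiting in a run queue
    RUNNING = auto()    # currently being executed by a processor
    SLEEPING = auto()   # waiting for an event


@dataclass
class Process:
    pid: int
    state: State = State.SLEEPING


class RunQueue:
    """Hypothetical FIFO queue that holds only executable processes."""

    def __init__(self):
        self._queue = deque()

    def __len__(self):
        return len(self._queue)

    def enqueue(self, proc):
        # A process is set in the queue when it becomes executable.
        proc.state = State.RUNNABLE
        self._queue.append(proc)

    def dispatch(self):
        # The processor extracts the next process. The process leaves
        # the queue, since a running process is not an executable one.
        if not self._queue:
            return None
        proc = self._queue.popleft()
        proc.state = State.RUNNING
        return proc


rq = RunQueue()
a, b = Process(1), Process(2)
rq.enqueue(a)
rq.enqueue(b)
running = rq.dispatch()
```

In this sketch a sleeping process never appears in the queue, and a dispatched process is removed from it, which matches the definition of an executable process given above.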
Usually, the switching of the processes and the setting/removing of the processes for each run queue are conducted by the operating system (OS) running on the system. An OS which conducts multi-processing control on a multi-processor system is described in Modern Operating Systems, Andrew S. Tanenbaum, Prentice Hall, 1992, Chapter 12, pages 507-548.
In such a system, in order to enhance the parallelism of the multi-processor and thereby improve the throughput of the system, techniques to keep the loads on the processors as uniform as possible have been known. One of them is the processor-to-processor migration of processes.
When the migration is not conducted, the processor which executes the process is fixed to the processor which created the process during the period from the creation to the expiration of the process. When the process migration is conducted, the processor which executes the process is changed between the creation and the expiration of the process, namely, the process is migrated between processors.
By using the process migration, the imbalance of process loads among the processors may be relieved, and the load balance may be adjusted by migrating a process from a high load processor to a low load processor.
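The adjustment described above may be sketched as follows. This is a minimal illustration, assuming that the load of a processor is approximated by the length of its run queue and that one process is moved per call; the function name migrate_one and the threshold of two processes are hypothetical.

```python
from collections import deque


def migrate_one(run_queues):
    """Move one process from the longest run queue to the shortest.

    Load is approximated by run queue length (an assumption).
    Returns (source index, destination index, pid), or None when
    no migration is performed.
    """
    indices = range(len(run_queues))
    src = max(indices, key=lambda i: len(run_queues[i]))
    dst = min(indices, key=lambda i: len(run_queues[i]))
    # A migration is only useful if it narrows the gap between queues.
    if len(run_queues[src]) - len(run_queues[dst]) < 2:
        return None
    pid = run_queues[src].pop()      # take from the tail of the busy queue
    run_queues[dst].append(pid)
    return src, dst, pid


# One run queue per processor, holding process identifiers.
queues = [deque([1, 2, 3, 4]), deque([5]), deque([6, 7])]
moved = migrate_one(queues)
```

Here the process with identifier 4 moves from the first processor to the second, leaving queue lengths of 3, 2 and 2, after which a further call performs no migration.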
The process migration may be determined statically by a compiler, or the system may determine the migration dynamically at the execution of the process. The present specification specifically discusses dynamic load distribution, which determines the migration at execution time without relying on the compiler.
Examples of the migration among processors interconnected through a network are described in Load Distribution on Microkernels; D. Milojicic, P. Giese; IEEE Workshop on Future Trend in Distributed Computing; Lisbon, Portugal, September 1993, pp. 463-469, and Experiences with Load Distribution on Top of the Mach Microkernel; D. S. Milojicic, P. Giese and W. Zint; 4th Symposium on Experiences with Distributed and Multi-Processor Systems; San Diego, September 1993, pp. 19-36. Those references describe a receiver-initiated process migration, in which a low load processor (receiver) requests a high load processor (sender) to migrate a process to it. In those references, the number of processes and the frequency of process-to-process communication are used as selection criteria for the sender and receiver processors.
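The receiver-initiated scheme outlined above may be pictured with the following sketch. It is not the method of the cited references: it uses only the number of executable processes as the selection criterion and omits the communication frequency, and the names Processor, receiver_initiated_pull, low_mark and high_mark are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Processor:
    cpu_id: int
    run_queue: deque = field(default_factory=deque)

    def load(self):
        # Hypothetical load metric: number of executable processes.
        return len(self.run_queue)


def receiver_initiated_pull(receiver, processors, low_mark=1, high_mark=3):
    """The receiver asks the most loaded peer for one process.

    Returns (sender cpu_id, pid), or None when no migration occurs.
    """
    if receiver.load() > low_mark:
        return None                  # the receiver is not lightly loaded
    peers = [p for p in processors if p is not receiver]
    if not peers:
        return None
    sender = max(peers, key=Processor.load)
    if sender.load() < high_mark:
        return None                  # no peer is heavily loaded
    pid = sender.run_queue.pop()
    receiver.run_queue.append(pid)
    return sender.cpu_id, pid


cpus = [
    Processor(0, deque([10, 11, 12, 13])),
    Processor(1, deque()),
    Processor(2, deque([20])),
]
result = receiver_initiated_pull(cpus[1], cpus)
```

The request originates at the idle processor, so a busy processor spends no time searching for a destination; the two thresholds are arbitrary values chosen for the illustration.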
The prior art on process migration primarily discusses migration among computers interconnected through a network. In a model such as a tightly coupled multi-processor, in which the data transfer among the processors is conducted through a shared memory, the conditions differ from those of data transfer conducted through a network.
For example, in computers interconnected through a network, even if information on the load balance of the computers is collected and it is determined to conduct the migration between particular nodes, a relatively long time is required before the migration is completed. The load balance of the system may change during that time, so that a migration which has little effect or is unnecessary may be conducted. Accordingly, it is necessary to conduct the migration only for an imbalance which lasts for a long time. Further, since the overhead per migration is large, its effect on the throughput of the overall system should be taken into consideration.
In the tightly coupled multi-processor, the overhead (cost) of migrating a process between nodes is smaller than that of processors interconnected through a network. Further, the time required to complete the migration is shorter. Thus, a strategy of conducting the migration more frequently in response to variations in load may be adopted.