1. Technical Field
The present invention relates to a computer having a shared-memory multiprocessor structure in which a plurality of processors each have a cache memory, and to a control method thereof. In particular, the invention relates to a computer and a control method that improve the instruction execution efficiency of spinwait used for a synchronizing process.
2. Description of the Related Arts
Conventionally, in a shared-memory multiprocessor system in which a main memory is shared by a plurality of processors each having a cache memory, spinwait is frequently used for a synchronizing process between the processors.
A typical example in which spinwait is used for the synchronizing process between the processors is a spin lock. As shown in FIG. 1, the instruction sequence of a spin lock is generally realized by the procedures of a verifying unit 100 and a setting unit 102. A process or a thread which reaches a lock acquiring point loads a variable X at step S1, and the verifying unit 100 determines at step S2 whether the variable X is, for example, 0, which is a value representing that the lock can be acquired. The verifying unit 100 executes spinwait, that is, it spins by repeating the steps S1 and S2 until X becomes 0. The setting unit 102 then sets the lock variable X to 1 at step S3 using an interlock instruction such as “test_and_set” or “compare_and_jump” so as to exit the spin lock process. When the setting of the lock variable X to 1 fails, the spinwait is continued in the verifying unit 100. In this manner, the spinwait is frequently used in the verifying unit 100 of the spin lock, which checks whether lock acquisition is possible.
Besides the spin lock, the synchronizing processes between the processors in the shared-memory multiprocessor system include barrier synchronization, which also uses spinwait. In barrier synchronization, the processes or threads to be synchronized rendezvous with each other at a synchronizing point, and memory-based barrier synchronization is generally realized by the procedure in FIG. 2. Before the processes or the threads reach the synchronizing point, the lock variable X is 0. When a process or a thread reaches the synchronizing point, the setting unit 104 sets, at step S1, the bit of the lock variable X associated with that process or thread to 1 using an interlock instruction such as “test_and_set” or “compare_and_jump”, and the sequence goes to the verifying unit 106.
The verifying unit 106 loads the lock variable X at step S2, and executes spinwait, spinning until the bits of all the processes or all the threads to be barrier-synchronized become 1, namely, until all the relevant bits of the variable X are determined to be 1 at step S3. When all the processes or all the threads reach the synchronization point, all the relevant bits of X become 1, and the sequence exits the spinwait process in the synchronized state so as to proceed to the next step. Even in barrier synchronization, the spinwait is frequently used in the verifying unit 106, which checks whether all the processes or all the threads have reached the synchronization point.
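The procedures of FIG. 1 and FIG. 2 can be illustrated by the following minimal sketch. It is not taken from the specification. The class AtomicWord and the functions spin_lock_acquire, spin_lock_release, barrier_wait and demo are hypothetical names introduced only for illustration, and because Python has no interlock instruction, the atomicity of “test_and_set” is emulated here with an internal threading.Lock, which is an assumption of the sketch.

```python
import threading


class AtomicWord:
    """Hypothetical stand-in for a shared memory word (the variable X).

    Real hardware would provide interlock instructions; here a
    threading.Lock emulates their atomicity (assumption of this sketch).
    """

    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        # Plain load of X (step S1 in FIG. 1, step S2 in FIG. 2).
        return self._value

    def store(self, value):
        with self._guard:
            self._value = value

    def test_and_set(self):
        # Atomically set X to 1 and return the previous value.
        with self._guard:
            old = self._value
            self._value = 1
            return old

    def fetch_or(self, mask):
        # Atomically OR a bit mask into X and return the previous value.
        with self._guard:
            old = self._value
            self._value |= mask
            return old


def spin_lock_acquire(x):
    while True:
        # Verifying unit: spinwait, repeating load and compare until X == 0.
        while x.load() != 0:
            pass
        # Setting unit: try to set X to 1 with the interlock operation.
        if x.test_and_set() == 0:
            return
        # Setting failed because another thread won; continue the spinwait.


def spin_lock_release(x):
    x.store(0)


def barrier_wait(x, index, n):
    # Setting unit: set the bit of X associated with this thread.
    x.fetch_or(1 << index)
    # Verifying unit: spinwait until the bits of all n threads are 1.
    all_arrived = (1 << n) - 1
    while x.load() != all_arrived:
        pass


def demo(n_threads=4, rounds=200):
    lock_word = AtomicWord(0)
    barrier_word = AtomicWord(0)
    counter = [0]
    seen_after_barrier = []

    def worker(index):
        for _ in range(rounds):
            spin_lock_acquire(lock_word)
            counter[0] += 1  # critical section protected by the spin lock
            spin_lock_release(lock_word)
        barrier_wait(barrier_word, index, n_threads)
        # Every thread has finished its increments once the barrier opens.
        seen_after_barrier.append(counter[0])

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0], seen_after_barrier
```

Calling demo() runs four threads that each increment a shared counter under the spin lock and then meet at the barrier. In both routines the inner while loop corresponds to the spinwait of the verifying unit, which does no useful work while it repeats.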
Further, spinwait is also used for synchronization with an I/O device (input/output device). In a normal I/O process, an interrupt is generally used for the synchronization between the processor and the I/O device. This is because the I/O device operates much more slowly than the processor, and the loss incurred when the processor continuously waits for a response from the I/O device is large. As I/O devices have become faster, however, the use of interrupts has a negative effect on some I/O devices. In general, since an interrupt causes a large overhead (delay), the inherent speed of such I/O devices cannot be used efficiently. For this reason, some high-speed I/O devices have started to adopt a synchronizing process using spinwait.
The execution of spinwait is, however, wasteful. In spinwait, the execution of one instruction sequence is repeated until a variable which is the wait end condition is changed to a desired value by another processor or another agent such as an I/O device. It is not uncommon for the number of spins, that is, the number of repetitions, to reach several hundred or several thousand, which means that the spinning processor is used wastefully. From the viewpoint of power consumption, electric power is also consumed wastefully during the spinwait.
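The repetition described above can be made visible by counting the spins. The following is a minimal sketch under stated assumptions: spin_until and demo are hypothetical names, the I/O device is simulated by a thread that sets a status word after a delay, and the resulting count depends entirely on the machine and the delay chosen.

```python
import threading
import time


def spin_until(read_status, done_value):
    """Repeat the load-and-compare sequence until the wait end condition
    holds, and return how many repetitions produced no useful work."""
    spins = 0
    while read_status() != done_value:
        spins += 1
    return spins


def demo(delay_s=0.01):
    status = [0]  # hypothetical device status word; 1 means "completed"

    def device():
        time.sleep(delay_s)  # simulated latency of the other agent
        status[0] = 1

    agent = threading.Thread(target=device)
    agent.start()
    spins = spin_until(lambda: status[0], 1)
    agent.join()
    return spins, status[0]
```

Each counted spin is one execution of the same short sequence whose only result is the observation that the condition has not yet changed.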
Further, in an SMT (Simultaneous Multi-Threading) processor, which can execute a plurality of threads simultaneously, a thread which is executing spinwait may hinder another thread which is being executed. In general, in the SMT processor, processor resources are shared between threads. Generally, the processor resources are not allocated uniformly to the executing threads, and the number and quantity of the processor resources to be allocated change according to the statuses of the threads. Various methods of determining the allocation of processor resources have been proposed; for example, the following approaches exist:
(1) reducing allocation of resources to threads in which cache misses occur;
(2) reducing allocation of resources to threads with many speculatively executed instructions; and
(3) reducing allocation of resources to threads with many instructions registered in a reservation station.
These approaches are based on the concept that the processor resources are preferentially allocated to a thread that is likely to execute instructions more smoothly. Spinwait executes instructions very smoothly. This is because only the same instruction sequence is executed repeatedly, so that cache misses do not occur and branch prediction does not fail. In SMT processors, therefore, the processor resources tend to be preferentially allocated to a thread which happens to be executing spinwait. As a result, the allocation of the processor resources to other threads, which are expected to execute more productive instruction sequences, is reduced in comparison with the thread executing the spinwait, and thus the performance of the processor may deteriorate.
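The effect can be illustrated with a toy model. This sketch does not reproduce the allocation policy of any actual SMT processor. The function allocate, the statistic names and the scoring formula are assumptions chosen only to mimic approaches (1) to (3) above, in which each kind of event reduces a thread's share of the resources.

```python
def allocate(threads, total_slots=8):
    """Toy policy: divide total_slots in proportion to a 'smoothness' score.

    The score falls as cache misses, speculatively executed instructions
    and reservation-station occupancy rise (hypothetical formula).
    """
    scores = {}
    for name, stats in threads.items():
        penalty = (
            stats["cache_misses"]
            + stats["speculative"]
            + stats["in_reservation_station"]
        )
        scores[name] = 1.0 / (1.0 + penalty)
    total = sum(scores.values())
    return {name: total_slots * score / total for name, score in scores.items()}


# A spinning thread repeats one short loop, so all of its penalties stay at 0,
# while a thread doing productive work incurs some of each (made-up numbers).
shares = allocate(
    {
        "spinwait_thread": {
            "cache_misses": 0,
            "speculative": 0,
            "in_reservation_station": 0,
        },
        "working_thread": {
            "cache_misses": 3,
            "speculative": 4,
            "in_reservation_station": 8,
        },
    }
)
```

With these made-up statistics the spinning thread receives the larger share in shares, although it is the thread doing no productive work.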
As mentioned above, the spinwait has the following problems:
(1) wasteful power consumption; and
(2) a strong possibility that the performance of the SMT processor deteriorates.
A countermeasure against these problems includes a method in which, after the start of spinwait is detected, the execution of the spinwait is stopped, and the satisfaction of the condition for canceling the spinwait is notified by an interrupt so that execution is restarted. Notifying the satisfaction of the condition by an interrupt, however, defeats the purpose. Spinwait is originally adopted in order to reduce delay, and the use of an interrupt increases the time and the cost.
Further, there is a method of providing hardware dedicated to synchronization. In this case, however, the cost of the hardware becomes high. At present, memory-based synchronization is common because, historically, executing the synchronizing process using a general-purpose device (memory) has offered a substantial cost reduction.
It is an object of the present invention to provide a computer and a control method which eliminate the waste of electric power and processor resources caused by the execution of spinwait so as to improve instruction execution efficiency.