Multiprogramming, in which a single central processing unit (CPU) runs plural programs, is a conventional technique. For example, an operating system (OS) divides the processing time of the CPU and allocates a process or a thread to each divided time slice, so that the CPU appears to run plural processes or threads at the same time. Processes and threads are the units in which a program is executed, and software is a set of processes or threads. In general, each process has an independent memory space, whereas threads share a memory space.
Recently, an increasing number of apparatuses employ a multi-core processor system, which is a computer having plural CPUs, in place of a single-core processor system, which is a computer having a single CPU. Allocating plural threads to plural CPUs so that the threads execute in parallel enables high-speed processing.
When plural threads are executed in parallel in this way, synchronization processing is frequently executed to synchronize the threads with one another. The synchronization processing can be exclusive control processing or barrier synchronization processing.
The exclusive control processing is processing in which, once one thread acquires the right to use a resource or the like, the other threads are placed in a wait state until that thread releases the right. For example, when plural threads access shared data, the exclusive control processing is added to the program. The barrier synchronization processing is processing in which each of plural threads is stopped at a specific code position and, once all of the threads have reached that position, the threads proceed to the next processing. For example, when plural threads must begin executing concurrently from a specific position, the barrier synchronization processing is added to the program.
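The exclusive control processing described above can be illustrated with a minimal sketch. It assumes Python's standard `threading.Lock` as a stand-in for the resource use right; the function name `run_counter` and its parameters are hypothetical and chosen only for illustration.

```python
import threading


def run_counter(num_threads=4, increments=10_000):
    """Increment a shared counter from several threads under exclusive control."""
    counter = 0
    lock = threading.Lock()  # stands in for the "right of use" of the shared data

    def worker():
        nonlocal counter
        for _ in range(increments):
            # Only one thread can hold the lock at a time; the other
            # threads wait here until the holder releases it.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

Because every update to the shared counter happens while the lock is held, the result is the product of the thread count and the per-thread increment count, regardless of how the threads are scheduled.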
The OS provides application software (apps) with synchronous commands for performing the synchronization processing, for example, in a library. For example, the synchronous command to perform the exclusive control processing can be a Mutex, and the synchronous command to perform the barrier synchronization processing can be a barrier synchronous command.
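As an illustration of a library-provided barrier synchronous command, the following sketch assumes Python's standard `threading.Barrier` in place of an OS-specific command. The function `run_phases` and the log format are hypothetical.

```python
import threading


def run_phases(num_threads=3):
    """Run threads that all stop at a barrier before starting the next phase."""
    barrier = threading.Barrier(num_threads)
    log = []
    log_lock = threading.Lock()  # exclusive control for the shared log

    def worker(thread_id):
        with log_lock:
            log.append(("before", thread_id))
        # Each thread stops here until all num_threads threads have arrived.
        barrier.wait()
        with log_lock:
            log.append(("after", thread_id))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

No thread can record an "after" entry until every thread has recorded its "before" entry, so the first `num_threads` entries of the returned log are always "before" entries.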
In response to the synchronous command, a CPU declaring the start of the synchronization processing sends a synchronous signal to a CPU that is to be subject to the synchronization processing. The CPU that has received the synchronous signal sends, to the CPU that sent the synchronous signal, a signal indicating completion of the synchronization processing. Hereinafter, the signal indicating completion of the synchronization processing is referred to as a ready signal.
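The exchange of the synchronous signal and the ready signal can be sketched in software as follows. This is only a simulation under stated assumptions: threads stand in for CPUs, `queue.Queue` objects stand in for the signal lines, and the function name `synchronize` and the message strings are hypothetical.

```python
import queue
import threading


def synchronize(num_receivers=3):
    """Send a synchronous signal to each receiver and collect the ready signals."""
    inboxes = [queue.Queue() for _ in range(num_receivers)]  # one per receiving "CPU"
    ready_box = queue.Queue()                                # replies to the initiator

    def receiver(cpu_id):
        inboxes[cpu_id].get()             # wait for the synchronous signal
        ready_box.put(("ready", cpu_id))  # reply with a ready signal

    threads = [threading.Thread(target=receiver, args=(i,)) for i in range(num_receivers)]
    for t in threads:
        t.start()

    # The initiating "CPU" declares the start of the synchronization processing.
    for box in inboxes:
        box.put("sync")

    # Synchronization is complete once a ready signal has arrived from every receiver.
    ready_ids = sorted(ready_box.get()[1] for _ in range(num_receivers))
    for t in threads:
        t.join()
    return ready_ids
```

The initiator blocks on `ready_box.get()` until each receiver has replied, which mirrors waiting for the ready signal from every CPU that received the synchronous signal.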
With regard to the synchronization processing, for example, a technique is disclosed that has a synchronization counting unit that counts the number of CPUs that have reached a synchronous point during thread-to-thread synchronization processing, to determine whether all of the CPUs have reached the synchronous point. The synchronous point refers to the position at which the synchronous command is inserted in the execution code. With regard to synchronization of registers between CPUs, for example, a technique for speculative execution is disclosed in which, each time a general-purpose register of a parent thread is written to after thread copying, the updated value of the general-purpose register is sent from the CPU of the parent thread to the CPU of a child thread (see, e.g., Japanese Laid-Open Patent Publication Nos. H7-200486 and 2003-29986).
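The idea of counting arrivals at a synchronous point can be sketched in software as follows. This is not the design of the cited publications, which is not reproduced here; the class `SyncCounter` and its methods are hypothetical and only illustrate how a count can be compared against the number of CPUs.

```python
import threading


class SyncCounter:
    """Counts how many CPUs (simulated by callers) have reached the synchronous point."""

    def __init__(self, num_cpus):
        self.num_cpus = num_cpus
        self._count = 0
        self._lock = threading.Lock()  # protects the counter from concurrent updates

    def arrive(self):
        """Record one arrival; return True if this arrival was the last one expected."""
        with self._lock:
            self._count += 1
            return self._count == self.num_cpus

    def all_arrived(self):
        """Return True once every CPU has reached the synchronous point."""
        with self._lock:
            return self._count == self.num_cpus
```

For example, with `SyncCounter(3)`, the first two calls to `arrive()` return `False` and the third returns `True`, after which `all_arrived()` also returns `True`.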
Although the above techniques enable rapid confirmation of completion of the synchronization processing, unnecessary waiting occurs when the designer has specified redundant synchronization processing, and performance decreases as a result.