The present invention relates to a spinlock method for interprocess exclusive control in a shared-memory multi-processor (multicore) system and, more particularly, to a multi-processor system that reduces power consumption during a spinlock operation by controlling a cache memory.
In recent years, cache memories have often been mounted on microcomputers in order to increase processing speed. When a processor accesses a main storage via a cache memory, processing is faster whenever the access hits in the cache.
In a multi-processor system including a plurality of such processors, the processors are connected to a common bus via cache memories and can access data in a common memory connected to the common bus.
In a multi-processor, particularly a symmetric multi-processor, it is important that the processors perform processing while maintaining coherency among the cache memories for data held in a common memory, which is a resource shared by the processors.
A protocol for maintaining data coherency among cache memories is called a cache coherency protocol. Such protocols are broadly classified into invalidation protocols and update protocols. In a relatively small multiprocessor having no more than tens of processors, an invalidation protocol is often employed because its configuration is relatively simple.
Representative cache coherency protocols of the invalidation type include the write-once protocol and the MESI protocol. Invalidation-type protocols presuppose the use of a snoop cache of an instruction/data separation type having a bus snoop function. When a write miss occurs in one cache, coherency between the caches is maintained by invalidating the corresponding cache line in any other cache in which a snoop hit occurs.
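The invalidation behavior described above can be illustrated with a small software model. The following is only a simplified sketch of MESI-style state transitions for a single cache line, not the circuit of any actual snoop cache; the names bus_model, cache_read, and cache_write and the value of NUM_CACHES are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_CACHES 3  /* hypothetical: one cache per processor */

typedef enum { INVALID, SHARED, EXCLUSIVE, MODIFIED } line_state;

/* Models one memory word and the copy of it held by each cache. */
typedef struct {
    line_state state[NUM_CACHES];
    int        data[NUM_CACHES];
    int        memory;              /* value in the common memory */
} bus_model;

void bus_init(bus_model *b, int initial_value) {
    for (size_t i = 0; i < NUM_CACHES; i++) {
        b->state[i] = INVALID;
        b->data[i] = 0;
    }
    b->memory = initial_value;
}

int cache_read(bus_model *b, size_t cpu) {
    assert(cpu < NUM_CACHES);
    if (b->state[cpu] != INVALID)
        return b->data[cpu];        /* read hit: no bus traffic */

    /* Read miss: the other caches snoop the bus request. */
    int others_hold_line = 0;
    for (size_t i = 0; i < NUM_CACHES; i++) {
        if (i == cpu || b->state[i] == INVALID)
            continue;
        if (b->state[i] == MODIFIED)
            b->memory = b->data[i]; /* write the dirty copy back */
        b->state[i] = SHARED;
        others_hold_line = 1;
    }
    b->data[cpu] = b->memory;
    b->state[cpu] = others_hold_line ? SHARED : EXCLUSIVE;
    return b->data[cpu];
}

void cache_write(bus_model *b, size_t cpu, int value) {
    assert(cpu < NUM_CACHES);
    if (b->state[cpu] == INVALID || b->state[cpu] == SHARED) {
        /* Write miss (or write to a shared line): every other cache
           that snoop-hits invalidates its copy of the line. */
        for (size_t i = 0; i < NUM_CACHES; i++) {
            if (i == cpu || b->state[i] == INVALID)
                continue;
            if (b->state[i] == MODIFIED)
                b->memory = b->data[i];
            b->state[i] = INVALID;
        }
    }
    b->data[cpu] = value;
    b->state[cpu] = MODIFIED;
}
```

In this model, a write by one processor to a line that another processor holds leaves the writer's copy in the MODIFIED state and the other copy INVALID, so the next read by the other processor misses and fetches the new value.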
In a multiprocessor system, a plurality of processors perform processes in parallel while synchronizing with one another. Exclusive control and synchronization control among processes and threads require a lock.
In a single-processor system, exclusive control among processes/threads for a critical section can be easily realized by disabling interrupts during the critical section. In a multiprocessor system, however, even when interrupts are disabled, another processor may still execute the critical section. Consequently, disabling interrupts alone is insufficient, and a lock process among processes/threads is necessary.
In the lock process, a processor locks a common resource and, after obtaining the lock, executes a critical section in which it accesses the common resource, and then unlocks the common resource. A spinlock is generally used for such a lock process. In a spinlock, a processor trying to obtain the lock performs a busy loop (spins) in the lock wait state, so that it can obtain the lock at high speed.
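The lock, critical section, unlock sequence can be sketched in software as follows. This is a minimal illustration using C11 atomics, assuming an atomic exchange as the test-and-set primitive; the type spinlock, the functions spin_trylock, spin_lock, and spin_unlock, and the shared counter are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_bool locked; } spinlock;
#define SPINLOCK_INIT { false }

/* One test-and-set attempt; returns true if the lock was obtained. */
static inline bool spin_trylock(spinlock *l) {
    return !atomic_exchange_explicit(&l->locked, true,
                                     memory_order_acquire);
}

/* Busy loop (spin) until the lock is obtained. */
static inline void spin_lock(spinlock *l) {
    while (!spin_trylock(l)) {
        /* lock wait state: keep retrying */
    }
}

static inline void spin_unlock(spinlock *l) {
    /* Unlocking a lock that is not held indicates a bug. */
    assert(atomic_load_explicit(&l->locked, memory_order_relaxed));
    atomic_store_explicit(&l->locked, false, memory_order_release);
}

/* Hypothetical common resource protected by the lock. */
static spinlock counter_lock = SPINLOCK_INIT;
static long shared_counter = 0;

long increment_shared_counter(void) {
    spin_lock(&counter_lock);       /* lock the common resource   */
    long value = ++shared_counter;  /* critical section           */
    spin_unlock(&counter_lock);     /* unlock the common resource */
    return value;
}
```

Each failed attempt in spin_lock is a write access to the lock variable, which is why the spin wait state consumes power and bus bandwidth even though no useful work is done.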
Related art includes the inventions disclosed in the following patent documents 1 to 3 and the techniques disclosed in non-patent documents 1 and 2.
Patent document 1 is directed to reducing the waste in power consumption and processor resources caused by a spin loop for exclusive control among a plurality of logical or physical processors. To monitor a shared variable [A] used for the exclusive control, a load-with-lookup instruction is provided that sets a trigger to start monitoring an attempted load of the target shared variable [A] and a store event. After failing to acquire the lock by CAS[A], a CPU issues the load-with-lookup instruction, monitors stores to the lock variable [A] (a freeing access from another CPU), and shifts to a suspend state in response to a suspend instruction. The CPU recovers when a possible store to the lock variable [A] from another CPU is detected as a trigger, and can then retry acquisition of the lock variable [A]. A useless spin loop (idling) can therefore be prevented.
Patent document 2 is directed to providing a multi-thread controlling apparatus and method capable of efficiently switching among a plurality of threads in a multi-thread processor capable of executing a plurality of threads. The multi-thread controlling apparatus has a plurality of thread processing means. It executes a synchronous lock control in which, when a specific block in a cache is updated by another processor or another thread processing means during execution of a certain thread processing means, the exclusive right for that thread processing means is regarded as released. In this way, a plurality of threads are switched efficiently.
Patent document 3 is directed to providing a semiconductor integrated circuit device capable of reducing the power consumption of a CPU in a loop state while maintaining high-performance processing, without influencing the performance of a CPU that is performing processing. In a multi-processor system employing a spinlock for exclusive control between CPUs, a spinlock detector is coupled to each of first and second CPUs. When a spinlock state is detected by a spinlock detector, inverted spinlock flags (bar SLF0 and bar SLF1) are output and supplied to two AND circuits. Memory access request signals RQ0 and RQ1 are also supplied to the two AND circuits. The outputs M0 and M1 of the AND operations between the request signals and the inverted spinlock flags are supplied to two cache memories.
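The gating in patent document 3 amounts to the Boolean relation M = RQ AND (NOT SLF) for each CPU. The sketch below expresses that relation in software purely as an illustration. It assumes that the spinlock flag is true while a spinlock state is detected, so that a spinning CPU's request presumably does not reach its cache memory; the names gate_inputs and gated_request are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_CPUS 2  /* first and second CPUs */

typedef struct {
    bool rq[NUM_CPUS];   /* memory access request signals RQ0, RQ1 */
    bool slf[NUM_CPUS];  /* spinlock flags SLF0, SLF1 (assumed true
                            while a spinlock state is detected)     */
} gate_inputs;

/* Output M of the AND circuit for one CPU: the request signal is
   passed on only while the inverted spinlock flag is true. */
bool gated_request(const gate_inputs *in, size_t cpu) {
    assert(cpu < NUM_CPUS);
    return in->rq[cpu] && !in->slf[cpu];
}
```

Under these assumptions, a request from a CPU that is not spinning passes through unchanged, while a request from a spinning CPU is masked.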
Non-patent documents 1 and 2 relate to reducing power consumption in the spin wait state of a spinlock. Inserting a pause instruction into a spin wait loop introduces a very small delay into the loop. This suppresses excessive operation of the processor's hardware resources during the loop wait state, so that power consumption during execution of the spin loop is reduced.
The pause instruction gives the processor a hint that a spin wait loop is being executed, so that simultaneous issue of a plurality of memory accesses and out-of-order execution are suppressed. Consequently, the correct order of read accesses to the lock variable is assured and, because fewer hardware resources of the processor operate simultaneously, power consumption is suppressed.
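A spin wait loop containing a pause instruction might look like the following sketch. It assumes an x86-64 target, where the _mm_pause intrinsic emits the pause instruction, and falls back to a no-op elsewhere. The macro cpu_relax and the functions spin_lock_with_pause and spin_unlock_pause are hypothetical names, and the returned pause count is included only so that the behavior can be observed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#if defined(__x86_64__) || defined(_M_X64)
#include <immintrin.h>
#define cpu_relax() _mm_pause()   /* emits the pause instruction */
#else
#define cpu_relax() ((void)0)     /* assumption: no hint available */
#endif

/* Acquires the lock and returns how many times the loop paused. */
unsigned long spin_lock_with_pause(atomic_bool *lock) {
    unsigned long pauses = 0;
    for (;;) {
        /* Attempt to take the lock with an atomic exchange. */
        if (!atomic_exchange_explicit(lock, true, memory_order_acquire))
            return pauses;
        /* Spin wait loop: only read the lock variable, and pause
           between reads, until the lock appears to be free. */
        while (atomic_load_explicit(lock, memory_order_relaxed)) {
            cpu_relax();
            pauses++;
        }
    }
}

void spin_unlock_pause(atomic_bool *lock) {
    /* Unlocking a lock that is not held indicates a bug. */
    assert(atomic_load_explicit(lock, memory_order_relaxed));
    atomic_store_explicit(lock, false, memory_order_release);
}
```

When the lock is free, the function returns immediately without pausing. When it is held by another processor, the waiting processor repeats only the read and the pause until the holder releases the lock.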