1. Field of the Invention
The present invention relates to an information processing device having a plurality of processors and a plurality of cache memories, and a control method thereof.
2. Related Background Art
In a conventional parallel computer system, each processor is generally associated with a cache memory in order to respond at high speed to access requests from the processor to a main memory and to reduce traffic on the coupling network. Memory access from each processor is performed via the cache memory, which holds copies of the data blocks being accessed. In such a parallel computer system, copies of the same data block may exist in a plurality of the cache memories. Various methods have been proposed for ensuring consistency among those copies.
In a parallel computer system in which the coupling network connecting the processors to one another and to the main memory is of a type, such as a bus, on which all transactions can be monitored, the snoop method has generally been used. In the snoop method, each cache memory monitors every transaction issued on the coupling network and, when a copy of the data block targeted by a transaction exists in that cache memory, performs the consistency holding operation that is required.
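The snoop method described above can be illustrated with a minimal sketch. The class names `Bus` and `SnoopingCache`, the write-through behavior, and the choice of invalidation as the consistency holding operation are assumptions made for illustration only; the text does not specify which operation a cache memory performs.

```python
class Bus:
    """Hypothetical shared bus: every attached cache sees every write."""

    def __init__(self):
        self.caches = []
        self.memory = {}  # stands in for the main memory

    def attach(self, cache):
        self.caches.append(cache)

    def broadcast_write(self, sender, block, value):
        # Assumption: writes go straight through to main memory.
        self.memory[block] = value
        # Every other cache monitors (snoops) the transaction.
        for cache in self.caches:
            if cache is not sender:
                cache.snoop_write(block)


class SnoopingCache:
    """Hypothetical cache that invalidates its copy when it snoops a write."""

    def __init__(self, bus):
        self.bus = bus
        self.lines = {}          # block -> cached copy
        self.invalidations = 0   # consistency holding operations performed
        bus.attach(self)

    def read(self, block):
        if block not in self.lines:  # miss: fetch a copy from main memory
            self.lines[block] = self.bus.memory.get(block, 0)
        return self.lines[block]

    def write(self, block, value):
        self.lines[block] = value
        self.bus.broadcast_write(self, block, value)

    def snoop_write(self, block):
        # Act only when a copy of the targeted block is held locally.
        if block in self.lines:
            del self.lines[block]
            self.invalidations += 1


bus = Bus()
cache0, cache1 = SnoopingCache(bus), SnoopingCache(bus)
cache0.read("A")
cache1.read("A")        # both caches now hold a copy of block "A"
cache0.write("A", 7)    # cache1 snoops the write and drops its stale copy
snoop_result = cache1.read("A")
```

In this sketch the second cache re-fetches block "A" after its copy is invalidated, so it observes the value written by the first cache.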
On the other hand, in a parallel computer system in which the coupling network connecting the processors to one another and to the main memory does not allow all transactions to be monitored, the directory method has been used. In the directory method, caching data indicating which of the cache memories hold a copy of each data block is stored and managed in a storage device called a directory, per data block or per similar unit. When a processor issues a transaction, the cache memories holding a copy of the data block targeted by the transaction are identified from the caching data obtained from the directory and are notified of the transaction, so that consistency among the copies is maintained.
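The directory method can be sketched in the same spirit. The names `Directory` and `DirectoryCache`, the per-block set of sharers, and the use of invalidation as the notification are assumptions for illustration; the text states only that caching data is kept per data block and that the holders of a copy are notified.

```python
class Directory:
    """Hypothetical directory: records which caches hold each data block."""

    def __init__(self):
        self.sharers = {}       # block -> set of caches holding a copy
        self.memory = {}        # stands in for the main memory
        self.notifications = 0  # messages sent to caches holding a copy

    def read(self, cache, block):
        self.sharers.setdefault(block, set()).add(cache)
        return self.memory.get(block, 0)

    def write(self, writer, block, value):
        self.memory[block] = value
        # Notify only the caches that the caching data lists for this block.
        for cache in self.sharers.get(block, set()) - {writer}:
            cache.invalidate(block)
            self.notifications += 1
        self.sharers[block] = {writer}


class DirectoryCache:
    """Hypothetical cache that relies on the directory for notifications."""

    def __init__(self, directory):
        self.directory = directory
        self.lines = {}

    def read(self, block):
        if block not in self.lines:
            self.lines[block] = self.directory.read(self, block)
        return self.lines[block]

    def write(self, block, value):
        self.lines[block] = value
        self.directory.write(self, block, value)

    def invalidate(self, block):
        self.lines.pop(block, None)


directory = Directory()
d0, d1, d2 = (DirectoryCache(directory) for _ in range(3))
d0.read("A")
d1.read("A")        # d0 and d1 share block "A"
d2.read("B")        # d2 holds only block "B"
d0.write("A", 5)    # only d1 is notified; d2 receives no message
directory_result = d1.read("A")
```

Unlike the snoop sketch, a cache that holds no copy of the written block receives no message at all, which is why this approach does not require a coupling network on which all transactions can be monitored.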
As described above, in the conventional parallel computer system, the operation for maintaining consistency among the copies existing in a plurality of the cache memories has been performed for each transaction.
However, this is not suitable for the loose memory consistency models that have been proposed for reducing the latency of memory access. In general, in a loose memory consistency model, a synchronous point is predetermined in the course of a process, and the memory transactions issued up to that point are required to be reflected on the system only when the process reaches the synchronous point. This means that the results of memory transactions need not be reflected before the synchronous point is reached. Consequently, when the conventional cache consistency holding technique is used in a parallel computer system employing the loose memory consistency model, a consistency holding operation that is not yet necessary is performed for each transaction, and the resulting overhead defeats the purpose of the loose memory consistency model and needlessly increases the memory access latency.
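The overhead discussed above can be made concrete with a small counting sketch. It is an illustration only and does not represent any particular consistency holding technique: the function names, the sample write sequence, and the assumption that only the final value of each written data block must be made visible at the synchronous point are all hypothetical.

```python
def per_transaction_operations(writes):
    """Conventional scheme: one consistency holding operation per write."""
    return len(writes)


def synchronous_point_operations(writes):
    """Hypothetical deferred scheme: writes are accumulated, and only the
    final value of each written block is reflected at the synchronous point."""
    pending = {}
    for block, value in writes:
        pending[block] = value  # a later write to a block replaces an earlier one
    return len(pending), pending


# Hypothetical sequence of writes issued before one synchronous point.
writes = [("A", 1), ("A", 2), ("B", 3), ("A", 4)]

eager_count = per_transaction_operations(writes)
deferred_count, final_values = synchronous_point_operations(writes)
```

For this sequence the per-transaction scheme performs four operations, whereas only two data blocks actually need to be reflected when the synchronous point is reached.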