1. Field of the Invention
The present invention relates to a loosely coupled multiprocessor system, and more particularly to the maintenance of coherency between the data stored in main and cache memories in such a multiprocessor system.
2. Description of the Related Art
One technology for maintaining coherency between the data stored in main and cache memories in a conventional loosely coupled multiprocessor system is disclosed in "The Directory-Based Cache Coherence Protocol for the DASH Multiprocessor" by Daniel Lenoski, James Laudon, Kourosh Gharachorloo, Anoop Gupta and John Hennessy, In Proceedings of 17th International Symposium on Computer Architecture, pages 148-159, 1990.
FIG. 1 of the accompanying drawings shows in block form an arrangement of such a conventional loosely coupled multiprocessor system.
As shown in FIG. 1, the conventional loosely coupled multiprocessor system comprises a plurality of nodes Pe0 to Pen−1 and two interconnection networks 101, 102 that interconnect the nodes.
Each of the nodes, denoted by Pei in FIG. 1, comprises a processor 50 for performing processing and memory access, a main memory 51, a cache memory 52 that can be accessed at a higher speed than the main memory 51, and a coherency maintenance controller 53 for maintaining coherency between the data stored in the main memory 51 and the cache memory 52 (and those of the other nodes). The processor 50 temporarily stores data in the main memory 51.
The coherency maintenance controller 53 holds the state of data stored in the main memory 51 and information on the nodes which hold a copy of the data in the cache memory 52 (hereinafter referred to as "holding node information"). There are two states of data, i.e., states C and M. The state C is a state in which a copy of data is present in the cache memories 52 of a plurality of nodes. In this case, the value of the copy of data present in the cache memory 52 and the value of data stored in the main memory 51 are the same. The state M is a state in which only the cache memory 52 of one node holds a copy of data. In this case, the value of the copy of data present in the cache memory 52 and the value of data stored in the main memory 51 are different from each other, and the value of the copy of data present in the cache memory 52 is the latest value.
The coherency maintenance controller 53 also holds the state of data stored in the cache memory 52 and a tag address of the data. There are three states of data, i.e., states I, S, and D. The state I is a state in which there is no effective copy of data with maintained coherency. The state S is a state in which there is an effective copy of data and there is a possibility that there is also an effective copy of data in the cache memory 52 of another node. The state D is a state in which there is an effective copy of data, there is no effective copy of data in the cache memory 52 of another node, and the value of the data is different from the value of the data stored in the main memory 51. The tag address indicates at which address the data stored in the cache memory 52 is located.
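The two sets of states described above can be written down as simple data types. The following is only an illustrative sketch in Python; the names MemState, CacheState, DirectoryEntry, CacheLine and has_effective_copy are hypothetical and do not appear in the cited reference.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Set


class MemState(Enum):
    """State held for data in the main memory 51 (hypothetical naming)."""
    C = "C"  # main memory holds the latest value; several caches may hold copies
    M = "M"  # one cache holds the latest value; main memory is stale


class CacheState(Enum):
    """State held for data in the cache memory 52 (hypothetical naming)."""
    I = "I"  # no effective copy
    S = "S"  # effective copy; another node may also hold one
    D = "D"  # the only effective copy; differs from main memory


@dataclass
class DirectoryEntry:
    state: MemState = MemState.C
    # "Holding node information": indices of nodes with a cached copy.
    holders: Set[int] = field(default_factory=set)


@dataclass
class CacheLine:
    tag: Optional[int] = None          # tag address of the cached data
    state: CacheState = CacheState.I

    def has_effective_copy(self, address_tag: int) -> bool:
        # An effective copy exists when the tag matches and the state is S or D.
        return self.tag == address_tag and self.state in (CacheState.S, CacheState.D)
```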
The interconnection network 101 distributes request messages exchanged between the nodes, and the interconnection network 102 distributes reply messages exchanged between the nodes. The interconnection network for distributing request messages and the interconnection network for distributing reply messages, which are separate from each other, are effective to avoid deadlock in maintaining coherency between the data stored in the main memory 51 and cache memory 52.
A process for maintaining coherency between the data stored in the main memory 51 and cache memory 52 in the multiprocessor system when the processor 50 performs a load or store access to data at a given address will be described below.
First, it is assumed that the processor 50 at the node Pe1 performs a load access.
The coherency maintenance controller 53 checks if an effective copy of the data at the corresponding address is present in the cache memory 52 or not. If an effective copy of the data is present in the cache memory 52, i.e., if the data is in the state S or D, then the coherency maintenance controller 53 replies to the processor 50 by transferring the data read from the cache memory 52 to the processor 50, after which the process comes to an end.
If an effective copy of the data is not present in the cache memory 52, i.e., if the data is in the state I, then the coherency maintenance controller 53 at the node Pe1 transmits a request message to read the data to a node which holds the data at the corresponding address, e.g., the node Peh, through the interconnection network 101.
In response to the reading request message, the coherency maintenance controller 53 at the node Peh checks if the latest value of the data at the corresponding address is present in the main memory 51 at the node Peh. If the latest value of the data at the corresponding address is present in the main memory 51, i.e., if the data is in the state C, then the coherency maintenance controller 53 at the node Peh transmits the data stored in the main memory 51 to the node Pe1 through the interconnection network 102, and adds the node Pe1 to the holding node information.
Upon reception of the data from the node Peh, the coherency maintenance controller 53 at the node Pe1 transfers the received data to the processor 50, and copies the data to the cache memory 52. The coherency maintenance controller 53 at the node Pe1 sets the state of the data to the state S.
At the node Peh which has received the reading request message, if the latest value of the data at the corresponding address is not present in the main memory 51, i.e., if the data is in the state M, then the coherency maintenance controller 53 at the node Peh refers to the holding node information, and transmits the reading request message to a node which holds the latest data, e.g., the node Per, through the interconnection network 101.
At the node Per which has received the reading request message, the coherency maintenance controller 53 checks if the data in the state D is present in the cache memory 52 or not. If the data in the state D is present in the cache memory 52, then the coherency maintenance controller 53 at the node Per transmits the data stored in the cache memory 52 to the node Pe1 through the interconnection network 102, and also transmits a writing request message with the data stored in the cache memory 52 being added thereto to the node Peh through the interconnection network 101. The coherency maintenance controller 53 at the node Per updates the state of the data present in the cache memory 52 to the state S.
In response to the writing request message, the coherency maintenance controller 53 at the node Peh updates the data in the main memory 51 to the data added to the writing request message. The coherency maintenance controller 53 also updates the state of the data to the state C and adds the node Pe1 to the holding node information.
At the node Per which has received the reading request message, if the data in the state D is not present in the cache memory 52, then the coherency maintenance controller 53 at the node Per transmits a Nak (negative acknowledge) message to the node Pe1 through the interconnection network 102.
In response to the Nak message, the coherency maintenance controller 53 at the node Pe1 transmits the reading request message again to the node Peh. Subsequently, the same process is repeated until data is transmitted to the node Pe1 and transferred to the processor 50 at the node Pe1.
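The load access described above can be traced with a small sequential model. This is a minimal sketch under stated assumptions: the Node class, the load function and its max_retries bound are hypothetical, the messages on the interconnection networks 101 and 102 are modeled as direct function steps, and the retry bound exists only so that the sketch terminates (the conventional system retries without limit).

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Node:
    memory: Dict[int, int] = field(default_factory=dict)        # main memory 51
    mem_state: Dict[int, str] = field(default_factory=dict)     # "C" or "M"
    holders: Dict[int, Set[int]] = field(default_factory=dict)  # holding node information
    cache: Dict[int, int] = field(default_factory=dict)         # cache memory 52
    cache_state: Dict[int, str] = field(default_factory=dict)   # "I", "S" or "D"


def load(nodes: Dict[int, Node], req: int, home: int, addr: int,
         max_retries: int = 3) -> Optional[int]:
    """Load by node `req` of data whose main-memory copy lives at node `home`."""
    local = nodes[req]
    if local.cache_state.get(addr, "I") in ("S", "D"):
        return local.cache[addr]                 # effective copy in the local cache
    for _ in range(max_retries):                 # assumption: bounded retries
        h = nodes[home]
        if h.mem_state.get(addr, "C") == "C":
            value = h.memory[addr]               # home replies from main memory
            h.holders.setdefault(addr, set()).add(req)
        else:
            owner = nodes[next(iter(h.holders[addr]))]  # request forwarded to the holder
            if owner.cache_state.get(addr, "I") != "D":
                continue                         # Nak: requester sends the request again
            value = owner.cache[addr]            # holder replies with the latest data
            owner.cache_state[addr] = "S"
            h.memory[addr] = value               # writing request updates main memory
            h.mem_state[addr] = "C"
            h.holders[addr].add(req)
        local.cache[addr] = value
        local.cache_state[addr] = "S"
        return value
    return None                                  # every attempt was answered with Nak
```

In this model a Nak that repeats on every attempt makes the function return None, which corresponds to the case where no reply reaches the processor.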
Now, it is assumed that the processor 50 at the node Pe1 performs a store access.
The coherency maintenance controller 53 checks whether the only copy of the data at the corresponding address in the system is present in the cache memory 52. If so, i.e., if the data is in the state D, then the coherency maintenance controller 53 updates the data in the cache memory 52 and notifies the processor 50 of an access completion, after which the process comes to an end.
If the only copy of the data is not present in the cache memory 52, i.e., if the data is in the state I or S, then the coherency maintenance controller 53 at the node Pe1 transmits an exclusive reading request message to the node which holds the data at the corresponding address, e.g., the node Peh, through the interconnection network 101.
In response to the exclusive reading request message, the coherency maintenance controller 53 at the node Peh checks if the latest value of the data at the corresponding address is present in the main memory 51 at the node Peh. If the latest value of the data at the corresponding address is present in the main memory 51, i.e., if the data is in the state C, then the coherency maintenance controller 53 at the node Peh transmits the data stored in the main memory 51 to the node Pe1 through the interconnection network 102.
If a node other than the node Pe1 holds a copy of the data in the cache memory 52 thereof, then the coherency maintenance controller 53 at the node Peh transmits an invalidating request message through the interconnection network 101 to all nodes (referred to as nodes Pek) other than the node Pe1 where a copy of the data is present. The coherency maintenance controller 53 at the node Peh also updates the state of the data in the main memory 51 to the state M, and sets the holding node information to the node Pe1 only. To the data transmitted to the node Pe1 is added the number of nodes Pek to which the invalidating request message is transmitted.
At the nodes Pek which have received the invalidating request message, the coherency maintenance controller 53 updates the state of the data in the cache memory 52 to the state I, and transmits an Ack (positive acknowledge) message to the node Pe1 through the interconnection network 102.
At the node Pe1 which has received the data from the node Peh, the coherency maintenance controller 53 waits for as many Ack messages as the number of nodes Pek which has been added to the data. When the coherency maintenance controller 53 at the node Pe1 has received as many Ack messages as the number of nodes Pek, the coherency maintenance controller 53 updates the data in the cache memory 52 to the data of the store access performed by the processor 50. The coherency maintenance controller 53 at the node Pe1 updates the state of the data to the state D and notifies the processor 50 of an access completion, after which the process comes to an end.
At the node Peh which has received the exclusive reading request message, if the latest value of the data at the corresponding address is not present in the main memory 51, i.e., if the data is in the state M, then the coherency maintenance controller 53 at the node Peh refers to the holding node information, and transmits the exclusive reading request message to a node which holds the latest data, e.g., the node Per, through the interconnection network 101.
At the node Per which has received the exclusive reading request message, the coherency maintenance controller 53 checks if data of the state D is present in the cache memory 52. If no data of the state D is present in the cache memory 52, then the coherency maintenance controller 53 at the node Per transmits a Nak message to the node Pe1 through the interconnection network 102.
In response to the Nak message, the coherency maintenance controller 53 at the node Pe1 transmits the exclusive reading request message again to the node Peh. Subsequently, the same process is repeated.
If data of the state D is present in the cache memory 52, then the coherency maintenance controller 53 at the node Per transmits the data stored in the cache memory 52 to the node Pe1 through the interconnection network 102. The coherency maintenance controller 53 at the node Per also transmits a holding node updating request message to the node Peh through the interconnection network 101 and updates the state of the data in the cache memory 52 to the state I.
At the node Peh which has received the holding node updating request message, the coherency maintenance controller 53 updates the holding node information as representing that only the node Pe1 holds the data of the main memory 51, and transmits an Ack message to the node Pe1 through the interconnection network 102.
At the node Pe1 which has received the data from the node Per, the coherency maintenance controller 53 waits for the reception of the Ack message from the node Peh. Upon the reception of the Ack message from the node Peh, the coherency maintenance controller 53 updates the data in the cache memory 52 to the data of the store access performed by the processor 50. The coherency maintenance controller 53 at the node Pe1 updates the state of the data to the state D and notifies the processor 50 of an access completion, after which the process comes to an end.
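The store access described above, covering both the state C case (invalidation with Ack counting) and the state M case (transfer from the holding node), can be sketched as follows. The Node class and the store function are hypothetical names; messages are modeled as direct sequential steps, and a Nak is reported by returning False instead of being retried.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Node:
    memory: Dict[int, int] = field(default_factory=dict)        # main memory 51
    mem_state: Dict[int, str] = field(default_factory=dict)     # "C" or "M"
    holders: Dict[int, Set[int]] = field(default_factory=dict)  # holding node information
    cache: Dict[int, int] = field(default_factory=dict)         # cache memory 52
    cache_state: Dict[int, str] = field(default_factory=dict)   # "I", "S" or "D"


def store(nodes: Dict[int, Node], req: int, home: int, addr: int, value: int) -> bool:
    """Store by node `req`; returns False when the request is answered with Nak."""
    local, h = nodes[req], nodes[home]
    if local.cache_state.get(addr, "I") == "D":
        local.cache[addr] = value                # already the only copy
        return True
    if h.mem_state.get(addr, "C") == "C":
        # Home sends invalidating requests to every other holder and tells
        # the requester how many Ack messages to expect.
        targets = h.holders.get(addr, set()) - {req}
        expected_acks = len(targets)
        h.mem_state[addr] = "M"
        h.holders[addr] = {req}
        acks = 0
        for k in targets:
            nodes[k].cache_state[addr] = "I"     # invalidate the copy
            acks += 1                            # Ack sent to the requester
    else:
        owner = nodes[next(iter(h.holders[addr]))]
        if owner.cache_state.get(addr, "I") != "D":
            return False                         # Nak from the forwarded-to node
        owner.cache_state[addr] = "I"            # holder gives up its copy
        h.holders[addr] = {req}                  # holding node updating request
        expected_acks, acks = 1, 1               # single Ack from the home node
    assert acks == expected_acks                 # requester has collected all Acks
    local.cache[addr] = value                    # apply the store data
    local.cache_state[addr] = "D"
    return True
```

Main memory is deliberately left untouched in the state M case, since the description above transfers the data from cache to cache and only updates the holding node information at the node Peh.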
The conventional multiprocessor system has a problem in that the processing for maintaining coherency may occasionally enter an infinite loop. For example, when the processor 50 of the node Pe1 performs a data access, the node Per may return a Nak message to the node Pe1 on every retry, so that the request message is retransmitted indefinitely. Therefore, the conventional multiprocessor system may encounter a situation where a reply cannot be sent to the processor 50 within a limited period of time.
In the conventional multiprocessor system, furthermore, deadlock has been avoided by separately employing the interconnection network 101, which exchanges request messages, and the interconnection network 102, which exchanges reply messages. For this reason, the conventional multiprocessor system suffers from a high hardware cost, a high failure rate, and a low level of system reliability.
It is an object of the present invention to provide a multiprocessor system which can ensure the completion of data access by a processor while maintaining coherency between the data stored in a main memory and a cache memory.
Another object of the present invention is to provide a multiprocessor system which does not need an additional hardware arrangement for the avoidance of deadlock, and is low in cost and high in reliability.
A multiprocessor system according to the present invention has a plurality of nodes and an interconnection network interconnecting the nodes.
Each of the nodes has a main memory for storing data, a cache memory for storing part of the data stored in the main memory in any one of the nodes, the cache memory being accessible faster than the main memory, cache state storage means for storing a state of the data stored in the cache memory, and main memory state storage means for storing a state of coherency of the data stored in the main memory and the data stored in the cache memory.
Each of the nodes also has local access control means and home access control means.
If an access request from a processor is of predetermined contents and the state of the data stored in the cache state storage means is a predetermined state, the local access control means sends the access request from the processor to a node having the main memory which stores data corresponding to the access request.
If an access request from another one of the nodes is of predetermined contents and the state of the data stored in the main memory state storage means of the node is a predetermined state, the home access controlling means makes a coherency request to cause a node represented by the information stored in the main memory state storage means to effect a process to maintain coherency of the data.
The local access control means also effects a process to maintain coherency of the data in the cache memory according to a coherency request from another one of the nodes, and sends a first reply with respect to a completion of the process to maintain coherency to a node having the main memory whose stored data is subjected to the process to maintain coherency.
If the first reply sent from the local access controlling means in any one of the nodes is of predetermined contents and the state of the data stored in the main memory state storage means of the node is a predetermined state, then the home access control means effects a process to maintain coherency of the data in the main memory, and sends a second reply with respect to the completion of the process to maintain coherency to the node having the processor which has made the access request.
If a second reply sent from another one of the nodes is of predetermined contents and the state of the data stored in the cache state storage means is a predetermined state, then the local access control means effects a process to maintain coherency of the data in the cache memory, and sends a third reply with respect to a completion of the process to maintain coherency to the processor.
Each of the nodes further comprises first arbitrating means for arbitrating between the access request issued by the processor and the coherency request and the second reply sent by the home access controlling means according to contents thereof, and enabling the home access controlling means to execute the access request, the coherency request, and the second reply.
Each of the nodes further comprises second arbitrating means for arbitrating between the access request and the first reply sent by the local access controlling means according to contents thereof, and enabling the local access controlling means to execute the access request and the first reply.
The state of coherency of the data stored in the main memory state storage means includes a state representing that the process to maintain coherency is being effected. Each of the nodes further comprises first main memory state updating means, access request saving means, second main memory state updating means, and access request returning means.
The first main memory state updating means updates the state of the data stored in the main memory state storage means to the state representing that the process to maintain coherency is being effected if an access request from the local access controlling means in any one of the nodes is of predetermined contents and the state of the data stored in the main memory state storage means of the node is a predetermined state.
The access request saving means saves an access request if the access request is of predetermined contents and the state of the data stored in the main memory state storage means is the state representing that the process to maintain coherency is being effected.
The second main memory state updating means updates the state of the data stored in the main memory state storage means to a state which is not the state representing that the process to maintain coherency is being effected if a first reply sent from the local access controlling means in any one of the nodes is of predetermined contents and the state of the data stored in the main memory state storage means of the node is a predetermined state.
The access request returning means returns an access request saved by the access request saving means and enables the home access controlling means to process the returned access request if the second main memory state updating means has updated the state of the data stored in the main memory state storage means to the state which is not the state representing that the process to maintain coherency is being effected.
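The interplay of the state updating means, the access request saving means, and the access request returning means can be illustrated with a small model. This is only a sketch under assumptions: the HomeDirectory class and its method names are hypothetical, every access request is assumed to start a process to maintain coherency, and the "predetermined contents" and "predetermined state" conditions are not modeled.

```python
from collections import deque
from typing import Deque, Dict, List, Set


class HomeDirectory:
    """Hypothetical model of the home-side handling of access requests."""

    def __init__(self) -> None:
        self.busy: Set[int] = set()                # addresses with a coherency process in progress
        self.saved: Dict[int, Deque[str]] = {}     # saved access requests per address
        self.processed: List[str] = []             # order in which requests were handled

    def on_access_request(self, addr: int, request: str) -> bool:
        if addr in self.busy:
            # Access request saving: keep the request instead of rejecting it.
            self.saved.setdefault(addr, deque()).append(request)
            return False
        # First state update: mark the data as "coherency process in progress".
        self.busy.add(addr)
        self.processed.append(request)
        return True

    def on_first_reply(self, addr: int) -> None:
        # Second state update: the coherency process for this address is done.
        self.busy.discard(addr)
        queue = self.saved.get(addr)
        if queue:
            # Access request returning: hand one saved request back for processing.
            self.on_access_request(addr, queue.popleft())
```

Because a conflicting request is saved and later returned rather than answered with a negative acknowledgment, each request in this model is processed after a finite number of first replies.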
Each of the nodes further comprises access request holding means, conflict request storage means, and access request re-processing means.
The access request holding means holds an access request sent from the processor if the local access controlling means sends an access request to a node having the main memory which stores data corresponding to the access request according to the access request sent from the processor.
The conflict request storage means stores a request conflict if a coherency request is of predetermined contents, and an access request held by the access request holding means is of predetermined contents when the local access controlling means has effected a process to maintain coherency of the data in the cache memory according to the coherency request sent from the home access controlling means in another one of the nodes.
The access request re-processing means enables the local access controlling means to re-process the access request held by the access request holding means if the second reply is of predetermined contents, the state of the data stored in the cache state storage means is of predetermined contents, and the request conflict stored in the conflict request storage means is of predetermined contents, when the local access controlling means has effected a process to maintain coherency of the data in the cache memory according to a second reply sent from the home access controlling means in another one of the nodes.
Each of the nodes further comprises reply accumulating means, coherency request accumulating means, and third accumulating means.
The reply accumulating means accumulates second replies sent from the home access controlling means in any one of the nodes to the local access controlling means in that node.
The coherency request accumulating means accumulates coherency requests sent from the home access controlling means in any one of the nodes to the local access controlling means in that node.
The third accumulating means accumulates either coherency requests or second replies sent from the home access control means through the interconnection network to another one of the nodes.
In the multiprocessor system, the local access control means arbitrates a memory access from the processor and a message for coherency maintenance control in the coherency request accumulating means according to reply messages accumulated in the reply accumulating means. The main memory state storage means stores the state of data stored in the main memory, which includes a state representing that the process of maintaining coherency is being carried out. When the state stored in the main memory state storage means indicates that the process of maintaining coherency is being carried out on data corresponding to an access request, the home access controlling means saves the access request in the main memory.
In the multiprocessor system according to the present invention, the process that is carried out for maintaining coherency of the data stored in the main memory and the data stored in the cache memory will not enter an infinite loop. The multiprocessor system ensures that the processor will obtain the result of a memory access within a finite period of time.
It is not necessary to add a hardware arrangement, particularly interconnection networks, to the multiprocessor system for the avoidance of deadlock. Consequently, the multiprocessor system according to the present invention is relatively highly reliable and low in cost.
The above and other objects, features, and advantages of the present invention will become apparent from the following description with reference to the accompanying drawings which illustrate examples of the present invention.