Multiprocessor, or parallel processing, computer systems rely on a plurality of microprocessors to handle computing tasks in parallel to reduce overall execution time. Originally, many multiprocessor systems interconnected a plurality of processors via a single bus, commonly referred to as a multidrop bus, which required each processor to connect to every line in the bus.
However, it has been found that the common bus may limit communication bandwidth, because every communication between processors passes through it. Moreover, it may be desirable to provide different processors on different boards or in different enclosures, which often makes it impractical to couple each processor to a common bus.
Because of these drawbacks, networked multiprocessor systems have also been developed, which utilize processors or groups of processors connected to one another across a network and communicating via "packets" or messages. Each processor or group of processors in these systems operates as a "node" in a network, and is only connected to one or more other nodes such that messages may be required to pass through one or more nodes to reach their destination. Among other benefits, the connection of nodes and the addition of new nodes is greatly facilitated since a node need only be connected to one or more other nodes in the network. Also, network bandwidth increases since different packets may be simultaneously transmitted between different sets of nodes.
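By way of illustration only, the multi-hop message passing described above may be sketched as follows. The sketch assumes a hypothetical four-node chain (`LINKS`) and a hypothetical helper `route` that uses a breadth-first search to find the nodes a message passes through; neither is taken from any particular system.

```python
from collections import deque


def route(links, src, dst):
    """Return a shortest hop-by-hop path from src to dst, or None if unreachable."""
    # Build an undirected adjacency map from the point-to-point links.
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    previous = {src: None}  # node -> node it was reached from
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk back along the recorded hops to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical topology: each node is linked only to its neighbors.
LINKS = [("A", "B"), ("B", "C"), ("C", "D")]

print(route(LINKS, "A", "D"))
# Adding a new node E requires only one new link, here to node D.
print(route(LINKS + [("D", "E")], "A", "E"))
```

In this sketch a message from node A to node D passes through nodes B and C, and a new node becomes reachable from every existing node after a single link is added.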
While much of the focus herein will be devoted to multiprocessor systems, it should be appreciated that the concept of distributing tasks between processors in multiprocessor systems may also be applied to distributed computer systems which distribute tasks between different computers in a networked environment (e.g., a LAN or WAN). Further, many of the functions and problems associated with multiprocessor and distributed computer systems are quite similar and equally applicable to both types of systems. Consequently, the term "networked computer system" will be used hereinafter to describe systems in which the nodes are implemented either as individual microprocessors or groups of processors (multiprocessor systems) or as individual computers which may separately utilize one or more processors (distributed computer systems).
One specific example of a networked multiprocessor system is the DASH (Directory Architecture for Shared memory) multiprocessor system, which includes a plurality of nodes or clusters interconnected via a mesh network. Each node includes a plurality of processors coupled to a local bus, along with a portion of the shared memory for the system. A network interface in each node handles memory requests to other nodes, and a directory in each node is used to monitor the status of local copies of the memory stored in caches in various processors in the system.
One significant problem that exists with many networked computer systems is that of deadlock, where nodes may in effect "lock up" due to an inability to pass packets or messages to other nodes. For example, a node, after sending a request to another node, may be required to wait for a response from the other node. While waiting for the response, the node may not be capable of receiving requests from other nodes, thereby halting the operation of other nodes in the system.
The DASH system addresses this problem (referred to as request-reply deadlock) by classifying packets into different classes (one for requests and one for responses), and by having separate networks, buffers and processing circuitry for handling the different packet classes. Consequently, even though a node may be waiting for a response, it may still process requests that are received while it is waiting for the response.
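The effect of keeping the two packet classes apart may be sketched, purely as an illustration, with a node that holds one buffer per class. The class name `Node`, its fields and its `step` method are hypothetical and are not a description of the DASH hardware; the only point shown is that a pending response does not prevent a queued request from being served.

```python
from collections import deque


class Node:
    """Hypothetical node with one buffer per packet class."""

    def __init__(self, name):
        self.name = name
        self.requests = deque()  # buffer for the request class
        self.replies = deque()   # separate buffer for the response class
        self.awaiting_reply = False
        self.served = []         # requests this node has processed

    def deliver(self, kind, payload):
        """Place an incoming packet in the buffer for its class."""
        if kind == "request":
            self.requests.append(payload)
        elif kind == "reply":
            self.replies.append(payload)
        else:
            raise ValueError(f"unknown packet class: {kind}")

    def step(self):
        """Consume one response if present, otherwise serve one request."""
        if self.replies:
            self.replies.popleft()
            self.awaiting_reply = False
            return "reply"
        if self.requests:
            # Served even while awaiting_reply is True.
            self.served.append(self.requests.popleft())
            return "request"
        return None


node = Node("B")
node.awaiting_reply = True            # B has an outstanding request of its own
node.deliver("request", "read X")     # a request arrives in the meantime
print(node.step(), node.served)
```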
However, it has been found that another form of deadlock, request-request deadlock, may also occur in multiprocessor systems. In particular, in some networked computer systems, sending a primary request to another node may result in the receiving or destination node sending out one or more secondary requests, e.g., to notify other nodes having local copies of a memory block that the block has been modified (often referred to as "invalidation" requests). If one or more of these secondary requests cannot be sent by the destination node, the node may block receipt of new requests from other nodes. This may result in two nodes each waiting for a response from the other.
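The manner in which a single primary request may give rise to secondary requests may be sketched as follows. The `Directory` class, its `sharers` map and the tuple format of the returned requests are assumptions made for illustration only and do not reproduce any particular directory protocol.

```python
class Directory:
    """Hypothetical directory tracking which nodes hold a copy of each block."""

    def __init__(self):
        self.sharers = {}  # block address -> set of node names holding a copy

    def read(self, block, node):
        """Record that a node has obtained a local copy of the block."""
        self.sharers.setdefault(block, set()).add(node)

    def write(self, block, node):
        """Handle a primary write request; return the secondary requests it causes."""
        others = sorted(self.sharers.get(block, set()) - {node})
        self.sharers[block] = {node}  # only the writer keeps a valid copy
        return [("invalidate", block, target) for target in others]


directory = Directory()
directory.read(0x40, "A")
directory.read(0x40, "C")
# A's write is the primary request; the invalidation sent to C is secondary.
print(directory.write(0x40, "A"))
```

Each entry in the returned list is a request that the destination node must itself send before it can respond to the primary request.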
For example, FIG. 1 shows a networked computer system with three nodes A, B and C (numbered 2, 4 and 6). If node A sends a primary request to node B, node B may send a secondary request to node C. Before node B can send a response to node A, however, it must wait for a response from node C. If node C has also sent a primary request to node B, node B may refuse to accept the request from node C because it is still processing the node A request. But since node C has sent a request to node B, it may also not accept the secondary request from node B. At this point, nodes B and C are each waiting for a response from the other before they will accept and process the pending requests, thereby resulting in deadlock in the system. It will be noted that the deadlock propagates to other nodes in the system (e.g., node A, since it waits for a response from node B).
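The three-node scenario may be expressed as a "wait-for" relationship and checked for a cycle, as in the following sketch. The map `WAITS_FOR` and the helper `find_cycle` are hypothetical, and the sketch assumes for simplicity that each node waits on at most one other node.

```python
def find_cycle(waits_for, start):
    """Follow wait-for edges from start; return the cycle reached, or None."""
    seen = []
    node = start
    while node in waits_for:
        if node in seen:
            # The chain has returned to an earlier node: that tail is a cycle.
            return seen[seen.index(node):]
        seen.append(node)
        node = waits_for[node]
    return None  # the chain ended at a node that is not waiting


# A waits for B's response; B waits for C's response to the secondary
# request; C waits for B's response to its own primary request.
WAITS_FOR = {"A": "B", "B": "C", "C": "B"}

print(find_cycle(WAITS_FOR, "B"))  # B and C wait on each other
print(find_cycle(WAITS_FOR, "A"))  # A is not in the cycle but is stuck behind it
```

If node C had not sent its own primary request, the entry for C would be absent and no cycle would be found.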
The DASH system addresses this problem by having nodes reject requests (send responses to the effect that the requests were refused and must be retried) whenever they are incapable of receiving requests. However, rejecting requests raises another problem that is somewhat unique to multiprocessor systems: that of ordering. In particular, it is extremely important in multiprocessor systems for commands or requests to complete in the order in which they were sent. Thus, if a string of requests is sent to a destination node, and one of the requests is rejected, the system cannot permit other requests that follow the rejected request to be processed before the rejected request, even if the destination node becomes able to process them.
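The ordering hazard may be illustrated with a deliberately naive sketch in which a rejected request is simply retried after the rest of the string has been delivered. The function `deliver_with_naive_retry` and its `rejected` argument are hypothetical and model only the reordering, not any real rejection protocol.

```python
def deliver_with_naive_retry(requests, rejected):
    """Return the order in which requests complete when rejected ones are retried last."""
    processed = []
    retry = []
    for request in requests:
        if request in rejected:
            retry.append(request)      # destination was unable to accept it
        else:
            processed.append(request)  # later requests slip ahead
    processed.extend(retry)            # the retry succeeds, but too late
    return processed


# The second of four requests is rejected on its first attempt.
print(deliver_with_naive_retry(["w0", "w1", "w2", "w3"], {"w1"}))
```

In this sketch request w1 completes after w2 and w3, which is precisely the result the system cannot permit.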
The conventional manners of addressing this problem are either to send requests one at a time (i.e., waiting for a response before sending the next request), or to detect rejected requests and reject each subsequent request in the chain, forcing the requesting node to resend all of the requests from the point of the first rejected request. The former, however, effectively serializes the communication between nodes and significantly degrades performance. The latter requires complicated logic for handling rejected requests at the destination node, due to the requirement of identifying and isolating requests in a particular chain from a particular node.
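The second conventional approach may be sketched as follows, under the assumption (not stated above) that each request carries a per-source sequence number so that the destination can recognize requests that follow a rejected one. The names `Destination`, `receive` and `send_chain` are hypothetical.

```python
class Destination:
    """Hypothetical destination that rejects every request following a rejected one."""

    def __init__(self):
        self.expected = {}   # source -> next sequence number that may be accepted
        self.processed = []  # requests in the order they completed
        self.busy = False    # True while the node cannot accept requests

    def receive(self, source, seq, payload):
        """Return True if accepted, False if rejected and to be resent."""
        if self.busy or seq != self.expected.get(source, 0):
            return False
        self.processed.append((source, seq, payload))
        self.expected[source] = seq + 1
        return True


def send_chain(dest, source, payloads, busy_on_first_try=()):
    """Send payloads in order, resending from the first rejected request."""
    start = 0
    rounds = 0
    while start < len(payloads):
        first_rejected = None
        for seq in range(start, len(payloads)):
            # Simulate the destination being unable to accept certain
            # requests during the first round only.
            dest.busy = rounds == 0 and seq in busy_on_first_try
            accepted = dest.receive(source, seq, payloads[seq])
            if not accepted and first_rejected is None:
                first_rejected = seq
        rounds += 1
        start = len(payloads) if first_rejected is None else first_rejected
    return rounds


dest = Destination()
rounds = send_chain(dest, "A", ["w0", "w1", "w2", "w3"], busy_on_first_try={1})
print(rounds, [payload for _, _, payload in dest.processed])
```

Ordering is preserved in this sketch, but requests w2 and w3 are each sent twice, and the destination must keep state for every source, which reflects the added traffic and per-chain bookkeeping noted above.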
Other networked computer systems attempt to address the request-request deadlock problem; however, to date, none have been capable of adequately ensuring proper ordering without the use of additional complex and performance-limiting processing. Therefore, a substantial need continues to exist for a networked computer system and method of communicating which avoids request-request deadlock without compromising system ordering or system performance.