Individual processor speed continues to increase with new technology. Greater performance is also attainable by using clusters of nodes with multiple processors. For example, database systems often distribute portions of a database across several nodes in a cluster in order to improve performance and provide scalability. The use of multiple nodes requires methods for sharing data between nodes. Clusters may be configured as coherent memory clusters or compute clusters.
Nodes on a coherent memory cluster share physical memory, which allows the nodes on the cluster to communicate very quickly. To send and receive messages between two nodes on a coherent memory cluster, one node writes data to the shared memory and the other node reads the data from the shared memory. However, coherent memory clusters are expensive, and the size of the shared memory is limited.
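The write-then-read exchange described above can be illustrated with a minimal sketch. It assumes that a named shared-memory region on a single machine stands in for the cluster's shared physical memory. The length-prefix framing and the names `write_message`, `read_message`, and `demo` are hypothetical and chosen only for illustration; a real coherent memory cluster would also need synchronization between the writer and the reader, which is omitted here.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical framing (an assumption): a 4-byte little-endian payload length
# stored at the start of the shared region, followed by the payload bytes.
LENGTH_PREFIX = struct.Struct("<I")


def write_message(shm, payload: bytes) -> None:
    """Sender side: copy a length-prefixed payload into the shared region."""
    needed = LENGTH_PREFIX.size + len(payload)
    if needed > shm.size:
        # The shared region has a fixed, limited size.
        raise ValueError("message does not fit in the shared region")
    LENGTH_PREFIX.pack_into(shm.buf, 0, len(payload))
    shm.buf[LENGTH_PREFIX.size:needed] = payload


def read_message(shm) -> bytes:
    """Receiver side: read the length prefix, then copy the payload out."""
    (length,) = LENGTH_PREFIX.unpack_from(shm.buf, 0)
    start = LENGTH_PREFIX.size
    return bytes(shm.buf[start:start + length])


def demo() -> bytes:
    """One handle writes, a second handle attached by name reads."""
    region = shared_memory.SharedMemory(create=True, size=64)
    try:
        write_message(region, b"hello from node A")
        peer = shared_memory.SharedMemory(name=region.name)
        try:
            return read_message(peer)
        finally:
            peer.close()
    finally:
        region.close()
        region.unlink()
```

In this sketch no message is copied through a network stack, which reflects why communication through shared memory is fast, and the size check reflects the limited capacity of the shared region.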
Nodes on a compute cluster do not share physical memory. Instead, communication between nodes on a compute cluster may be performed through messaging, and a receiving node may need to reassemble incoming messages and store the reassembled messages in its main memory. Typically, nodes on a compute cluster communicate over a common bus, for example to access memory local to another node. One drawback of a shared bus architecture is that the common bus becomes a performance-limiting element as internode communications queue up and compete for the use of the common bus. Once the common bus is saturated or nearly saturated, adding nodes yields very little improvement in performance.
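The saturation effect can be made concrete with a deliberately simplified model. It assumes that every node offers the same internode traffic and that the common bus has a fixed capacity. The function names and the figures in the usage note are hypothetical, and the model ignores queueing delay and arbitration overhead, which in practice degrade performance before the bus is fully saturated.

```python
def aggregate_throughput(nodes: int, per_node_demand: float,
                         bus_capacity: float) -> float:
    """Total internode traffic actually carried by the common bus.

    Simplifying assumption: traffic scales linearly with the node count
    until it reaches the bus capacity, after which it cannot grow.
    """
    return min(nodes * per_node_demand, bus_capacity)


def marginal_gain(nodes: int, per_node_demand: float,
                  bus_capacity: float) -> float:
    """Extra throughput obtained by adding one more node to the cluster."""
    return (aggregate_throughput(nodes + 1, per_node_demand, bus_capacity)
            - aggregate_throughput(nodes, per_node_demand, bus_capacity))
```

With a hypothetical per-node demand of 100 messages per second and a bus capacity of 1000 messages per second, `marginal_gain(4, 100.0, 1000.0)` is 100.0, whereas `marginal_gain(10, 100.0, 1000.0)` is 0.0: once ten nodes saturate the bus, an eleventh node adds nothing.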
One technique to overcome the disadvantages of a shared bus architecture involves dedicated high-speed point-to-point communications links between node pairs. However, a complex, layered communications protocol is required to guarantee accurate, robust communication. At each node on a communication path, an interface processor must carry out this complex protocol as well as translate and validate the source and destination addresses. Performing these communications tasks lowers performance because the interface processors are generally much slower than the main CPU, and further because the coupling between the interface and the respective node's memory is poor. Thus, performance is also limited using a point-to-point architecture.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.