Individual processing systems have greatly increased in performance. However, still greater performance is attainable by clusters of processing systems or nodes. A key factor in attaining high-performance clusters is communication among the nodes. FIG. 1 depicts one early technique in which the processing nodes 102, 104, 106, each with its own CPU 108a-c and local memory 110a-c, were coupled to each other via interfaces 112a-c by a common bus 110. Each node 102, 104, 106 was allowed to access the other nodes' memory, such that the processing nodes could be viewed as sharing one large memory. One drawback of this shared bus architecture was that the bus quickly became a performance-limiting element, because all of the internode communications queued up, competing for use of the bus. Once the bus 110 became saturated or nearly saturated, adding nodes provided very little improvement.
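The saturation effect described above can be illustrated with a deliberately simplified model. The following is only a sketch under stated assumptions: every node is assumed to offer the same internode traffic, the bus is assumed to have a fixed capacity, and arbitration overhead is ignored. The function names and the numeric values are hypothetical and do not come from FIG. 1.

```python
def bus_throughput(nodes: int, per_node_demand: float, bus_capacity: float) -> float:
    """Aggregate internode traffic carried by a single shared bus.

    Assumption: each node offers the same demand, and the bus carries
    at most bus_capacity regardless of how many nodes are attached.
    """
    return min(nodes * per_node_demand, bus_capacity)


def marginal_gain(nodes: int, per_node_demand: float, bus_capacity: float) -> float:
    """Extra throughput obtained by adding one more node to the bus."""
    return (bus_throughput(nodes + 1, per_node_demand, bus_capacity)
            - bus_throughput(nodes, per_node_demand, bus_capacity))


if __name__ == "__main__":
    # Hypothetical units: each node offers 10, the bus carries at most 50.
    for n in range(1, 9):
        print(n, bus_throughput(n, 10.0, 50.0), marginal_gain(n, 10.0, 50.0))
```

In this model, throughput grows linearly until the offered traffic reaches the bus capacity, after which each additional node contributes nothing further.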
In recognition of the disadvantages of the shared bus architecture, another technique, depicted in FIG. 2, has been employed. In FIG. 2, nodes 202, 204, 206, 208, 210 in the cluster, comprising CPUs 216a-e and memories 218a-e, are interconnected by dedicated high-speed point-to-point communications links 220a-j. If enough point-to-point connections 220a-j are used, creating a fabric of links, higher performance is achieved, because there is no shared bus contention. However, the point-to-point communications links 220a-j adhere to a complex, layered communications protocol to guarantee correctness and robustness of the communication. The architecture requires that I/O processors in the interfaces 214a-t carry out this complex protocol as well as translate and validate the source and destination addresses. Performing these communications tasks lowers performance for two reasons: the I/O processors are generally much slower than the main CPU in carrying out the protocols and address translation, and the coupling between the interface and the respective node's memory is poor. Thus, while higher performance is achieved in the cluster, the communications overhead and poor coupling cause the performance gain to reach an upper limit.
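The fabric of links described above can be sketched by enumerating node pairs. This sketch assumes that every pair of nodes is joined by exactly one link and that each link terminates in one interface at each end. That assumption is not stated in the text, although it is consistent with the ten links 220a-j and twenty interfaces 214a-t among the five nodes of FIG. 2. The function name `full_mesh` is hypothetical.

```python
from itertools import combinations


def full_mesh(node_ids):
    """Enumerate links and link endpoints for a fully connected cluster.

    Assumption: one dedicated link per pair of nodes, with one
    interface at each end of every link.
    """
    links = list(combinations(node_ids, 2))
    # Each interface is recorded as (owning node, link it terminates).
    interfaces = [(node, link) for link in links for node in link]
    return links, interfaces


if __name__ == "__main__":
    links, interfaces = full_mesh([202, 204, 206, 208, 210])
    print(len(links), len(interfaces))  # prints: 10 20
```

Under this assumption the number of links is n(n-1)/2 for n nodes, so the number of interfaces, each of which must run the layered protocol and perform address translation, grows quadratically with cluster size.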
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.