High performance computing systems based on multi-core architecture have multi-core integrated circuit dies (chips) connected by a network infrastructure. Communication among the processes running on the cores occurs both within a node (intra-node communication) and between nodes (inter-node communication). Message Passing Interface (MPI) is a standardized message-passing interface used for communication between processes, for example in parallel programming. MPI provides collective operations used for synchronization and communication among processes. Software that implements MPI in high performance computing systems uses the network to communicate between processes that reside on different physical nodes, and uses shared memory to communicate between processes on different cores within the same node.
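The split between intra-node and inter-node communication can be illustrated with a small sketch. This is a toy model rather than how any particular MPI implementation chooses its channel: it assumes a block placement in which consecutive ranks fill one node before the next, and the names `node_of`, `transport_for`, `count_transports`, and `cores_per_node` are hypothetical, introduced only for this example.

```python
from collections import Counter


def node_of(rank: int, cores_per_node: int) -> int:
    """Assumed block placement: ranks 0..k-1 on node 0, the next k on node 1, and so on."""
    return rank // cores_per_node


def transport_for(src: int, dst: int, cores_per_node: int) -> str:
    """Return the channel a rank pair would use under the scheme described above."""
    if node_of(src, cores_per_node) == node_of(dst, cores_per_node):
        return "shared_memory"  # intra-node: both ranks run on the same node
    return "network"  # inter-node: ranks run on different physical nodes


def count_transports(num_ranks: int, cores_per_node: int) -> Counter:
    """Count ordered (src, dst) pairs per channel, as in an all-to-all exchange."""
    counts = Counter()
    for src in range(num_ranks):
        for dst in range(num_ranks):
            if src != dst:
                counts[transport_for(src, dst, cores_per_node)] += 1
    return counts


if __name__ == "__main__":
    # Two nodes with four cores each (hypothetical sizes).
    print(count_transports(num_ranks=8, cores_per_node=4))
```

With two nodes of four cores each, this toy model assigns 24 of the 56 ordered rank pairs to shared memory and the remaining 32 to the network. Under the same placement assumption, raising the number of cores per node moves a larger share of pairs onto shared memory.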
As chip technology becomes more complex, for example as more and more cores are placed on a chip in current multi-core architectures, maintaining communication and coherence among the cores within the chip, as well as outside the chip, requires additional work and places a growing burden on shared resources.