Multi-node computer systems may be partitioned into domains, with each domain functioning as an independent machine with its own address space. An operating system runs separately on each domain. Partitioning allows the resources of a computer network to be allocated efficiently to different tasks, provides flexibility in the use of a computer system, and isolates computer resources from one another so that hardware or software faults in one part of the network do not interfere with the operation of the entire system. The domains are isolated from one another so that a domain cannot directly read from or write to the shared address space of another domain.
Conventional messaging mechanisms for passing messages between domains in a partitioned computer system are known. For example, conventional implementations perform messaging from an input/output (I/O) device in one domain (the sending domain) to an I/O device in another domain (the receiving domain). This approach presents several disadvantages. First, it requires a direct memory access (DMA) read in the sending domain, in which an I/O network interface controller reads data from memory in the sending domain. It further requires a DMA write, in which an I/O network interface controller writes data to memory in the receiving domain. Each DMA transfer incurs the additional overhead of processor I/O accesses.
Second, because the messaging driver runs over a network protocol stack, round-trip latency for short messages becomes quite long. Moreover, the conventional implementation requires polling a hardware (H/W) write pointer register to determine when valid data has arrived in the receiving domain. Polling the H/W write pointer register generates transactions on the processor interface that result in high bandwidth overhead. Furthermore, because messages are fragmented in network routers and switches, they may arrive in fragments that are smaller than a cache line. Such transfers are inefficient because they waste bandwidth in the interconnect and increase overhead in memory.
Therefore, it is desirable to have a mechanism that allows the system to pass cache-line-sized messages between domains. It is further desirable to provide an in-memory notification when valid data arrives in the receiving domain, without generating transactions on the processor interface.
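By way of illustration only, the following C sketch shows one way a cache-line-sized message could carry its own in-memory notification, so that a receiver inspects ordinary memory rather than a hardware register. It is not the mechanism of any particular implementation. The 64-byte line size, the sequence-number field used as the valid indicator, and the names msg_slot, slot_deliver, and slot_try_receive are all assumptions introduced for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define CACHE_LINE_SIZE 64u   /* assumed line size; platform-specific in practice */
#define PAYLOAD_SIZE (CACHE_LINE_SIZE - sizeof(uint32_t))

/* Hypothetical layout: one message fills exactly one cache line of
 * receiver memory, and its last word doubles as the notification. */
typedef struct {
    uint8_t payload[PAYLOAD_SIZE];
    _Atomic uint32_t seq;     /* 0 = empty, nonzero = sequence number of a valid message */
} msg_slot;

_Static_assert(sizeof(msg_slot) == CACHE_LINE_SIZE,
               "a slot must occupy exactly one cache line");

/* Stand-in for message delivery: write the payload first, then publish
 * the sequence number so the receiver never observes a partial message. */
void slot_deliver(msg_slot *slot, const void *data, size_t len, uint32_t seq)
{
    assert(len <= PAYLOAD_SIZE && seq != 0);
    memset(slot->payload, 0, PAYLOAD_SIZE);
    memcpy(slot->payload, data, len);
    atomic_store_explicit(&slot->seq, seq, memory_order_release);
}

/* Receiver side: check the in-memory sequence number. Returns 1 and
 * copies PAYLOAD_SIZE bytes into out if the expected message is present,
 * otherwise returns 0 and leaves out untouched. */
int slot_try_receive(msg_slot *slot, uint32_t expected_seq, void *out)
{
    if (atomic_load_explicit(&slot->seq, memory_order_acquire) != expected_seq)
        return 0;
    memcpy(out, slot->payload, PAYLOAD_SIZE);
    return 1;
}
```

In this sketch the receiver's check is an ordinary memory load of a line it already owns, which is the property the paragraph above asks for. Placing the indicator in the same line as the data means a single line-sized transfer delivers both the message and its notification.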