The present disclosure relates generally to messaging between processing modules and, more particularly, to the management of Remote Direct Memory Access (RDMA) in a multiple processor environment.
Direct memory access (DMA) is a feature of computers and microprocessors that allows certain hardware subsystems within the computer or microprocessor to access system memory for reading and/or writing independently of the central processing unit (CPU). Many hardware systems use DMA, including but not limited to disk drive controllers, graphics cards, network cards, sound cards and graphics processing units (GPUs). DMA is also used for intra-chip data transfer in multi-core processors, especially in multiprocessor systems-on-chip, where each processing element may be equipped with a local memory and DMA is used to transfer data between that local memory and the main memory.
Remote Direct Memory Access (RDMA) allows data to move directly from the memory of one computer or microprocessor into that of another without involving the operating system of either. This permits high-throughput, low-latency networking, which is especially useful in massively parallel computer clusters. RDMA extends the DMA principle across machine boundaries: the transfer is carried out by the network hardware rather than by the CPU.
Message passing is a common feature between two cooperating computing systems. Asynchronous message passing is preferable for performance reasons. To support a full-duplex asynchronous messaging model, a queue common to both message passing participants is often used. To support full-duplex operation, both participants may manage the shared queue, which requires locking. Even though locking can hinder performance, it is arguably better than the complexity and overhead required for one participant to manage all queue operations.
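By way of illustration only, the shared, lock-protected queue described above can be sketched as follows. The sketch assumes both participants run as threads in a single address space, which sidesteps the memory-domain issues discussed next; the names `SharedQueue`, `enqueue` and `dequeue_for` are hypothetical and are not taken from the disclosure.

```python
import threading
from collections import deque


class SharedQueue:
    """One queue used by both participants; every operation takes the same lock."""

    def __init__(self):
        self._items = deque()
        self._lock = threading.Lock()

    def enqueue(self, sender, payload):
        # Either participant may add a message; the lock serializes access.
        with self._lock:
            self._items.append((sender, payload))

    def dequeue_for(self, receiver):
        """Remove and return the oldest message not sent by `receiver`, or None."""
        with self._lock:
            for i, (sender, payload) in enumerate(self._items):
                if sender != receiver:
                    del self._items[i]
                    return payload
            return None
```

Because both directions of traffic pass through the same lock, each participant can enqueue and dequeue without waiting for the other to act on its behalf, at the cost of contention on that lock.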
Supporting a shared queue is more complicated in a heterogeneous memory system. Queue elements cannot simply be chained into a linked list, because a pointer to an element is valid in only one of the two memory domains. Consequently, a more complicated method must be used to manage the order of the queue elements.
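One commonly used way around the pointer problem, shown here only as a hedged sketch and not as the method of the present disclosure, is to link elements by slot index relative to the start of the shared region, so that each participant can resolve a link against its own mapping of that region. The sketch below assumes a single `bytearray` standing in for the shared region, 32-bit payload words, and no locking; the names `IndexLinkedQueue`, `NIL`, `HEADER` and `SLOT` are hypothetical.

```python
import struct

NIL = 0xFFFFFFFF                 # sentinel index meaning "no element"
HEADER = struct.Struct("<III")   # head index, tail index, free-list head
SLOT = struct.Struct("<II")      # next index, payload word


class IndexLinkedQueue:
    """FIFO whose links are slot indices, not pointers (no locking in this sketch)."""

    def __init__(self, buf, capacity, init=True):
        self.buf = buf
        self.capacity = capacity
        if init:
            # Empty queue; all slots are chained onto the free list.
            HEADER.pack_into(buf, 0, NIL, NIL, 0)
            for i in range(capacity):
                nxt = i + 1 if i + 1 < capacity else NIL
                SLOT.pack_into(buf, self._off(i), nxt, 0)

    def _off(self, index):
        # Translate a slot index into a byte offset within this view's buffer.
        return HEADER.size + index * SLOT.size

    def push(self, value):
        head, tail, free = HEADER.unpack_from(self.buf, 0)
        if free == NIL:
            return False                      # no free slot available
        next_free, _ = SLOT.unpack_from(self.buf, self._off(free))
        SLOT.pack_into(self.buf, self._off(free), NIL, value)
        if tail == NIL:
            head = free                       # queue was empty
        else:
            _, tail_value = SLOT.unpack_from(self.buf, self._off(tail))
            SLOT.pack_into(self.buf, self._off(tail), free, tail_value)
        HEADER.pack_into(self.buf, 0, head, free, next_free)
        return True

    def pop(self):
        head, tail, free = HEADER.unpack_from(self.buf, 0)
        if head == NIL:
            return None                       # queue is empty
        nxt, value = SLOT.unpack_from(self.buf, self._off(head))
        # Return the slot to the free list.
        SLOT.pack_into(self.buf, self._off(head), free, value)
        if nxt == NIL:
            tail = NIL
        HEADER.pack_into(self.buf, 0, nxt, tail, head)
        return value
```

In use, one participant would construct the queue over the region with `init=True` and the other would attach to the same region with `init=False`; since every link is an index, neither side ever stores an address that is meaningful only to itself. A real implementation would additionally need the locking discussed above and a transport that keeps the two copies of the region coherent.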