Remote Direct Memory Access (RDMA) is a direct memory access mechanism that enables a computer to access the memory of another computer without involving either computer's operating system. RDMA supports zero-copy networking by enabling a network adapter to transfer data directly to or from application memory, eliminating the need to copy data between application memory and the data buffers in the operating system. Such transfers require no CPU work, cache activity, or context switches, and they proceed in parallel with other system operations. When an application performs an RDMA Read or Write request, the application data is delivered directly to the network, reducing latency and enabling fast message transfer.
Current RDMA-enabled network adapters, such as Internet Wide Area RDMA Protocol (iWARP) RDMA Network Interface Controllers (RNICs) or InfiniBand Host Channel Adapters (HCAs), use uncached memory-mapped input/output (MMIO) writes to the memory-mapped adapter address space to notify hardware about posted transmit or receive work queue elements (WQEs). These MMIO write transactions are called Doorbell Rings (DB Rings). Both InfiniBand and iWARP allow an application to communicate with hardware directly from the application address space. This is enabled by supporting numerous hardware queues, namely Send Queues (SQs) and Receive Queues (RQs), that can be mapped into and accessed directly from the application address space. Every time an application posts a new transmit or receive work request (WR), the user space library supplied by the hardware provider adds the request to the respective SQ or RQ.
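The posting sequence described above can be sketched as a toy software model. This is only an illustration under stated assumptions: the names `Doorbell`, `WorkQueue`, `post`, and the WR fields are hypothetical and do not correspond to any vendor library, and the doorbell is modeled as a counter rather than a real MMIO write.

```python
from dataclasses import dataclass, field


@dataclass
class Doorbell:
    """Stand-in for the adapter's MMIO doorbell register (hypothetical)."""
    rings: int = 0        # number of doorbell writes issued so far
    last_tail: int = -1   # last producer index written to the register

    def ring(self, tail: int) -> None:
        # On real hardware this would be one uncached MMIO write.
        self.rings += 1
        self.last_tail = tail


@dataclass
class WorkQueue:
    """Toy SQ or RQ: a fixed-size ring of WQEs plus a doorbell.

    Hardware consumption of WQEs is not modeled, so a full queue
    simply wraps around and overwrites the oldest slot.
    """
    depth: int
    doorbell: Doorbell = field(default_factory=Doorbell)
    wqes: list = field(default_factory=list)
    tail: int = 0  # producer index: total WQEs written so far

    def __post_init__(self) -> None:
        self.wqes = [None] * self.depth

    def post(self, wr: dict) -> None:
        # Step 1: the user space library writes the WQE into the queue.
        self.wqes[self.tail % self.depth] = wr
        self.tail += 1
        # Step 2: it rings the doorbell so hardware learns of the new WQE.
        self.doorbell.ring(self.tail)


sq = WorkQueue(depth=4)
sq.post({"opcode": "RDMA_WRITE", "length": 4096})
sq.post({"opcode": "SEND", "length": 64})
```

In this model every call to `post` costs exactly one doorbell ring, which is the per-request cost the following discussion is concerned with.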
Although both iWARP and InfiniBand semantically allow applications to post multiple WRs with a single request, in real deployments applications rarely use this capability and usually post one WR at a time. Every new WR posted to a hardware queue is therefore usually accompanied by a DB Ring to notify hardware that a new request has been added to the queue. RDMA-enabled network adapters are fairly complex and must maintain various hardware constructs to keep track of the state of hardware resources, such as Queue Pair context (a Queue Pair being a Send Queue and a Receive Queue), Memory Region context, Page Lists, etc. As the number of hardware queues and other resources grows, and as hardware shifts toward less expensive designs, many RDMA NICs are moving toward keeping hardware constructs in host memory and caching only the most frequently used ones, rather than keeping all hardware resources in dedicated on-chip or on-card memory. The increasing processing rates of RDMA NICs, combined with the migration of hardware resources to host memory, make frequent DB Rings followed by hardware construct updates a significant burden on the host platform interface (e.g., a PCIe interface). As a result, reducing or eliminating DB Rings becomes a very important factor in improving the performance and WR processing capabilities of RDMA NICs.
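The effect of posting one WR at a time can be illustrated with a small counting model. This is a sketch, not a measurement: `CountingQueue` and `post_list` are hypothetical names, the batch size of 16 is arbitrary, and the model assumes that each post request is followed by exactly one DB Ring regardless of how many WRs it carries.

```python
class CountingQueue:
    """Toy queue that counts doorbell rings (hypothetical model)."""

    def __init__(self) -> None:
        self.wqes = []
        self.doorbell_rings = 0

    def post_list(self, wrs) -> None:
        """Add every WR in the list, then ring the doorbell once."""
        wrs = list(wrs)
        if not wrs:
            return  # nothing posted, so nothing to announce
        self.wqes.extend(wrs)
        self.doorbell_rings += 1  # assumed: one ring per post request


requests = [{"id": i} for i in range(1024)]

# Common case described above: one WR per post request.
single = CountingQueue()
for wr in requests:
    single.post_list([wr])

# Alternative: the same WRs grouped 16 per post request.
batched = CountingQueue()
for start in range(0, len(requests), 16):
    batched.post_list(requests[start:start + 16])
```

Under these assumptions, `single.doorbell_rings` is 1024 and `batched.doorbell_rings` is 64 for the same 1024 WRs, which shows why the number of DB Rings depends on how applications post their requests rather than on the amount of work submitted.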