A distributed data store is generally a computer network in which data is stored on a plurality of data storage systems. Each data storage system may be referred to as a node. For example, a data volume may be striped across multiple solid state drives (SSDs) in each node and across multiple nodes, and read and write operations from one node may be redirected to another node. Distributed data stores are highly scalable and less costly to maintain because the nodes can be easily added, removed, or replaced. As a result, distributed data stores are often used in data centers.
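As an illustration of the striping described above, the following is a minimal sketch of how a byte offset in a data volume might be mapped to a node and an SSD. The round-robin layout, the function name `locate`, and the parameters `stripe_size`, `num_nodes`, and `ssds_per_node` are assumptions made for this example and are not taken from any particular data store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StripeLocation:
    node: int    # index of the node holding the stripe
    ssd: int     # index of the SSD within that node
    offset: int  # byte offset within that SSD


def locate(volume_offset: int, stripe_size: int,
           num_nodes: int, ssds_per_node: int) -> StripeLocation:
    """Map a volume byte offset to a (node, SSD, offset) location.

    Hypothetical layout: stripes are assigned round-robin, first
    across nodes and then across the SSDs within each node.
    """
    if volume_offset < 0:
        raise ValueError("volume_offset must be non-negative")
    if min(stripe_size, num_nodes, ssds_per_node) <= 0:
        raise ValueError("sizes and counts must be positive")

    stripe_index, within_stripe = divmod(volume_offset, stripe_size)
    num_targets = num_nodes * ssds_per_node

    target = stripe_index % num_targets  # which (node, SSD) pair
    row = stripe_index // num_targets    # stripes already on that SSD

    return StripeLocation(
        node=target % num_nodes,
        ssd=target // num_nodes,
        offset=row * stripe_size + within_stripe,
    )
```

Under this assumed layout, consecutive stripes land on different nodes, so a large sequential read or write is spread across the nodes rather than concentrated on one of them.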
Although distributed data stores have been implemented using traditional Ethernet networks and traditional client-server architectures, such distributed data stores often suffered from high latency unless sufficient server CPU cores were provisioned to satisfy peak loads. This meant that server CPUs needed to be upgraded continually over time to satisfy increasing load demands, or that the clients needed to wait for additional server CPU resources to be spun up during load spikes.
Remote direct memory access (RDMA) offers a solution to the high latency problem of traditional Ethernet networks and traditional client-server architectures. RDMA allows direct memory access from the memory of one computer into that of another without involving either one's operating system. This permits high-throughput, low-latency networking, which is especially useful in large, parallel computer clusters. In view of these benefits, RDMA-enabled distributed data stores are increasingly being adopted.
Typically, when an application running on a local node performs an RDMA read or write operation on a remote node, the application has to specify a target address of an RDMA buffer at the remote node. The RDMA buffer is where the local node reads from or writes to in a remote RDMA operation. However, because each application running on each node manages its own buffer memory space and decides which portion thereof is available for use as an RDMA buffer, an RDMA buffer registration process is performed between the local node and the remote node before every remote RDMA operation. During the buffer registration process, an application running on the remote node allocates memory space for an RDMA buffer, and a target address of the RDMA buffer is shared with the local node. The remote node may share the target address with the local node via internode communication, for example, by exchanging small messages via RDMA.
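The per-operation registration described above can be sketched as a small in-memory simulation. This sketch does not use a real RDMA library. The classes `RemoteNode` and `LocalNode`, their methods, and the bump-style allocator are hypothetical and serve only to show that each remote write is preceded by its own registration exchange.

```python
class RemoteNode:
    """Simulated remote node that owns its buffer memory space."""

    def __init__(self, memory_size: int) -> None:
        self.memory = bytearray(memory_size)
        self._next_free = 0  # simple bump allocator (assumption)

    def register_buffer(self, length: int) -> int:
        """Allocate an RDMA buffer and return its target address."""
        if self._next_free + length > len(self.memory):
            raise MemoryError("remote buffer space exhausted")
        address = self._next_free
        self._next_free += length
        return address

    def rdma_write(self, address: int, payload: bytes) -> None:
        """Place the payload directly at the given target address."""
        self.memory[address:address + len(payload)] = payload

    def rdma_read(self, address: int, length: int) -> bytes:
        """Return the bytes stored at the given target address."""
        return bytes(self.memory[address:address + length])


class LocalNode:
    """Simulated local node that registers a buffer before every write."""

    def __init__(self, remote: RemoteNode) -> None:
        self.remote = remote
        self.registration_exchanges = 0

    def write(self, payload: bytes) -> int:
        # Step 1: buffer registration. The remote node allocates the
        # buffer and shares its target address with the local node.
        address = self.remote.register_buffer(len(payload))
        self.registration_exchanges += 1
        # Step 2: the remote RDMA write itself, using that address.
        self.remote.rdma_write(address, payload)
        return address


remote = RemoteNode(memory_size=64)
local = LocalNode(remote)
first = local.write(b"abc")
second = local.write(b"defg")
```

In this simulation, `local.registration_exchanges` grows by one for every write, so the registration overhead scales with the number of remote operations rather than being paid once.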
Consequently, traditional applications using RDMA for remote node communication may encounter latency issues as a result of the RDMA buffer registration process, and these latency issues may adversely affect the I/O performance of the distributed data store.