Datacenter applications are rapidly evolving from simple data-serving tasks to sophisticated analytics operating over enormous datasets in response to real-time queries. To minimize response latency, datacenter operators keep the data in memory. As dataset sizes push into the petabyte range, the number of servers required to house them in memory can easily reach into the hundreds or even thousands.
Because the memory is distributed across servers, applications that traverse large data structures (e.g., graph algorithms) or frequently access disparate pieces of data (e.g., key-value stores) must do so over the datacenter network. As today's datacenters are built with commodity networking technology running on top of commodity servers and operating systems, node-to-node communication delays can exceed 100 microseconds (“μs”). In contrast, accesses to local memory incur delays of around 60 nanoseconds (“ns”), roughly a factor of 1000 less. The irony is rich: moving the data from disk to main memory yields a 100,000× reduction in latency (10 milliseconds (“ms”) vs. 100 ns), but distributing the memory eliminates 1000× of the benefit.
The reasons for the high communication latency are well known and include deep network stacks, complex network interface cards (“NICs”), and slow chip-to-NIC interfaces. Remote direct memory access (“RDMA”) reduces end-to-end latency by enabling memory-to-memory data transfers over InfiniBand and Converged Ethernet fabrics. By exposing remote memory at user level and offloading network processing to the adapter, RDMA enables remote memory read latencies as low as 1.19 μs; however, that still represents a >10× latency increase over local dynamic random-access memory (“DRAM”).
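The latency comparisons above can be checked with simple arithmetic. The following is a minimal illustrative sketch, not part of any described system; it uses only the figures quoted in the text, and the dictionary keys and the `ratio` helper are hypothetical names chosen for illustration.

```python
# Latency figures quoted in the text, all expressed in nanoseconds.
# The key names are illustrative labels, not terms from the text.
LATENCY_NS = {
    "disk": 10_000_000,           # 10 ms
    "commodity_network": 100_000, # 100 us node-to-node delay
    "rdma_read": 1_190,           # 1.19 us best-case RDMA remote read
    "local_dram": 60,             # ~60 ns local memory access
    "main_memory_round": 100,     # 100 ns round figure used for the disk comparison
}


def ratio(slower: str, faster: str) -> float:
    """Return how many times larger the `slower` latency is than the `faster` one."""
    return LATENCY_NS[slower] / LATENCY_NS[faster]


# Disk vs. main memory, using the round 100 ns figure.
disk_vs_memory = ratio("disk", "main_memory_round")

# Commodity network vs. main memory, same round figure.
network_vs_memory = ratio("commodity_network", "main_memory_round")

# Commodity network vs. the ~60 ns local DRAM figure.
network_vs_dram = ratio("commodity_network", "local_dram")

# Best-case RDMA read vs. the ~60 ns local DRAM figure.
rdma_vs_dram = ratio("rdma_read", "local_dram")

if __name__ == "__main__":
    print(f"disk / main memory:       {disk_vs_memory:,.0f}x")
    print(f"network / main memory:    {network_vs_memory:,.0f}x")
    print(f"network / local DRAM:     {network_vs_dram:,.0f}x")
    print(f"RDMA read / local DRAM:   {rdma_vs_dram:.1f}x")
```

With the round 100 ns figure, the disk and network ratios come out to exactly 100,000× and 1000×. Against the 60 ns local DRAM figure, the commodity network is about 1,667× slower and the best-case RDMA read is about 20× slower, consistent with the ">10×" statement above.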
Thus, improvements would be desirable.