Some high-performance computer systems employ network interface controllers (NICs) that enable low-latency, CPU-efficient communication between machines. Remote Direct Memory Access (RDMA), in particular, is a technique in which NIC hardware implements sufficient functionality for one machine to efficiently access the memory of another machine without incurring the overhead of general-purpose software networking stacks that run on time-shared host processors. NICs with this functionality are sometimes referred to as “Smart NICs” because they implement significantly more functionality than a generic NIC that simply sends and receives host-generated packets. The functions of a Smart NIC include a reliable transport implemented in the NIC as well as a suite of “Verbs” that implement specific operations such as RDMA Read and Write.
General-purpose host networking stacks, running over conventional NICs, are unable to provide the same low-latency guarantees as dedicated hardware-based Smart NICs. First, kernel-based networking stacks are primarily designed to time-share cores with other applications and services. Time-sharing a processor inevitably adds latency from interrupt handling and context switching, with unpredictable scheduler delays. Second, time-shared networking stacks must be invoked through an operating system call, which has non-trivial overhead. Third, conventional host networking stacks do not implement Verbs that execute well-defined functions, such as RDMA Read and Write, inline with the transport. Instead, they must dispatch each incoming request to an application thread, and dispatching an application thread incurs significant CPU overhead and latency.
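The third point can be illustrated with a minimal sketch. It is a toy model, not an RDMA implementation: the names `memory`, `handle_verb`, `inline_transport`, and `dispatching_transport` are hypothetical and do not correspond to any real library. The same Read or Write request is served either directly in the receive path, or by waking an application thread and waiting for its reply.

```python
import queue
import threading

# Hypothetical registered memory region on the responding machine.
# The name and size are illustrative assumptions.
memory = bytearray(64)

def handle_verb(op, offset, payload=b"", length=0):
    """Execute a Read or Write against `memory` and return the response bytes."""
    if op == "WRITE":
        memory[offset:offset + len(payload)] = payload
        return b""
    if op == "READ":
        return bytes(memory[offset:offset + length])
    raise ValueError(f"unsupported op: {op}")

def inline_transport(request):
    """Verb-aware transport: the request is served in the receive path itself."""
    return handle_verb(**request)

def dispatching_transport(request):
    """Conventional stack: hand the request to an application thread and wait."""
    inbox, outbox = queue.Queue(), queue.Queue()

    def app_thread():
        outbox.put(handle_verb(**inbox.get()))

    worker = threading.Thread(target=app_thread)
    worker.start()          # thread creation and scheduling cost is paid here
    inbox.put(request)
    result = outbox.get()   # the transport blocks until the application replies
    worker.join()
    return result

write_req = {"op": "WRITE", "offset": 8, "payload": b"hello"}
read_req = {"op": "READ", "offset": 8, "length": 5}

inline_transport(write_req)
via_inline = inline_transport(read_req)
via_dispatch = dispatching_transport(read_req)
```

Both paths return the same bytes. The difference is that the dispatching path adds a thread hand-off and two queue operations per request, which stands in for the scheduling and context-switch cost described above.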
Hardware-based Smart NICs, however, have several problems of their own. First, they are inflexible: it is either difficult or impossible to extend the NIC with new Verbs, new congestion control algorithms, or bug fixes. Second, hardware-based Smart NICs often impose additional requirements on the surrounding system, such as a lossless network fabric.