Networking has become an integral part of computer systems. To improve networking performance, datacenter server clusters used in high-performance computing (HPC) generally aim for a communication interconnect technology that can support two goals: (1) deliver bulk data with high throughput and low processor utilization, and (2) deliver other packets with lower latency than the bulk data.
These two goals can be at odds. In particular, optimizations for increasing throughput may add latency, for example, by batching packet transfers to increase efficiency. On the other hand, low-latency optimizations may require packets to be processed immediately, which may in turn prevent the packets from being batched and, more specifically, from being queued behind existing batches.
One throughput optimization technique that is used in some Ethernet controllers is TCP (transmission control protocol) segmentation offloading (TSO), which attempts to unburden the server by moving processor-intensive networking tasks off the server and onto the adapter card. This technique may increase the throughput in some situations; however, it may also make it more difficult for low-latency packets to bypass larger operations associated with TSO.
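By way of illustration only, the latency cost of queuing a small packet behind a large TSO operation may be estimated with a simple serialization-delay calculation. The following sketch is hypothetical: the function names, the 10 Gb/s link rate, the 64 KiB TSO operation size, and the 1500-byte frame size are assumptions chosen for the example, and the model ignores protocol headers, framing overhead, and adapter processing time.

```python
def serialization_delay_us(num_bytes: int, link_gbps: float) -> float:
    """Time to place num_bytes on the wire, in microseconds.

    bits / (Gb/s * 1e9) seconds equals bits / (Gb/s * 1e3) microseconds.
    """
    return num_bytes * 8 / (link_gbps * 1000)


def wait_behind_us(queued_bytes: list[int], link_gbps: float) -> float:
    """Delay seen by a packet that must wait for every queued operation
    ahead of it to finish transmitting (simple FIFO model, an assumption)."""
    return sum(serialization_delay_us(b, link_gbps) for b in queued_bytes)


LINK_GBPS = 10.0           # assumed link rate
TSO_OPERATION = 64 * 1024  # assumed size of one offloaded TSO operation, in bytes
SINGLE_FRAME = 1500        # assumed size of one ordinary frame, in bytes

# Small packet queued behind one large TSO operation.
behind_tso = wait_behind_us([TSO_OPERATION], LINK_GBPS)

# Same packet queued behind a single ordinary frame.
behind_frame = wait_behind_us([SINGLE_FRAME], LINK_GBPS)

print(f"wait behind TSO operation: {behind_tso:.2f} us")
print(f"wait behind single frame:  {behind_frame:.2f} us")
```

Under these assumptions, the small packet waits roughly 52 microseconds behind the TSO operation, compared with roughly 1.2 microseconds behind a single frame, which illustrates why a low-latency packet may benefit from bypassing a TSO operation rather than queuing behind it.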