In a typical cloud-based data center, a large collection of interconnected servers provides computing and/or storage capacity for execution of various applications. For example, a data center may comprise a facility that hosts applications and services for subscribers, i.e., customers of the data center. The data center may, for example, host all of the infrastructure equipment, such as compute nodes, networking and storage systems, power systems, and environmental control systems. In most data centers, clusters of storage systems and application servers are interconnected via a high-speed switch fabric provided by one or more tiers of physical network switches and routers. Data centers vary greatly in size, with some public data centers containing hundreds of thousands of servers, and are usually distributed across multiple geographies for redundancy. A typical data center switch fabric includes multiple tiers of interconnected switches and routers. In current implementations, packets for a given packet flow between a source server and a destination server or storage system are always forwarded from the source to the destination along a single path through the routers and switches comprising the switching fabric.
Conventional compute nodes hosted by data centers typically include components such as a central processing unit (CPU), a graphics processing unit (GPU), random access memory, storage, and a network interface card (MC), such as an Ethernet interface, to connect the compute node to a network, e.g., a data center switch fabric. Typical compute nodes are processor centric such that overall computing responsibility and control is centralized with the CPU. As such, the CPU performs processing tasks, memory management tasks such as shifting data between local caches within the CPU, the random access memory, and the storage, and networking tasks such as constructing and maintaining networking stacks, and sending and receiving data from external devices or networks. Furthermore, the CPU is also tasked with handling interrupts, e.g., from user interface devices. Demands placed on the CPU have continued to increase over time, although performance improvements in development of new CPUs have decreased over time. General purpose CPUs are normally not designed for high-capacity network and storage workloads, which are typically packetized. In general, CPUs are relatively poor at performing packet stream processing, because such traffic is fragmented in time and does not cache well. Nevertheless, server devices typically use CPUs to process packet streams.