Access to computer networks has become a ubiquitous part of today's computer usage. Whether accessing a Local Area Network (LAN) in an enterprise environment to reach shared network resources, or accessing the Internet via the LAN or other access point, it seems users are always logged on to at least one service that is accessed via a computer network. Moreover, the rapid expansion of cloud-based services has led to even further usage of computer networks, and these services are forecast to become ever more prevalent.
Expansion of network usage, particularly via cloud-based services, has been facilitated via substantial increases in network bandwidths and processor capabilities. For example, broadband network backbones typically support bandwidths of 10 Gigabits per second (Gbps) or more, while the standard for today's personal computers is a network interface designed to support a 1 Gbps Ethernet link. On the processor side, processor capabilities have been increased through both faster clock rates and use of more than one processor core. For instance, today's PCs typically employ a dual-core processor or a quad-core processor, while servers may employ processors with even more cores. For some classes of servers, it is common to employ multiple processors to enhance performance. In addition, it is envisioned that much if not most of the future processor performance increases will result from architectures employing greater numbers of cores, and that future servers may employ greater numbers of processors.
In computer systems, network access is typically facilitated through use of a Network Interface Controller (NIC), such as an Ethernet NIC. In recent years, server NICs have been designed to support many optimizations for multi-core, multi-processor platform architectures. These optimizations include Receive Side Scaling (RSS) and Application Targeted Routing (ATR). These optimizations were designed around the prior art front-side bus (FSB) platform architecture, as illustrated in FIG. 1.
In further detail, FIG. 1 depicts a simplified front-side bus architecture diagram for a symmetric multiprocessing (SMP) platform. The architecture includes multiple processors 100 coupled to a front-side bus (FSB) 102. Also coupled to FSB 102 is a North bridge 104, which in turn is coupled to memory 106, a high-bandwidth Input/Output (I/O) interface (as depicted by a Peripheral Component Interconnect Express (PCIe) x8 interface 108), and a South bridge 110. South bridge 110 was typically configured to interface with various platform I/O devices and peripherals, such as depicted by PCIe x4 interfaces 112 and 114.
Under this legacy architecture, the network interface controllers were attached via a PCIe interface to either North bridge 104 or South bridge 110, as depicted by NICs 116 and 118. In either case, the NICs communicated with a uniform memory 106 via North bridge 104. All accesses to memory 106 by processors 100 were also via North bridge 104. Implementations of RSS and ATR distributed network workloads across cores and, although cache impacts were considered, the primary goal was workload distribution.
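By way of illustration only, the workload distribution performed by RSS may be sketched as follows. The sketch assumes the commonly documented RSS scheme, in which a Toeplitz hash is computed over the source and destination addresses and ports of a received packet and the low-order bits of the hash index an indirection table of core numbers. The key value, table size, core count, and function names below are hypothetical and are not taken from any particular NIC or from the architecture of FIG. 1.

```python
import ipaddress
import struct

# Hypothetical 40-byte hash key; an actual driver would typically program a random key.
RSS_KEY = bytes(range(1, 41))

# Hypothetical indirection table: 128 entries spread round-robin over 4 cores.
NUM_CORES = 4
INDIRECTION_TABLE = [i % NUM_CORES for i in range(128)]


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Return the 32-bit Toeplitz hash of data under key."""
    key_bits = len(key) * 8
    if key_bits < len(data) * 8 + 32:
        raise ValueError("key is too short for this input length")
    key_int = int.from_bytes(key, "big")
    result = 0
    for i, byte in enumerate(data):
        for b in range(8):
            # For each set input bit (most significant bit first), XOR in the
            # 32-bit window of the key that starts at the same bit offset.
            if byte & (0x80 >> b):
                shift = key_bits - 32 - (i * 8 + b)
                result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def select_core(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> int:
    """Map an IPv4 TCP/UDP flow to a core number via the indirection table."""
    flow = (
        ipaddress.IPv4Address(src_ip).packed
        + ipaddress.IPv4Address(dst_ip).packed
        + struct.pack("!HH", src_port, dst_port)
    )
    flow_hash = toeplitz_hash(RSS_KEY, flow)
    # The low-order bits of the hash select the indirection table entry.
    return INDIRECTION_TABLE[flow_hash % len(INDIRECTION_TABLE)]


if __name__ == "__main__":
    print(select_core("10.0.0.1", "10.0.0.2", 1234, 80))
```

Because the hash depends only on the flow identifiers, every packet of a given flow is steered to the same core, while distinct flows tend to spread across the available cores. Consistent with the description above, such a scheme targets workload distribution and does not by itself account for where in memory the packet data resides.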
Processor architectures have also changed in recent years, moving from discrete components toward a highly integrated approach. For example, for many years, the North-bridge, South-bridge architecture was implemented using physically separate chips for North bridge 104 and South bridge 110, with wired interconnects (i.e., board traces) for the FSB and for the interconnect between the North and South bridges. Under a typical highly integrated design employed by today's processors, one or more processor cores and logic providing functionality somewhat similar to a North bridge and South bridge are integrated on a single chip, with corresponding interconnect wiring embedded in the chip. Under this highly integrated architecture, the processor cores are referred to as the “core” and the rest of the processor circuitry is referred to as the “uncore.”