Exascale systems are projected to integrate on the order of 100,000 processor nodes, with each node capable of providing 8-16 TFLOP/s (trillion floating-point operations per second) of peak compute performance. The interconnect fabric of such systems is expected to have approximately 10,000 to 30,000 switch components. At this scale, it may be essential that the two key silicon components, namely the processor and the switch, are architected to achieve energy efficiency and performance within the cost targets for such systems. A processor is an array of a large number of compute cores with memories, interconnected using an on-die interconnect fabric, while a switch is primarily just the on-die interconnect fabric, which provides connectivity between input-output (IO) ports.
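The scale figures above can be checked with simple arithmetic. The following is a minimal sketch, not part of the original description: the function and constant names are illustrative, and the inputs are the projected ranges quoted above.

```python
# Illustrative arithmetic only; names are hypothetical, inputs are the
# projected figures quoted in the text.
NODES = 100_000                       # processor nodes (order of magnitude)
PEAK_TFLOPS_PER_NODE = (8, 16)        # per-node peak, low and high ends
SWITCH_COUNT = (10_000, 30_000)       # projected number of switch components
TFLOPS_PER_EFLOPS = 1_000_000         # 1 EFLOP/s = 10^6 TFLOP/s


def aggregate_peak_eflops(nodes, tflops_per_node):
    """Aggregate peak compute of the whole system, in EFLOP/s."""
    return nodes * tflops_per_node / TFLOPS_PER_EFLOPS


def nodes_per_switch(nodes, switches):
    """Average number of processor nodes per switch component."""
    return nodes / switches


low_eflops = aggregate_peak_eflops(NODES, PEAK_TFLOPS_PER_NODE[0])
high_eflops = aggregate_peak_eflops(NODES, PEAK_TFLOPS_PER_NODE[1])
ratio_few_switches = nodes_per_switch(NODES, SWITCH_COUNT[0])
ratio_many_switches = nodes_per_switch(NODES, SWITCH_COUNT[1])

print(f"Aggregate peak: {low_eflops:.1f} to {high_eflops:.1f} EFLOP/s")
print(f"Nodes per switch: {ratio_many_switches:.1f} to {ratio_few_switches:.1f}")
```

Under these figures the aggregate peak falls around the exaFLOP/s mark (about 0.8 to 1.6 EFLOP/s), and each switch serves on average only a handful of nodes, which is consistent with the relatively low switch volumes discussed below.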
Achievable IO bandwidths for processor and switch components are limited by the capabilities of low-cost packaging and energy-efficient signaling technologies. While the die size of the processor is determined by the compute logic area required to achieve the target performance, the die size of the switch is dictated by the number of IO pins required on the periphery to support the targeted bandwidth. The actual logic area of the switch is minuscule in comparison, resulting in significant “white space” in the silicon. Considering the very high cost of the mask sets used for chip fabrication and the relatively low volume requirements of switches, designing standalone switch components is not a very cost-effective approach.
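The pad-limited nature of a switch die can be illustrated with a rough geometric model. The sketch below is an assumption-laden illustration rather than a description of any actual design: it assumes a square die with a single ring of IO pads along its periphery, and the pad count, pad pitch, and logic area are hypothetical placeholder values that do not come from the text.

```python
# Rough pad-limited die model. All numeric inputs are HYPOTHETICAL
# placeholders chosen for illustration; none come from the source text.

def pad_limited_side_mm(num_pads, pad_pitch_mm):
    """Side length of a square die whose perimeter holds one ring of pads."""
    perimeter_mm = num_pads * pad_pitch_mm
    return perimeter_mm / 4.0


def white_space_fraction(die_area_mm2, logic_area_mm2):
    """Fraction of the die not occupied by logic (the 'white space')."""
    return 1.0 - logic_area_mm2 / die_area_mm2


NUM_PADS = 2000          # hypothetical IO pad count needed for target bandwidth
PAD_PITCH_MM = 0.05      # hypothetical pad pitch along the die edge
LOGIC_AREA_MM2 = 50.0    # hypothetical area of the switch fabric logic

side_mm = pad_limited_side_mm(NUM_PADS, PAD_PITCH_MM)
die_area_mm2 = side_mm ** 2
unused = white_space_fraction(die_area_mm2, LOGIC_AREA_MM2)

print(f"Die side: {side_mm:.1f} mm, die area: {die_area_mm2:.0f} mm^2")
print(f"White space: {unused:.0%}")
```

With these placeholder values the periphery forces a die of about 625 mm², of which roughly 92% is white space. The point is that in this model the die area grows with the square of the pad count, independent of how little logic the switch needs.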