Conventionally, networks in data centers, High-Performance Computing (HPC), and the like are built with rigidly structured architectures. Some examples known in data center networks are Fat Tree (Clos), Dragonfly, Slim Fly, and B-Cube. Specifically, a Fat Tree or Clos network is frequently used in modern data centers. These networks suffer from some well-known problems. First, there is increased latency due to many hops, especially as the number of layers grows with structured network architectures. High network loads can also fill switch buffers, further increasing latency. Second, structured network architectures are deployed in discrete implementation sizes, and higher layer ports may go unused in an underfilled network. FIG. 1 is a network diagram of a three-layer leaf-spine folded-Clos network 10 with various switches 12 interconnecting servers 14. Of note, the Clos network 10 is used to draw comparisons with the systems and methods described herein. As an example, with the three-layer (L=3) Clos network 10 using k-port switches, the following relationships between port count and server and switch counts exist: k=24→servers=3456, switches=720; k=32→servers=8192, switches=1280; k=64→servers=65536, switches=5120. For Clos computations (k-port switches, L switching layers), the number of layers required is $L = \log(N_{serv}/2)/\log(k/2) \approx \log(N_{serv})/\log(k)$; the number of servers is $N_{serv} = k \cdot (k/2) \cdot (k/2) \cdots = 2 \cdot (k/2)^{L}$; and the total switch count is $N_{switch} = (2L-1) \cdot (k/2)^{L-1}$.
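As a check on the Clos computations above, the following minimal Python sketch evaluates the server and switch count formulas for the three example port counts. The function names are hypothetical and chosen only for illustration, and the sketch assumes an even port count k.

```python
def clos_servers(k: int, layers: int) -> int:
    """Server count of a folded-Clos network: Nserv = 2 * (k/2)^L.

    Assumes k is even so that k/2 is an integer.
    """
    return 2 * (k // 2) ** layers


def clos_switches(k: int, layers: int) -> int:
    """Total switch count: Nswitch = (2L - 1) * (k/2)^(L - 1)."""
    return (2 * layers - 1) * (k // 2) ** (layers - 1)


if __name__ == "__main__":
    # Reproduce the three-layer (L=3) examples given in the text.
    for k in (24, 32, 64):
        print(f"k={k}: servers={clos_servers(k, 3)}, switches={clos_switches(k, 3)}")
```

For L=3, the sketch reproduces the server and switch counts listed above for k=24, k=32, and k=64.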
Third, structured network architectures have difficulty scaling horizontally because growth requires additional layers. Horizontal scaling is explained as follows. In general, hardware devices such as Application Specific Integrated Circuits (ASICs) are port limited by available pins. This means bandwidth can increase, but most increases are achieved by increasing port speeds, such as from 25G to 56G, i.e., port counts are difficult to increase. However, port counts determine horizontal fan-out capability, such as in the Clos network 10. Therefore, horizontal network growth will eventually require more network layers. Each layer requires interconnect, which in turn requires high power backplanes and/or expensive optics. Fourth, structured network architectures are susceptible to cluster-packing problems, in which processing jobs are confined within clusters to reduce latency and improve efficiency. However, the processor (CPU), storage, etc. resources in the cluster must then be sized to anticipate large loads and can often be underutilized when the loads are smaller.
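The interaction between a fixed port count and layer growth can be sketched with the Clos server count relationship Nserv = 2*(k/2)^L. The minimal Python sketch below finds the smallest number of layers that supports a target server count and the fraction of the resulting capacity actually used. The function names and the target server count of 100,000 are hypothetical values chosen only for illustration, and k is assumed even.

```python
def clos_capacity(k: int, layers: int) -> int:
    """Maximum servers for k-port switches and L layers: 2 * (k/2)^L."""
    return 2 * (k // 2) ** layers


def layers_needed(k: int, n_servers: int) -> int:
    """Smallest integer L whose capacity covers n_servers."""
    layers = 1
    while clos_capacity(k, layers) < n_servers:
        layers += 1
    return layers


def fill_fraction(k: int, n_servers: int) -> float:
    """Fraction of the deployed capacity that the target actually uses."""
    return n_servers / clos_capacity(k, layers_needed(k, n_servers))


if __name__ == "__main__":
    k = 32
    for target in (8192, 8193, 100_000):  # illustrative targets only
        print(target, layers_needed(k, target), round(fill_fraction(k, target), 3))
```

With k=32, a target of 8192 servers fits exactly in three layers, while a single additional server forces a fourth layer whose capacity is then almost entirely unused. This illustrates both the discrete implementation sizes and the layer growth described above.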
Fifth, the number of required ASICs increases super-linearly as the number of layers increases, causing further problems. The capacity increase experienced in data center networks is far outpacing what the semiconductor industry is able to offer in packet switch ASIC bandwidth growth. This problem is likely to be exacerbated in the future by 5G wireless networks, the proliferation of Internet of Things (IoT) devices, and edge-cached content with massive replication, all of which continue to drive huge data bandwidth. At the same time, complementary metal-oxide-semiconductor (CMOS) lithography and packaging pin limits constrain packet ASIC bandwidth growth. A solution is needed that avoids the high port count (i.e., radix) requirements being placed on packet switch ASICs. One approach to solving the packet ASIC radix problem is to stack multiple ASICs into larger boxes. For example, one implementation has a 12× ASIC count increase for a corresponding 4× port increase, but this is inefficient as the port increase is 3× less than the overall ASIC count increase.
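One way to read the growth in ASIC count is through the Clos relationships Nserv = 2*(k/2)^L and Nswitch = (2L−1)*(k/2)^(L−1), whose ratio simplifies to (2L−1)/k switches per server. The minimal Python sketch below computes that ratio exactly. It assumes one packet switch ASIC per switch and an even port count k, and the function name is hypothetical.

```python
from fractions import Fraction


def switches_per_server(k: int, layers: int) -> Fraction:
    """Exact ratio Nswitch / Nserv for a folded-Clos network.

    Assumption for illustration: one packet switch ASIC per switch.
    """
    n_serv = 2 * (k // 2) ** layers
    n_switch = (2 * layers - 1) * (k // 2) ** (layers - 1)
    return Fraction(n_switch, n_serv)  # simplifies to (2L - 1) / k


if __name__ == "__main__":
    k = 32
    for layers in range(1, 6):
        print(layers, switches_per_server(k, layers))
```

Because the switches-per-server ratio itself rises with each added layer, the total switch count grows faster than the server count it supports, under the stated assumptions.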
Another previously proposed approach is to use optical switching in the data center. One example includes an architecture provided by Plexxi with highly structured optical interconnects as well as electrical switching. There are numerous challenges with this approach, including a physical topology limited to ring-type configurations, a small number of direct interconnects between nodes, scaling problems, bandwidth limitations on each network connection, centralized control, and the like. Another example includes all-optical switching, where optical switches establish direct optical connections with long persistence between predefined sets of servers or racks. This approach is useful for cases where applications understand their hardware environment and can anticipate bandwidth requirements, but it requires centralized control, which limits its utility.
Yet another conventional approach is to use optical circuit switches as a bypass to offload traffic from core packet switches. Here, electronic packet switches used in upper network layers can be made smaller. However, given the slow nature of optical circuit switching, it naturally targets large flows. Top of Rack (TOR) switches group smaller flows with many destinations into the electrical core. Larger, more persistent flows go through pre-configured optical direct connect paths. This approach requires very tight coordination between the application layer, which is aware of its upcoming bandwidth demands, and the networking layer, which provides reconfigurable bandwidth. Given the latency and granularity of optical circuit switching, this approach is only useful for very large and persistent application bandwidth demands.
As data center networks continue to grow, there is a need to rethink the network architecture to address the aforementioned limitations. The aforementioned conventional solutions all focus on a “better box” (e.g., higher port count, density, etc.) or a “better software app” (e.g., centralized control, etc.). These solutions are ineffective in addressing the large-scale growth in the data center network, and a more comprehensive approach is needed.