A growing array of services and applications is provided online, from search and social networks, to entertainment and streaming video, to healthcare and government systems. Each of these applications relies on enormous amounts of data processing to provide useful content to the end user, and the underlying compute and storage infrastructure needed to support them is increasingly hosted in Internet data centers. Data centers may operate at enormous scale, hosting hundreds of thousands of servers.
High cost, power demand, and complexity hinder the adoption of full bisection bandwidth topologies, such as FatTrees, in data center networks. Data center operators instead typically rely on oversubscription, which reduces network cost and power by provisioning less bisection bandwidth. The downside of oversubscription is poor application performance and poor server utilization, since servers have to wait for data to arrive over the congested network fabric. More recently, a number of researchers have proposed reconfigurable network topologies, such as switched optical pathways. Reconfigurable network topologies offer very high bisection bandwidth without requiring the several layers of network switches used in FatTrees.
The relatively low cost of reconfigurable optical network topologies makes them promising candidates for data center networks. Nevertheless, two main challenges remain for the adoption of reconfigurable optical circuits. Firstly, since reconfigurable optical circuits are inherently bufferless, data must be buffered at the source before transmission. Bufferless circuit-based networks are fundamentally different from buffered packet-switched networks: since data transmissions cannot rely on buffers along the path, the network control plane must ensure that data is ready to send along the end-to-end circuit, with buffering only at the edge of the network. This network topology can be viewed as a single crossbar interconnecting the top-of-rack (ToR) switches, except that full bisection bandwidth is not guaranteed. Specifically, due to the topology constraint, certain circuit configurations do not allow all the ToR switches to transmit at the same time.
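The crossbar view with a topology constraint can be illustrated with a minimal sketch. It assumes the constraint is given as a hypothetical map `allowed` from each source ToR to the set of destination ToRs it can reach by a circuit; the text does not specify the constraint in this form. The function name `max_simultaneous_circuits` is also illustrative. It computes a maximum bipartite matching, that is, the largest number of ToR switches that could transmit at the same time.

```python
from __future__ import annotations


def max_simultaneous_circuits(allowed: dict[int, set[int]]) -> int:
    """Return the largest number of source ToRs that can hold a circuit at once.

    `allowed[src]` is the (assumed) set of destination ToRs reachable from
    `src`. Each destination can terminate at most one circuit, so the answer
    is a maximum bipartite matching, found here with augmenting paths.
    """
    source_of_dst: dict[int, int] = {}  # destination ToR -> source ToR using it

    def try_assign(src: int, visited: set[int]) -> bool:
        for dst in allowed.get(src, ()):
            if dst in visited:
                continue
            visited.add(dst)
            # Take a free destination, or re-route the source that holds it.
            if dst not in source_of_dst or try_assign(source_of_dst[dst], visited):
                source_of_dst[dst] = src
                return True
        return False

    return sum(try_assign(src, set()) for src in allowed)


# Unconstrained crossbar over three ToRs: every ToR can transmit.
full_crossbar = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
# Hypothetical constraint: ToRs 0 and 1 can only reach ToR 2, so they compete.
constrained = {0: {2}, 1: {2}, 2: {0}}

full_count = max_simultaneous_circuits(full_crossbar)
constrained_count = max_simultaneous_circuits(constrained)
```

In the constrained example only two of the three ToR switches can transmit simultaneously, which is the kind of loss of full bisection bandwidth described above.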
Secondly, candidate optical circuit switching technologies (such as “binary MEMS” mirror arrays) typically exhibit a reconfiguration delay when the circuit configuration is changed. During this delay, data cannot flow through the switch, and for practical circuit switch technologies the delay is significantly longer than the link-layer interframe gap. For example, the reconfiguration delay for state-of-the-art binary MEMS is 2-20 μs, which is roughly 200 to 2,000 times larger than the interframe gap of 9.6 ns. This nonzero reconfiguration delay motivates the need for scheduling policies that account for it.
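To make the scale of the reconfiguration delay concrete, the following sketch works through the arithmetic. The delay range (2-20 μs) and the interframe gap (9.6 ns) come from the text; the circuit hold time and the simple duty-cycle model (the circuit carries data for a hold period, then is dark for one reconfiguration delay) are assumptions introduced for illustration only, and the function names are hypothetical.

```python
INTERFRAME_GAP_NS = 9.6  # link-layer interframe gap quoted in the text


def delay_to_gap_ratio(reconfig_delay_us: float) -> float:
    """How many interframe gaps fit into one reconfiguration delay."""
    return reconfig_delay_us * 1000.0 / INTERFRAME_GAP_NS


def duty_cycle(hold_time_us: float, reconfig_delay_us: float) -> float:
    """Fraction of time the circuit can carry data.

    Assumed model: each configuration is held for `hold_time_us`, followed
    by `reconfig_delay_us` during which no data flows through the switch.
    """
    if hold_time_us < 0 or reconfig_delay_us < 0:
        raise ValueError("times must be non-negative")
    period = hold_time_us + reconfig_delay_us
    if period == 0:
        raise ValueError("hold time and delay cannot both be zero")
    return hold_time_us / period


# Ratio at both ends of the quoted 2-20 microsecond range.
ratio_low = delay_to_gap_ratio(2.0)    # about 208
ratio_high = delay_to_gap_ratio(20.0)  # about 2083

# With a 20 microsecond delay, an (assumed) 180 microsecond hold time keeps
# the link usable 90% of the time; a 20 microsecond hold time only 50%.
long_hold = duty_cycle(180.0, 20.0)
short_hold = duty_cycle(20.0, 20.0)
```

Under this model, a scheduler that reconfigures too often spends a large share of time with the circuit dark, which is why the delay has to be part of the scheduling decision.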