As the volume of Internet traffic continues to grow, telecommunications equipment providers and telecommunications service providers continue working to increase network transmission capacity between switches and switching capacity within switches. The rapid growth in network transmission capacity, however, has shifted the network bottleneck from network transmission capacity to switching capacity. Due to the need to rapidly scale switching capacity to match the fast growth of network transmission capacity, high-speed switch architectures have been intensely studied recently. The most common types of switches include shared-memory switches, output-queued switches, and input-queued switches.
A basic issue in scaling switching capacity is memory bandwidth, particularly for output-queued switches, since speed-up proportional to switch size is required at the output ports. While input-queued switches do not inherently require such speed-up, input-queued switches may have throughput bounds due to output port contention. The potential of input-queued switches for maintaining good switching performance with reduced memory bandwidth requirements (as compared to shared-memory switching architectures and output-queued switching architectures) has made input-queued switches preferred for high-speed switching applications. Disadvantageously, however, existing input-queued switches have numerous issues impacting both switch stability and switching throughput.
While input-queued switches avoid the requirement of large speed-up at the output ports, a major bottleneck in scaling input-queued switches is the scheduler needed to keep the input-queued switch stable and achieve high throughput. Several approaches have been proposed to solve this problem, including pipelining the implementation, reducing the frequency of scheduling computations, and eliminating the scheduler altogether using a two-stage load-balancing architecture. Such approaches, however, fail to keep the input-queued switch stable and achieve high throughput. Furthermore, approaches which eliminate the scheduler in favor of a two-stage load-balancing architecture require a speed-up of the switch and additional mechanisms to re-sequence out-of-order packets, and often do not have good performance with respect to packet delay.
In existing input-queued switches, a scheduler uses a scheduling algorithm which determines a matching between input ports and output ports (e.g., by solving a maximum weighted matching problem). As the number of ports and the associated line speeds continue to increase, it becomes increasingly difficult to solve the matching problem within required time frames. For example, for lines having line rates of 40 Gbps (OC768) that convey 64-byte packets, a match must be computed every 12.8 nanoseconds (the time to transmit 512 bits at 40 Gbps). Although approaches to decrease the frequency of matching computations have been proposed (e.g., decreasing frequency of matching by increasing packet size, using the same matching for multiple time frames, or pipelining the matching computation), such approaches fail to keep the input-queued switch stable and achieve high throughput.
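The 12.8 nanosecond figure follows directly from the packet size and the line rate. The short sketch below reproduces that arithmetic; the function name `matching_interval_ns` and its parameters are hypothetical, chosen only for illustration, and the calculation assumes the nominal 40 Gbps rate used in the text with no framing overhead.

```python
def matching_interval_ns(line_rate_gbps: float, packet_bytes: int) -> float:
    """Return the transmission time of one packet in nanoseconds.

    Hypothetical helper for illustration: if a new match is needed for
    every packet, this is the time available to compute that match.
    """
    packet_bits = packet_bytes * 8
    # 1 Gbps is one bit per nanosecond, so bits / Gbps yields nanoseconds.
    return packet_bits / line_rate_gbps


if __name__ == "__main__":
    # 64-byte packets on a 40 Gbps line, as in the example above.
    print(f"{matching_interval_ns(40, 64):.1f} ns")  # prints "12.8 ns"
```

Under the same assumptions, doubling the packet size doubles the available time, which is why increasing packet size is one of the proposed ways to reduce the frequency of matching computations.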