§3.1 Field of the Invention
The present invention concerns switches used in data communications networks. In particular, the present invention concerns schedulers used in buffered crosspoint switches.
§3.2 Background Information
With the growth in Internet traffic, there is an increasing interest in designing high-performance packet switches. Due to memory speed constraints, input queueing, or combined input and output queueing (CIOQ), is used with bufferless crossbar switching fabrics. With input queueing, each input port maintains a separate queue for each output; these queues are known as virtual output queues (VOQs). VOQs are used to avoid head-of-line (HOL) blocking. A bufferless crossbar switching fabric is used to transfer cells from inputs to outputs. However, such switches usually require complex scheduling algorithms to achieve good performance, such as maximum weight matching, maximal and maximum size matching, or iterative schedulers. While some schedulers have lower complexity (e.g., O(log N), where N is the number of ports in the switch), they still suffer from delays that grow with N.
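By way of illustration only, the VOQ arrangement described above can be sketched as follows. This is a minimal sketch, not part of any cited scheduler; the class name InputPort and its methods are hypothetical and chosen only for this example. It shows that a cell destined for a free output remains eligible even when an earlier-arriving cell is waiting for a busy output, which is the HOL blocking a single FIFO would suffer.

```python
from collections import deque


class InputPort:
    """Hypothetical input port holding one FIFO (a VOQ) per output."""

    def __init__(self, num_outputs):
        self.voqs = [deque() for _ in range(num_outputs)]

    def enqueue(self, cell, output):
        # Cells are sorted by destination on arrival.
        self.voqs[output].append(cell)

    def dequeue(self, output):
        # Return the oldest cell for this output, or None if the VOQ is empty.
        return self.voqs[output].popleft() if self.voqs[output] else None

    def eligible_outputs(self, free_outputs):
        # Outputs that are free and for which this input has a cell waiting.
        return [j for j in free_outputs if self.voqs[j]]


port = InputPort(2)
port.enqueue("a", 0)  # arrives first, destined for output 0
port.enqueue("b", 1)  # arrives second, destined for output 1

# Suppose output 0 is busy and only output 1 is free. With a single FIFO,
# cell "a" would block cell "b"; with VOQs, "b" can still be served.
eligible = port.eligible_outputs([1])
served = port.dequeue(1)
```

In this sketch, eligible holds output 1 and served is cell "b", while cell "a" stays queued for output 0.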
To provide good performance, while addressing the complexity issue of scheduling algorithms, one approach is to add limited buffers inside the crossbar switch fabric. With present application-specific integrated circuit (“ASIC”) technology, a large amount of memory can be implemented in a single chip. This makes buffered crossbar switches an attractive solution compared to the traditional input-queued switch because of their potentially simpler scheduling algorithms and better delay performance.
A scalable buffered crossbar switch architecture can leverage the opportunities offered by Proximity Communication. (See, e.g., R. Drost, R. D. Hopkins, R. Ho, and I. Sutherland, “Proximity Communication,” IEEE Journal of Solid-State Circuits, vol. 39, no. 9 (September 2004). (This article is incorporated herein by reference.)) Conventionally, large switch fabrics with hundreds of ports could be constructed only by connecting switch chips in a hierarchical way using a multi-stage topology. With Proximity Communication, there is enough chip-to-chip bandwidth available such that large switch fabrics can be built with a single-stage topology. This is done simply by dividing a large crossbar into several smaller crossbars which are then “stitched together” through Proximity Communication. A single-stage switch offers many advantages over a multi-stage switch. Therefore, it is desirable to design a scheduler that can scale up to a large number of ports and make scheduling decisions in the short time given by a high-speed cell-based switch, while achieving 100% throughput at the same time.
With a speedup of two (2), the authors in S.-T. Chuang, S. Iyer, and N. McKeown, “Practical Algorithms for Performance Guarantees in Buffered Crossbars,” Proceedings of IEEE Infocom, Miami, Fla. (March 2005) (incorporated herein by reference) showed that a buffered crossbar can provide guaranteed performance (throughput, rate, delay). In the paper, J. Turner, “Strong Performance Guarantees for Asynchronous Crossbar Schedulers,” Proceedings of IEEE Infocom, Spain (April 2006) (incorporated herein by reference), the results are extended to variable size packets. The author in M. Berger, “Delivering 100% Throughput in a Buffered Crossbar with Round Robin Scheduling,” Proceedings of IEEE High Performance Switch and Routing, Poznan, Poland (2006) (incorporated herein by reference) proved that the speedup requirement can be reduced to 2−1/N. However, without speedup, the throughput results are limited to uniform traffic loads. Under uniform traffic, it has been shown that a simple round-robin scheduler can provide 100% throughput. (See, e.g., T. Javidi, R. Magill, and T. Hrabik, “A High Throughput Scheduling Algorithm for a Buffered Crossbar Switch Fabric,” Proceedings of IEEE International Conference on Communications, (2001) (incorporated herein by reference).) In the paper, R. Rojas-Cessa, E. Oki, and H. J. Chao, “On the Combined Input-Crosspoint Buffered Packet Switch with Round-Robin Arbitration,” IEEE Transactions on Communications, vol. 53, no. 11, pp. 1945-1951 (November 2005) (incorporated herein by reference), the authors proved that longest-queue-first arbitration at the input port combined with round-robin arbitration at the output port (LQF-RR) guarantees 100% throughput under uniform traffic.
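For illustration, one time slot of an LQF-RR style arbitration on a buffered crossbar might be sketched as follows. This is a simplified sketch under stated assumptions and is not taken from the cited paper: each crosspoint buffer is assumed to hold one cell, the input phase is assumed to run before the output phase within a slot, ties between equally long VOQs go to the lowest-numbered output, and the function name lqf_rr_slot and its arguments are hypothetical.

```python
def lqf_rr_slot(voq_len, xp_full, rr_ptr):
    """Advance one time slot on an N x N buffered crossbar (illustrative).

    voq_len[i][j]: number of cells waiting at input i for output j
    xp_full[i][j]: True if the one-cell crosspoint buffer (i, j) is occupied
    rr_ptr[j]:     round-robin pointer of output j
    All three arguments are updated in place. Returns the list of
    (input, output) pairs whose cells reach the outputs in this slot.
    """
    n = len(voq_len)

    # Input arbitration (longest queue first): each input forwards one cell
    # from its longest VOQ whose crosspoint buffer has room.
    for i in range(n):
        candidates = [j for j in range(n)
                      if voq_len[i][j] > 0 and not xp_full[i][j]]
        if candidates:
            j = max(candidates, key=lambda k: voq_len[i][k])
            voq_len[i][j] -= 1
            xp_full[i][j] = True

    # Output arbitration (round robin): each output serves the first occupied
    # crosspoint buffer at or after its pointer, then moves the pointer past it.
    delivered = []
    for j in range(n):
        for step in range(n):
            i = (rr_ptr[j] + step) % n
            if xp_full[i][j]:
                xp_full[i][j] = False
                rr_ptr[j] = (i + 1) % n
                delivered.append((i, j))
                break
    return delivered


voq_len = [[3, 1], [0, 2]]
xp_full = [[False, False], [False, False]]
rr_ptr = [0, 0]
delivered = lqf_rr_slot(voq_len, xp_full, rr_ptr)
```

Note that each input and each output decides independently using only local state, which is the property that makes such schedulers simpler than the centralized matching required by a bufferless crossbar.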
In the paper, P. Giaccone, E. Leonardi, and D. Shah, “On the Maximal Throughput of Networks with Finite Buffers and its Application to Buffered Crossbars,” Proceedings of IEEE Infocom, Miami, Fla. (March 2005) (incorporated herein by reference), the authors proposed a distributed scheduling algorithm and derived a relationship between throughput and the size of crosspoint buffers. Unfortunately, to achieve 100% throughput, the switch described needed an infinite buffer. With current state-of-the-art technology, the total amount of buffering that can be built on chip is limited. For a switch with a large number of ports, the buffer size associated with each input-output pair should be kept small.
In the paper, L. Tassiulas, “Linear Complexity Algorithms for Maximum Throughput in Radio Networks and Input Queued Switches,” Proceedings of IEEE INFOCOM 1998, vol. 2, pp. 533-539, New York, N.Y. (1998) (incorporated herein by reference), the author studied randomized algorithms that achieve 100% throughput for an input-queued switch. The approach works as follows. In each time slot, a feasible solution to the maximum weight matching problem is obtained. If the value of the new solution is higher than the value of the current solution, the latter is replaced. This approach is guaranteed to achieve 100% throughput provided that the probability that the new solution is equal to the maximum weight matching is strictly greater than zero. A de-randomized version of this algorithm was proposed in the paper, P. Giaccone, B. Prabhakar, and D. Shah, “Toward Simple, High Performance Schedulers for High-Aggregate Bandwidth Switches,” Proceedings of IEEE INFOCOM, New York, N.Y. (2002) (incorporated herein by reference), where a Hamiltonian walk is applied instead of randomly generating a new schedule. However, such approaches introduce a large delay. Several approaches have been proposed for input-queued switches to reduce the delay. (See, e.g., P. Giaccone, B. Prabhakar, and D. Shah, “Toward Simple, High Performance Schedulers for High-Aggregate Bandwidth Switches,” Proceedings of IEEE INFOCOM, New York, N.Y. (2002) (incorporated herein by reference); and Y. Li, S. S. Panwar, and H. J. Chao, “Exhaustive Service Matching Algorithms for Input Queued Switches,” Proceedings of IEEE Workshop on High Performance Switching and Routing (2004) (incorporated herein by reference).)
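The keep-the-better-schedule step of the randomized approach described above can be sketched as follows. This is an illustrative sketch only; it assumes that a matching is represented as a permutation (input i is connected to output perm[i]), that the weight of a matching is the sum of the corresponding queue lengths, and that the candidate is drawn uniformly at random. The function names are hypothetical.

```python
import random


def matching_weight(perm, queue_len):
    """Weight of a matching: total queue length over the matched pairs."""
    return sum(queue_len[i][perm[i]] for i in range(len(perm)))


def next_schedule(current, queue_len, rng=random):
    """Pick the schedule for the next time slot (illustrative).

    A random candidate matching is generated and replaces the current
    schedule only if its weight, measured against the present queue
    lengths, is strictly higher.
    """
    n = len(queue_len)
    candidate = list(range(n))
    rng.shuffle(candidate)  # uniformly random permutation
    if matching_weight(candidate, queue_len) > matching_weight(current, queue_len):
        return candidate
    return current


queue_len = [[5, 0], [0, 5]]
best = [0, 1]   # weight 10, the maximum for these queue lengths
worst = [1, 0]  # weight 0
kept = next_schedule(best, queue_len)
improved = next_schedule(worst, queue_len)
```

Under the uniform-draw assumption, every permutation, including a maximum weight matching, is generated with probability 1/N!, which is strictly greater than zero as the stated condition requires. In the example, kept always equals best, while improved is either the unchanged schedule or the heavier one, depending on the draw.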
In view of the foregoing, improved scheduling techniques would be useful, especially for large scale switches. It would be useful if such techniques had acceptable delay, throughput and hardware attributes.