§ 1.1 Field of the Invention
The present invention concerns the communication of data over networks, such as the Internet. More specifically, the present invention concerns scheduling the servicing (e.g., dispatching) of cells or packets buffered at input ports of a switch.
§ 1.2 Related Art
Switches and routers are used in networks, such as the Internet, to forward data towards its destination. The need for high-speed switches and routers is introduced in § 1.2.1 below. Then, input buffering, as used in high-speed switches, is introduced in § 1.2.2 below.
§ 1.2.1 The Need for Large-Scale and High-Speed (e.g., Terabit) Routers and Switches
Many expect that Internet traffic will continue to grow explosively. Given this assumption, it is expected that high-speed switches and routers (e.g., those having a throughput over one Terabit per second) will become necessary. Most high-speed packet switches adopt a fixed-size cell in the switch fabric. If variable length packets are to be supported in the network, such packets may be segmented and/or padded into fixed-sized cells upon arrival, switched through the fabric of the switch, and reassembled into packets before departure. Input buffering is introduced below in § 1.2.2 as a way to handle these incoming cells.
§ 1.2.2 Buffering in High-Speed Routers and Switches
There are various types of buffering strategies in switch architectures: input buffering, output buffering, and crosspoint buffering. Information on these strategies can be found in the following articles: G. Nong and M. Hamdi, “On the Provision of Quality-of-Service Guarantees for Input Queued Switches,” IEEE Commun. Mag., Vol. 38, No. 12, pp. 62–69 (Dec. 2000); E. Oki, N. Yamanaka, Y. Ohtomo, K. Okazaki, and R. Kawano, “A 10-Gb/s (1.25 Gb/s×8) 4×2 0.25-micrometer CMOS/SIMOX ATM Switch Based on Scalable Distribution Arbitration,” IEEE J. Solid-State Circuits, Vol. 34, No. 12, pp. 1921–1934 (Dec. 1999); and J. Turner and N. Yamanaka, “Architectural Choices in Large Scale ATM Switches,” IEICE Trans. Commun., Vol. E81-B, No. 2, pp. 120–137 (Feb. 1998). Each of these articles is incorporated herein by reference. Input buffering is a cost-effective approach for high-speed switches. This is because input-buffered switches do not require internal speedup, nor do they allocate buffers at each crosspoint. They also relax memory-bandwidth and memory-size constraints.
§ 1.2.2.1 The Use of Virtual Output Queues to Avoid Head-of-Line Blocking
It is well known that head-of-line (“HOL”) blocking limits the maximum throughput (e.g., to 58.6%) in an input-buffered switch with a First-In-First-Out (FIFO) structure. See, e.g., the article, M. J. Karol, M. G. Hluchyj, and S. P. Morgan, “Input Versus Output Queuing on a Space-Division Packet Switch,” IEEE Trans. Commun., Vol. COM-35, pp. 1347–1356 (1987). This article is incorporated herein by reference. The article, N. McKeown, “The iSLIP Scheduling Algorithm for Input-Queued Switches,” IEEE/ACM Trans. Networking, Vol. 7, No. 2, pp. 188–200 (April 1999), shows how a Virtual-Output-Queue (VOQ) structure can be used to overcome HOL blocking. This article is incorporated herein by reference.
In an input-buffered switch that uses VOQs, a fixed-size cell is sent from any input to any output, provided that, in a given time slot, no more than one cell is sent from the same input, and no more than one cell is received by the same output. Each input port has N VOQs, one for each of N output ports. The HOL cell in each VOQ can be selected for transmission across the switch in each time slot. Therefore, in every time slot, a scheduler has to determine a matching between inputs and outputs. That is, for each of the output ports, the scheduler may match one of the corresponding VOQs with the output port.
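For illustration only, the VOQ structure and the per-time-slot matching constraints described above may be sketched as follows. The names InputPort, is_valid_matching, and dispatch are hypothetical and are not taken from any of the cited articles; the sketch assumes cells are opaque values and that ports are numbered from 0.

```python
from collections import deque


class InputPort:
    """One input port holding N virtual output queues (VOQs), one per output."""

    def __init__(self, num_outputs):
        self.voqs = [deque() for _ in range(num_outputs)]

    def enqueue(self, cell, output):
        """Buffer an arriving cell in the VOQ for its destination output."""
        self.voqs[output].append(cell)

    def requests(self):
        """Outputs for which this input has a head-of-line cell waiting."""
        return {j for j, queue in enumerate(self.voqs) if queue}


def is_valid_matching(matching, ports):
    """Check a list of (input, output) pairs chosen for one time slot."""
    inputs = [i for i, _ in matching]
    outputs = [j for _, j in matching]
    no_input_twice = len(inputs) == len(set(inputs))      # one cell per input
    no_output_twice = len(outputs) == len(set(outputs))   # one cell per output
    all_backed = all(j in ports[i].requests() for i, j in matching)
    return no_input_twice and no_output_twice and all_backed


def dispatch(matching, ports):
    """Send the HOL cell of each matched VOQ; returns {output: cell}."""
    return {j: ports[i].voqs[j].popleft() for i, j in matching}
```

In this sketch, a scheduler would produce the matching argument once per time slot; how that matching is computed is the subject of the algorithms discussed below.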
§ 1.2.2.2 Maximum-Sized and Maximal-Sized Matching Algorithms in High Speed Switches
Maximum-sized matching algorithms that schedule the input-output matching for input-buffered switches with VOQs, and that achieve 100% throughput, have been proposed. See, e.g., the articles: J. E. Hopcroft and R. M. Karp, “An Algorithm for Maximum Matching in Bipartite Graphs,” Soc. Ind. Appl. Math J. Computation, Vol. 2, pp. 225–231 (1973); and N. McKeown, A. Mekkittikul, V. Anantharam, and J. Walrand, “Achieving 100% Throughput in Input-Queued Switches,” IEEE Trans. Commun., Vol. 47, No. 8, pp. 1260–1267 (August 1999). These articles are incorporated herein by reference. Unfortunately, these algorithms are hard to implement in high-speed switches because of their high computational time complexity.
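As a rough illustration of what a maximum-sized matching computes, the following sketch uses a plain augmenting-path search. This is a simpler and slower technique than the Hopcroft-Karp algorithm cited above, and it is not the scheduler of any cited article. The function name maximum_matching and the representation of requests as one set of requested outputs per input are assumptions made for the example.

```python
def maximum_matching(requests, num_outputs):
    """Maximum-size bipartite matching by repeated augmenting-path search.

    requests[i] is the set of outputs for which input i has a queued cell.
    Returns a dict {input: output}.
    """
    match_of_output = [None] * num_outputs  # input currently matched to each output

    def try_assign(i, visited):
        # Try to give input i an output, re-routing earlier inputs if needed.
        for j in sorted(requests[i]):
            if j in visited:
                continue
            visited.add(j)
            holder = match_of_output[j]
            if holder is None or try_assign(holder, visited):
                match_of_output[j] = i
                return True
        return False

    for i in range(len(requests)):
        try_assign(i, set())

    return {i: j for j, i in enumerate(match_of_output) if i is not None}
```

For example, with requests [{0, 1}, {0}, {1, 2}], the search matches all three inputs, whereas a single first-fit pass that gave output 0 to input 0 would leave input 1 unmatched. The repeated re-routing is what makes this class of algorithm costly to run within a short time slot.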
Maximal-sized matching algorithms have been proposed as an alternative to the maximum-sized matching ones. Two of these algorithms, iSLIP and Dual Round-Robin Matching (DRRM), are described in the articles: N. McKeown, “The iSLIP Scheduling Algorithm for Input-Queued Switches,” IEEE/ACM Trans. Networking, Vol. 7, No. 2, pp. 188–200 (April 1999); H. J. Chao and J. S. Park, “Centralized Contention Resolution Schemes for a Large-Capacity Optical ATM Switch,” Proc. IEEE ATM Workshop '97, Fairfax, Va. (May 1998); and H. J. Chao, “Saturn: A Terabit Packet Switch Using Dual Round-Robin,” IEEE Commun. Mag., Vol. 38, No. 12, pp. 78–84 (December 2000). These articles are incorporated herein by reference. The computational complexity of the iSLIP and DRRM methods is lower than that of maximum matching methods. Moreover, the iSLIP and DRRM methods provide 100% throughput under uniform traffic and complete fairness for best-effort traffic. However, in each of these methods, the maximal matching is to be completed within one cell time slot. Such a constraint may become unacceptable as the switch size increases and/or the port speed becomes high, because the arbitration time becomes longer than one time slot or the time slot shrinks, respectively. For example, for a 64-byte fixed-length cell at a port speed of 40 Gbit/s (OC-768), the time available for completing maximal-sized matching is only 12.8 ns. Existing proposals for relaxing the time constraints are discussed below in § 1.2.2.3.
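The 12.8 ns figure follows directly from the cell size and the port speed, as the short calculation below illustrates. The function name cell_time_ns is hypothetical, and the sketch assumes the nominal port speed with no framing overhead.

```python
def cell_time_ns(cell_bytes, port_speed_gbps):
    """Duration of one cell time slot, in nanoseconds.

    Bits divided by Gbit/s gives nanoseconds, since 1 Gbit/s is 1 bit per ns.
    """
    return cell_bytes * 8 / port_speed_gbps


# 64-byte cell at 40 Gbit/s: 512 bits / 40 bits-per-ns
slot_at_40g = cell_time_ns(64, 40)

# The same cell at a quarter of the port speed leaves four times the budget.
slot_at_10g = cell_time_ns(64, 10)
```

Under these assumptions, each fourfold increase in port speed cuts the arbitration budget to a quarter, which is why a one-slot deadline becomes harder to meet as ports get faster.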
§ 1.2.2.3 Round-Robin Greedy Scheduling (RRGS)
To relax the scheduling timing constraint, a pipeline-based scheduling algorithm called Round-Robin Greedy Scheduling (RRGS) is proposed in the article: A. Smiljanic, R. Fan, and G. Ramamurthy, “RRGS—Round-Robin Greedy Scheduling for Electronic/Optical Terabit Switches,” Proc. IEEE Globecom '99, pp. 1244–1250 (1999). This article is incorporated herein by reference. With RRGS, each input has only to perform one round-robin arbitration within one time slot to select one VOQ. Specifically, if a given switch has N inputs, then the N input round-robin operations (which select the cells to be transmitted at a given time slot T) are allocated to the N different previous time slots {T−N, T−N+1, . . . , T−1} in a simple cyclic manner so that RRGS can avoid output contention.
Unfortunately, RRGS cannot provide max-min fair share for a best-effort service. For example, let λ(i,j) be the input offered load to VOQ(i,j) and let μ(i,j) be the acceptable transmission rate from VOQ(i,j). Consider a 3×3 switch in which λ(0,0)=λ(1,0)=1.0 and in which other input offered loads λ(i,j)=0. According to the RRGS algorithm, the acceptable transmission rate is obtained as μ(0,0)=⅔ and μ(1,0)=⅓. Thus, when traffic is not balanced, some inputs can unfairly send more cells than others. The article, A. Smiljanic, “Flexible Bandwidth Allocation in Terabit Packet Switches,” Proc. IEEE Workshop on High Performance Switching and Routing 2000, pp. 233–239 (2000) proposes weighted-RRGS (“WRRGS”), which guarantees pre-reserved bandwidth. This article is incorporated herein by reference. However, even with WRRGS, fairness is not guaranteed for best-effort traffic. In addition, once every N time-slot cycles, an idle time slot is produced when N is an even number. This means that RRGS does not maximize its use of the switching capacity.
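The unequal rates in the 3×3 example can be reproduced with a deliberately simplified model, sketched below. The model assumes that, for each transmission time slot, the inputs choose outputs one after another in cyclic order, with the first chooser rotating from slot to slot. It collapses the pipelining across N time slots into a single loop and replaces the per-input round-robin choice among outputs with a lowest-numbered-output choice, so it is not the RRGS implementation of the cited article. The function name greedy_round_robin_rates is hypothetical.

```python
from fractions import Fraction


def greedy_round_robin_rates(backlogged, num_ports, num_slots):
    """Simplified greedy scheduling with a rotating first chooser.

    backlogged: set of (input, output) VOQs assumed to always hold a cell.
    Returns {(input, output): fraction of slots in which that VOQ was served}.
    """
    served = {voq: 0 for voq in backlogged}
    for t in range(num_slots):
        free_outputs = set(range(num_ports))
        start = t % num_ports  # the first input to choose rotates every slot
        for k in range(num_ports):
            i = (start + k) % num_ports
            # Simplification: take the lowest-numbered free output wanted by input i.
            wanted = sorted(j for (src, j) in backlogged
                            if src == i and j in free_outputs)
            if wanted:
                free_outputs.remove(wanted[0])
                served[(i, wanted[0])] += 1
    return {voq: Fraction(n, num_slots) for voq, n in served.items()}


rates = greedy_round_robin_rates({(0, 0), (1, 0)}, num_ports=3, num_slots=300)
# rates[(0, 0)] == Fraction(2, 3) and rates[(1, 0)] == Fraction(1, 3)
```

In this model, input 0 wins output 0 both when it chooses first and when the idle input 2 chooses first, while input 1 wins only when it chooses first. A max-min fair allocation would instead give each of the two contending VOQs one half of the capacity of output 0.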
§ 1.2.3 Unmet Needs
In view of the foregoing limits of proposed maximal matching scheduling schemes, a scheduling scheme that (i) relaxes the scheduling time constraint to more than one time slot, (ii) provides high throughput, and/or (iii) maintains fairness for best-effort traffic, is still desired.