The present invention relates to packet communication systems, and in particular to a method and an apparatus for scheduling packets in packet networks for guaranteeing data transfer rates to data sources and data transfer delays from data sources to destinations using a plurality of queues, each of the queues serving data connections with the same guaranteed data transfer rate, and only computing and sorting a single timestamp for each of the queues. This invention can be used in any system for data packet forwarding such as Asynchronous Transfer Mode (ATM) switches and Internet Protocol (IP) routers.
Per-Virtual-Connection (Per-VC) schedulers are known which aim to approximate a Generalized Processor Sharing policy, as described in A. K. Parekh and R. G. Gallager, “A Generalized Processor Sharing Approach to Flow Control in Integrated Services Networks: The Single-Node Case”, IEEE/ACM TRANSACTIONS ON NETWORKING, June 1993, pp. 344-357, which is incorporated herein by reference. As defined herein, the term “VC” is used throughout to mean “virtual connection”. It is understood that virtual connections may also include virtual circuits and Internet Protocol (IP) flows. Implementation of such Per-VC schedulers is a central issue in next-generation switching systems. In a market arena in which cost targets are precipitously dropping, an important objective is to minimize the complexity involved in Per-VC schedulers, and to minimize the cost differential with respect to switches using less sophisticated scheduling.
As defined herein and throughout, the term “GPS” is an abbreviation for the Generalized Processor Sharing policy, as described in A. K. Parekh et al., supra. GPS-related packet-scheduling disciplines are based on maintaining a global function, referred to by different authors either as a virtual time, such as in A. K. Parekh et al., supra, and in S. J. Golestani, “A Self-Clocked Fair Queuing Scheme for Broadband Applications”, PROCEEDINGS OF INFOCOM '94, April 1994, pp. 636-646, which is incorporated herein by reference; or as a system potential, such as described in D. Stiliadis and A. Varma, “Design and Analysis of Frame-based Fair Queuing: A New Traffic Scheduling Algorithm for Packet-Switched Networks”, PROCEEDINGS OF SIGMETRICS '96, May 1996, pp. 104-115; and D. Stiliadis and A. Varma, “Efficient Fair Queuing Algorithms for ATM and Packet Networks”, TECHNICAL REPORT UCSC-CRL-95-59, December 1995, with each of these references being incorporated herein by reference.
The global function tracks the amount of work that is done by the server to process packets in the communication system. The server uses this global function to compute, for each packet in the system, a timestamp that specifies when the packet should be transmitted relative to other packets. Packets are transmitted by increasing order of their timestamps. The specific function used as system potential determines the delay and fairness properties of each algorithm in the class.
The total implementation cost of these GPS-related scheduling algorithms is the combination of three factors: (i) the complexity of the function used as system potential to compute the timestamps for the packets in the system, (ii) the complexity involved in sorting the timestamps in order to select the packet with a minimum timestamp for transmission, and (iii) the cost of handling and storing the timestamps. In recent years, several scheduling algorithms which use a system-potential function of order O(1) complexity have been introduced. Examples of such algorithms include Self-Clocked Fair Queuing (SCFQ), as described in S. J. Golestani, supra; Frame-based Fair Queuing (FFQ), as described in D. Stiliadis et al., “Design and Analysis of Frame-based Fair Queuing . . . ”, supra; Virtual Clock, as described in L. Zhang, “Virtual Clock: A New Traffic Control Algorithm for Packet Switching”, ACM TRANSACTIONS ON COMPUTER SYSTEMS, May 1991, pp. 101-124; and Minimum-Delay Self-Clocked Fair Queuing (MD-SCFQ), described in F. M. Chiussi and A. Francini, “Minimum-Delay Self-Clocked Fair Queuing Algorithm for Packet-Switched Networks”, PROCEEDINGS OF INFOCOM '98, March 1998, each of which is incorporated herein by reference.
In particular, among these algorithms, MD-SCFQ has both optimal delay properties and excellent fairness properties. Scheduling algorithms achieving a desired performance with a system-potential function of minimal complexity are therefore available, but the total implementation cost of the scheduler is still dominated by the complexity of sorting and storing the timestamps.
One well-known simplification in timestamp processing by a scheduler is obtained by assigning increasing values of timestamps to consecutive packets which belong to the same session, so that for each session only the timestamp of the packet at the head of the corresponding packet queue is to be considered and processed in the packet selection process. Such a timestamp is referred to as session timestamp. The number of session timestamps which have to be sorted is therefore equal to the number of sessions V supported by the scheduler. For example, typical values of V in current ATM switches, in which sessions are referred to as VCs, are in the order of tens of thousands of sessions. The range of possible values that the timestamps can assume at any given time depends on the ratio between the maximum and minimum service rates that the scheduler is required to provide to the connections. Such a timestamp range is typically very wide.
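By way of illustration only, the session-timestamp simplification described above can be sketched as follows. The sketch assumes an SCFQ-style system potential (the timestamp of the packet most recently selected for service) and the commonly used tag rule F = max(F_prev, P) + L/r; the class and method names are hypothetical and are not taken from any cited reference. The sorted structure holds at most one entry per backlogged session, rather than one per packet.

```python
import heapq
from collections import deque


class SessionTimestampScheduler:
    """Toy scheduler that sorts one timestamp per backlogged session.

    Assumption: SCFQ-style potential, i.e. the potential is the timestamp
    of the packet most recently selected for service.
    """

    def __init__(self, rates):
        self.rates = rates                           # session id -> guaranteed rate
        self.queues = {s: deque() for s in rates}    # per-session packet FIFOs
        self.last_finish = {s: 0.0 for s in rates}   # last timestamp issued per session
        self.potential = 0.0                         # system potential
        self.heap = []                               # (session timestamp, session id)

    def enqueue(self, session, length):
        # Consecutive packets of a session receive increasing timestamps.
        start = max(self.last_finish[session], self.potential)
        finish = start + length / self.rates[session]
        self.last_finish[session] = finish
        was_idle = not self.queues[session]
        self.queues[session].append(finish)
        if was_idle:
            # Only the head-of-queue timestamp enters the sorted structure.
            heapq.heappush(self.heap, (finish, session))

    def dequeue(self):
        if not self.heap:
            return None
        finish, session = heapq.heappop(self.heap)
        self.queues[session].popleft()
        self.potential = finish
        if self.queues[session]:
            # The next packet of the session becomes its new head.
            heapq.heappush(self.heap, (self.queues[session][0], session))
        return session, finish
```

In this sketch the heap never grows beyond the number of sessions V, which is the quantity the paragraph above identifies as the size of the sorting problem.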
In view of the complexity in sorting a large number of timestamps over a wide range of possible values at the high speeds employed in broadband digital networks, hardware implementations of packet-switching systems are only affordable by data structures and processor configurations that are specifically devised to be efficiently mapped into silicon on integrated circuits or chips. Even with such specialized structures, the implementation cost may still be too high, and techniques to further reduce complexity are necessary. Different approaches are possible for this purpose. In some cases, the specific properties of a scheduler can help in simplifying the selection process.
Several techniques have been proposed to reduce the cost of the sorting operation. In particular, two approaches are the Logarithmic Calendar Queue (LCQ) introduced in F. M. Chiussi, A. Francini and J. G. Kneuer, “Implementing Fair Queuing in ATM Switches - Part 2: The Logarithmic Calendar Queue”, PROCEEDINGS OF GLOBECOM '97, November 1997, pp. 519-525; as well as the discrete-rate scheduler presented in J. C. R. Bennett, D. C. Stephens and H. Zhang, “High Speed, Scalable, and Accurate Implementation of Fair Queuing Algorithms in ATM Networks”, PROCEEDINGS OF ICNP '97, October 1997, pp. 7-14, each of which is incorporated herein by reference. These are arguably the two approaches that achieve the highest reduction in the hardware complexity of a GPS-related scheduler with optimal delay properties. In addition, such approaches introduce only a very small degradation in the delay bounds of the scheduler.
The LCQ is an optimized calendar queue which reduces the complexity by increasing, in an optimal manner, the granularity of the bins used to sort the timestamps, so that the relative degradation in delay bounds for each connection is equalized.
The discrete-rate scheduler is a relatively simple structure that can be used when the guaranteed service rates that the scheduler needs to support at any given time only belong to a relatively small set of discrete values. Such operating conditions are certainly realistic in most, if not all, ATM switches. As shown in FIG. 1, the illustrated discrete-rate scheduler 10 is a per-connection-timestamp scheduler having a corresponding timestamp for each of the sessions; for example, the sessions 14-16 in FIG. 1. Each of the sessions 14-16 has a corresponding timestamp 20-26, respectively.
Other advantages are known for using a discrete set of rates. In this case, connections with the same service rate are grouped together in common rate First-In-First-Out (FIFO) queues, and scheduling is performed only among the connections at the head of each rate FIFO queue. Accordingly, the per-connection timestamp scheduler 10 in FIG. 1 has a plurality of registers for storing pointers as heads 12 and tails 18 for maintaining the number N of rate FIFO queues 28, with the sessions 14-16 in a given queue having the same rate from among rates r1 . . . rN. Thus, the registers 12 and 18 in Rate FIFO Queue 1 are associated with a common rate r1, with a first head 12 labeled HEAD(1) as a head pointer pointing to a session in the queue and having an associated timestamp 20 labeled FHEAD(1); a first set of sessions 14-16 labeled VC1,A and VC1,B, respectively, and having associated timestamps 22, 24 labeled F1,A and F1,B, respectively; and a first tail TAIL(1) 18 as a tail pointer to a session in the queue and having an associated timestamp 26 labeled FTAIL(1). As described above, scheduling is performed by processing the sessions pointed to at the heads of the queues, with such sessions being processed by a smallest-eligible-virtual-finishing-time-first (SEFF) selector 30 to determine a minimum eligible timestamp for service from among the sessions pointed to by the heads of the queues 28.
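As a rough sketch of the selection step described above, the following fragment models each rate FIFO queue as a queue of sessions carrying their own timestamps, and picks the head session with the smallest eligible timestamp. The field names `start` and `finish`, the function name `seff_select`, and the eligibility test (virtual starting time not greater than the system potential) are illustrative assumptions; how the system potential is maintained is outside the scope of the sketch.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Session:
    name: str
    start: float    # virtual starting time of the head packet (assumed field)
    finish: float   # per-connection timestamp (virtual finishing time)


def seff_select(rate_queues, potential):
    """Return the index of the rate FIFO queue to serve, or None.

    Only the session at the head of each queue is examined. A head is
    eligible when its start time does not exceed the system potential;
    among eligible heads, the smallest finish timestamp wins.
    """
    best = None
    for index, queue in enumerate(rate_queues):
        if not queue:
            continue  # empty rate queue, nothing to schedule
        head = queue[0]
        if head.start > potential:
            continue  # not yet eligible
        if best is None or head.finish < rate_queues[best][0].finish:
            best = index
    return best
```

The selector compares at most N timestamps, one per rate FIFO queue, regardless of how many sessions are queued behind each head.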
The implemented scheduler may have certain properties for the maximum distance between timestamps of different connections having the same rate, and for the relation between system potential and timestamps, as is the case for the worst-case-fair weighted fair queuing system (WF²Q, or alternatively WF²Q+) described in J. C. R. Bennett and H. Zhang, “Hierarchical Packet Fair Queuing Algorithms”, PROCEEDINGS OF SIGCOMM '96, August 1996, pp. 143-156; and other worst-case-fair schedulers, such as described in D. Stiliadis and A. Varma, “A General Methodology for Designing Efficient Traffic Scheduling and Shaping Algorithms”, PROCEEDINGS OF INFOCOM '97, April 1997, with each of these articles being incorporated herein by reference.
By grouping together connections with the same service rate in common rate FIFO queues, the number of timestamps to be sorted is greatly reduced, for example, to be equal to the number of supported rates, and therefore the complexity of the sorting task is considerably decreased.
Although they are important improvements in reducing the implementation complexity of GPS-related schedulers with near-optimal delay bounds, known approaches still require computing and storing a timestamp for each connection. The LCQ and other techniques presented in J. L. Rexford, A. G. Greenberg and F. G. Bonomi, “Hardware-Efficient Fair Queuing Architectures for High-Speed Networks”, PROCEEDINGS OF INFOCOM '96, March 1996, pp. 638-646, which is incorporated herein by reference, do not require per-connection timestamps when used to implement SCFQ, but this scheduling algorithm does not achieve near-optimal delay bounds.
However, because of the need for per-connection timestamps, the overhead in memory resources due to the scheduler remains significant. This is especially true in ATM, in which the size of the packet is relatively small, and the number of connections supported by the system is on the order of several thousands of connections. The number of connections is often of the same order of magnitude as the number of packets in a buffer, and that number is constantly increasing as the industry evolves to implement improved data packet networks.
It is recognized herein that it is possible to introduce approximations in the scheduling algorithms in order to simplify their implementation. In general, these approximations may negatively affect the delay and/or fairness properties of the scheduler, and the challenge is to optimize the design so that the degradations induced by the approximations are minimized.
It is an object of the present invention to provide a technique to further reduce the implementation cost of GPS-related schedulers with near-optimal delay bounds, using a No-Per-Connection Timestamp Discrete-Rate Scheduler. This scheduler does not require the computation and storage of a timestamp per connection, and only maintains a single timestamp per rate. The scheduler has a very simple two-level hierarchical structure, in which, at the lower level of the hierarchy, FIFO queues of connections are used with one FIFO queue per rate and with connections being queued without a timestamp. In the scheduler, a timestamp is only assigned to each rate FIFO queue, taking into account the aggregate bandwidth of all connections with that rate.
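The two-level structure described above can be pictured with the following minimal sketch, offered only as an illustration. Connections are queued in their rate FIFO queue without any timestamp, and each queue carries a single timestamp. The update rule used here (advancing the queue timestamp by L/(n·r), where n is the number of connections currently in the queue and r is the common rate) is an assumption made for the sketch and is not the rule stated in this document; the sketch also omits the eligibility test and the resynchronization of an idle queue's timestamp with the system potential. All names are hypothetical.

```python
from collections import deque


class RateQueue:
    """One FIFO of connection ids sharing a rate, with a single timestamp."""

    def __init__(self, rate):
        self.rate = rate
        self.conns = deque()    # connection ids, stored without timestamps
        self.timestamp = 0.0    # the only timestamp kept for this rate


class NoPerConnectionTimestampScheduler:
    def __init__(self, rates):
        self.queues = [RateQueue(r) for r in rates]
        self.backlog = {}       # connection id -> number of queued packets

    def enqueue(self, conn, queue_index):
        self.backlog[conn] = self.backlog.get(conn, 0) + 1
        if self.backlog[conn] == 1:
            # A newly backlogged connection joins the tail of its rate queue.
            self.queues[queue_index].conns.append(conn)

    def dequeue(self, length=1.0):
        busy = [q for q in self.queues if q.conns]
        if not busy:
            return None
        # Higher level: choose among rate queues by their single timestamp.
        queue = min(busy, key=lambda q: q.timestamp)
        # ASSUMED rule: aggregate bandwidth = rate * backlogged connections.
        aggregate = queue.rate * len(queue.conns)
        conn = queue.conns.popleft()
        queue.timestamp += length / aggregate
        self.backlog[conn] -= 1
        if self.backlog[conn] > 0:
            # Still backlogged: back to the tail, again without a timestamp.
            queue.conns.append(conn)
        return conn
```

Under these assumptions the memory devoted to timestamps grows with the number of supported rates rather than with the number of connections.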
At the higher level of the hierarchy, the scheduler uses a worst-case-fair scheduler which schedules among the different FIFO queues. Any packet-by-packet rate-proportional server (P-RPS) as well as MD-SCFQ with a shaping mechanism can be used as a second-level scheduler. Such schedulers have been shown to constitute worst-case-fair schedulers. The shaping mechanism is obtained by adopting the SEFF packet-selection policy, which considers for service only packets whose transmission has already started in the underlying fluid system. If the worst-case-fair scheduler used is work-conserving, the resulting discrete-rate scheduler is also work-conserving.
The no-per-connection-timestamp scheduler has the same near-optimal delay properties as existing discrete-rate schedulers. The only disadvantage in the elimination of the timestamps per connection is some degradation in the fairness properties.