1. Field of the Invention
The present invention relates to a cell scheduler and, more particularly, pertains to a cell scheduler which supports a variety of scheduling schemes or disciplines and its implementation in a distributed shared memory architecture.
2. Description of the Related Art
With the completion of the ATM (Asynchronous Transfer Mode) Forum Traffic Management 4.0 Specification, published February 1996 by The ATM Forum, Worldwide Headquarters, 2570 West El Camino Real, Suite 304, Mountain View, Calif. 94040-1313, several new traffic management enhancements have become possible in ATM networks. Previous generations of ATM switches were not designed to take advantage of these enhancements, and hence their incorporation requires a new generation of ATM switch architecture and scheduler designs. The purpose of this invention is to describe a new scheduler design that maintains continuity with the previous generation of traffic management techniques and at the same time makes it possible to exploit the advanced capabilities found in the ATM Forum Traffic Management 4.0 Specification.
Prior art cell scheduling methodologies fall into the following categories:
1. There is provision for four service classes, constant bit rate (CBR), variable bit rate (VBR), available bit rate (ABR) and unspecified bit rate (UBR), at each output link and First-in First-out (FIFO) queuing is used internally within each service class. Virtual channels (VC) belonging to the CBR service class are given the highest priority, followed by those of VBR, ABR and finally UBR.
2. As in 1), there is provision for four service classes at each output link with CBR given the highest priority and UBR the lowest, but instead of FIFO queuing, per-VC queuing is used. Furthermore, the VCs within each priority group are served using the Round-Robin (RR) service discipline.
3. As in 1), there are four service classes at each output link and FIFO queuing is used internally within each class. However, instead of priority scheduling, a Weighted Round Robin (WRR) based scheduler is used to serve these classes.
Using plain priority classes as in 1), it is not possible to provide bandwidth or delay guarantees to individual VCs. Moreover, fairness between VCs for ABR and UBR connections is also not possible. Per-VC queuing, as in 2), solves the fairness problem, but it is still not possible to guarantee bandwidth or delay. The provision for WRR in 3) solves the bandwidth allocation problem between service classes, but not within any one service class.
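The difference between the priority scheduling of 1) and the WRR scheduling of 3) can be illustrated with a minimal sketch. It assumes one FIFO per service class and one cell served per slot; the function names, the weights and the queue contents are hypothetical and are not taken from any particular switch.

```python
from collections import Counter, deque

# Service classes, listed from highest to lowest priority.
CLASSES = ("CBR", "VBR", "ABR", "UBR")


def make_queues(cells_per_class):
    """Build one FIFO per class, each preloaded with dummy cells."""
    return {name: deque((name, i) for i in range(cells_per_class))
            for name in CLASSES}


def strict_priority(queues, slots):
    """Serve one cell per slot from the highest-priority non-empty FIFO."""
    served = Counter()
    for _ in range(slots):
        for name in CLASSES:
            if queues[name]:
                queues[name].popleft()
                served[name] += 1
                break
    return served


def weighted_round_robin(queues, weights, slots):
    """Visit the classes in turn; serve up to weights[name] cells per visit."""
    served = Counter()
    sent = 0
    while sent < slots and any(queues.values()):
        for name in CLASSES:
            for _ in range(weights[name]):
                if sent == slots or not queues[name]:
                    break
                queues[name].popleft()
                served[name] += 1
                sent += 1
    return served


if __name__ == "__main__":
    weights = {"CBR": 4, "VBR": 3, "ABR": 2, "UBR": 1}  # illustrative only
    print(strict_priority(make_queues(100), 40))
    print(weighted_round_robin(make_queues(100), weights, 40))
```

In this sketch, with every class backlogged, strict priority gives all 40 slots to CBR, while the 4:3:2:1 weights split them 16, 12, 8 and 4 between the classes. Neither variant distinguishes between the VCs that share a class FIFO, which is the limitation noted above.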
The following section provides an overview of how the different ATM Traffic Classes are supported by the scheduler of the present invention. The ATM traffic classes include: CBR sources, Real-Time VBR sources, Non Real-Time VBR sources, ABR sources and UBR sources.
There is a need for a scheduler, an instance of which resides in each output port, that is able to simultaneously satisfy the Quality of Service (QoS) performance requirements of these traffic classes. The Weighted Fair Queuing (WFQ) algorithm, in combination with per-VC queuing, can be used to explicitly reserve link bandwidth (BW) for classes which require it, such as CBR, VBR and ABR with minimum cell rate (MCR) support. In addition, this combination leads to guaranteed upper bounds on scheduling delay, which is very important for providing real-time services over ATM networks.
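How weighted fair queuing over per-VC queues reserves bandwidth can be sketched as follows. The sketch uses a self-clocked approximation of WFQ (finish tags derived from the tag of the cell last served), which is a simplification chosen for brevity and is not the scheduler of the present invention; the class name, the VC names and the weights are hypothetical.

```python
import heapq
from fractions import Fraction


class SelfClockedFairQueue:
    """Per-VC fair queuing using finish tags (self-clocked WFQ approximation)."""

    def __init__(self):
        self.weights = {}            # vc -> reserved share (positive integer)
        self.last_finish = {}        # vc -> finish tag of its newest cell
        self.virtual_time = Fraction(0)
        self.heap = []               # (finish tag, arrival order, vc, cell)
        self.arrivals = 0

    def add_vc(self, vc, weight):
        self.weights[vc] = weight
        self.last_finish[vc] = Fraction(0)

    def enqueue(self, vc, cell):
        # A cell starts after the VC's previous cell, or at the current
        # virtual time if the VC was idle; a larger weight means a smaller
        # spacing between successive finish tags.
        start = max(self.virtual_time, self.last_finish[vc])
        finish = start + Fraction(1, self.weights[vc])
        self.last_finish[vc] = finish
        heapq.heappush(self.heap, (finish, self.arrivals, vc, cell))
        self.arrivals += 1

    def dequeue(self):
        """Serve the cell with the smallest finish tag, or None if idle."""
        if not self.heap:
            return None
        finish, _, vc, cell = heapq.heappop(self.heap)
        self.virtual_time = finish   # self-clocking step
        return vc, cell


if __name__ == "__main__":
    sched = SelfClockedFairQueue()
    sched.add_vc("vc_cbr", 3)        # illustrative weights
    sched.add_vc("vc_abr", 1)
    for i in range(30):
        sched.enqueue("vc_cbr", i)
    for i in range(30):
        sched.enqueue("vc_abr", i)
    first = [sched.dequeue()[0] for _ in range(20)]
    print(first.count("vc_cbr"), first.count("vc_abr"))
```

With both VCs backlogged, the first 20 cells served in this sketch are 15 from the weight-3 VC and 5 from the weight-1 VC, matching the 3:1 reservation. If only one VC has cells queued, it receives every slot, so unused bandwidth is not wasted.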
Furthermore, it is desirable to provide a scheduler which supports the following features:
1. Support for both per-VC queuing as well as plain FIFO queuing.
Per-VC queuing should clearly be supported so that it can be used in conjunction with WFQ to provide explicit BW and delay guarantees to CBR, real-time VBR and explicit rate (ER) based ABR sources. In addition, FIFO queuing should be supported for some traffic classes:
FIFO queuing aggregates several VCs together, hence it reduces the requirements on the control memory to store extra queue pointers and other control data structures.
Non-real time VBR sources do not require explicit delay guarantees. Hence several of these sources may share a single FIFO buffer, whose size may be chosen according to the cell loss requirements of these streams. This FIFO can be granted the aggregate BW of all its constituent VCs.
There may be customers who would prefer to do plain FIFO queuing for CBR sources.
It may be possible to support non-ER ABR sources and UBR sources by means of per-VC accounting, rather than full-blown per-VC queuing.
FIFO queuing eases migration from present switches that only have FIFO queuing.
2. Support for a very flexible mix of priorities and bandwidth partitioning based scheduling.
Present generation ATM switches rely solely on the priority mechanism to segregate different traffic classes from each other. In these schemes CBR is given the highest priority, followed by VBR, ABR and UBR. The main problem with this scheme is that it is no longer possible to give delay guarantees to lower priority classes (for example, rt-VBR).
The presence of the WFQ scheduler provides a more powerful mechanism to segregate traffic classes from each other, without the drawback mentioned above. This is due to the fact that WFQ builds firewalls between competing flows, and also allows redistribution of unused BW among active flows. One alternative for implementing the scheduler is to rely exclusively on WFQ to segregate traffic classes. Each traffic class will have an upper and lower bound on the BW that it can get. The BW given to CBR and VBR sources cannot be taken away while the connections are still active; however, the BW given to ABR and UBR sources may decrease during the course of a connection (if new CBR or VBR connections come up, for instance). These sources then adjust to the decreased BW by means of explicit feedback mechanisms.
The main problem with relying only on WFQ to segregate traffic classes is that the network explicitly needs to assign upper and lower bounds to the bandwidth that any single class can acquire. This may be a burden, especially for larger networks.
With regard to implementation of the scheduler design of the present invention in a distributed shared memory architecture, the following technical hurdles must be addressed in designing and selecting an appropriate switch fabric:
Memory speeds: As the size and the speed of the switch fabric increase, faster and faster memories are required. Since memory speeds are restricted by the current technology, this necessarily restricts the size of the switch fabric.
Interconnect speeds: For larger fabrics the speed of the interconnects between adjoining switching modules is crucial. The speed-up achievable by using wider buses and faster clocks is restricted by physical limitations, which also restricts the size of the fabric.
Switch control and traffic management: A switch fabric is useless unless it can provide support for sophisticated traffic management functions. This also restricts the types of fabrics that are possible, since an otherwise excellent fabric may not be able to satisfy this requirement.
Support for multicast: This function is extremely important, especially for network control and multimedia applications.
Existing switch fabric designs fall into the following classes:
Pure output buffered fabrics: These do not scale since the speed of the memory in each output port increases linearly with the number of ports.
Pure input buffered fabrics: These fabrics offer the possibility of scaling up without increasing the speed of the memory located in the input port. However, they require a complex arbitration mechanism to overcome head-of-line blocking. More importantly, there are no good solutions for scheduling and reserving BW in these fabrics. Multicasting is also a problem.
Shared memory/shared bus fabrics: These represent the most commonly found fabric in commercially available switches. These fabrics do not scale very easily beyond speeds of 20 Gbps due to limitations in the speeds of memory modules.
Distributed shared memory fabrics: These fabrics seek to scale up by interconnecting shared memory modules of size n*n in a square pattern. Thus they are able to scale up without increasing the speed of the memory; however, traffic management becomes a problem as the number of modules increases. Also, the number of switching elements required increases rapidly with the number of ports.
Multistage interconnection networks: These seek to build a larger fabric by combining modules of a fixed size, say 2*2, in an interconnection pattern. These networks are able to scale up without increasing the speed of the memory or the interconnects. However, as in the case of the input buffered fabric, traffic management and multicasting are difficult to support in these fabrics. The main reason for this is that these architectures lead to queuing inside the fabric itself.
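The memory speed limit of the pure output buffered fabric discussed in this section can be made concrete with a short calculation. The sketch assumes the usual worst case in which all N input ports deliver a cell to the same output port within one cell time, so that the output memory must absorb N writes and one read per cell time; this N + 1 factor, the 155.52 Mbps line rate and the function names are illustrative assumptions rather than figures from the present invention.

```python
ATM_CELL_BITS = 53 * 8  # one ATM cell is 53 bytes


def cell_time_s(line_rate_bps):
    """Time to transmit one cell on a link of the given rate."""
    return ATM_CELL_BITS / line_rate_bps


def output_memory_bandwidth_bps(ports, line_rate_bps):
    """Worst-case memory bandwidth per output port: N writes plus 1 read."""
    return (ports + 1) * line_rate_bps


def output_memory_access_time_s(ports, line_rate_bps):
    """Longest tolerable time per cell-wide memory access."""
    return cell_time_s(line_rate_bps) / (ports + 1)


if __name__ == "__main__":
    rate = 155.52e6  # assumed per-port line rate in bits per second
    for ports in (8, 16, 32, 64):
        access_ns = output_memory_access_time_s(ports, rate) * 1e9
        print(f"{ports:3d} ports: memory access time <= {access_ns:.1f} ns")
```

Each doubling of the port count roughly halves the time available per memory access, which is the linear growth in required memory speed that prevents this class of fabric from scaling.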
Accordingly, an object of the present invention is to provide a scheduler which supports a mixture of per-VC and FIFO queuing.
Another object is to provide a design for such a scheduler and its implementation in a distributed shared memory switch architecture.
Another object is to provide an improvement to the traditional distributed shared memory switch fabric for such a scheduler, the improvement making it possible to build much larger switch fabrics than the traditional distributed shared memory fabric allows, using lower speed memories and a smaller number of switch modules.
Another object is to provide a shared memory fabric architecture for high speed ATM switches which resolves the problems of memory and interconnect scalability, i.e., it can scale up to very high speeds using fixed memory and interconnect speeds, while at the same time providing excellent support for traffic management and multicasting.
Yet another object is to provide a set of integrated circuit chips for implementing the switching fabric and the input/output ports belonging to that fabric.