1. Field of the Invention
The subject invention relates to Asynchronous Transfer Mode (ATM) networks and, more specifically, to a scheduling discipline in ATM networks which maintains the specified quality of service (QoS) while efficiently handling temporary congestion.
2. Description of the Related Art
The demand on communication networks continues to steadily increase, especially in view of the rapid advances in computing and semiconductor technology. Consequently, it is important to provide adequate traffic control on the networks so as to provide adequate service. The adequacy of service can be evaluated with reference to various parameters, such as the number of packets which get transmitted (i.e., the bandwidth), the speed at which the packets get transmitted, and the number of packets which get discarded (corresponding to the available buffer space).
Generally, the traffic control is implemented at two points. First, an algorithm is provided at each source for controlling the rate at which the source transmits its packets. These algorithms are designed to ensure that a free buffer is available at the destination host. Second, traffic is controlled at the gateways directly using virtual channels and indirectly using queuing. More specifically, routing algorithms directly control traffic by re-routing packets away from congested areas, while queuing algorithms indirectly control traffic by determining the order of servicing the buffers.
Since traffic can be controlled at the sources and at the switches, there may be a tendency to cause one point to rely heavily on the operation of the other. For example, a known prior art gateway traffic control uses a First-Come-First-Serve (FCFS) algorithm which causes the gateway traffic to be practically controlled by the sources. In such prior art systems, all users use the same buffer, so that QoS is the same for all users at the outset. However, the FCFS control is prone to violations by ill-behaved sources who may improve their performance at the expense of other users. For example, using sufficiently high speed transmission, a source can capture much of the available bandwidth, thereby reducing the bandwidth available to other sources. Thus, a philosophy has been developed in the prior art that a fair traffic control should not allow sources to use more than their fair share of the network resources. For example, one solution is to use leaky buckets at the periphery of the network.
A prior art queuing-based gateway traffic control has been developed in order to avoid abuse by ill-behaved sources. (See, e.g., J. Nagle, On Packet Switches with Infinite Storage, RFC 896, 1985 and J. Nagle, On Packet Switches with Infinite Storage, IEEE Transactions on Communications, Volume 35, pp 435-438, 1987.) According to such prior art systems, the individual sources are given separate queues and the queues are serviced in a round-robin manner. Thus, each source is served in its turn. Therefore, sources with a high transmission rate only lengthen their own queues and do not degrade the service to other sources. However, while such a system may be adequate when all the packets are of the same length, it would favor sources having long packets over those having short packets. Additionally, irrespective of packet length, such a system is oblivious to the promptness needs of the various sources. Consequently, an unacceptably high number of packets may be discarded when the promptness requirement is not satisfied by the network.
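By way of illustration only, the bias of packet-by-packet round-robin service toward long packets can be seen in the following minimal Python sketch. The function name, the source labels, and the packet lengths are hypothetical and are not taken from the cited references.

```python
from collections import deque


def round_robin_bytes(queues, rounds):
    """Serve one packet per non-empty queue per round.

    queues maps a source name to the list of its packet lengths in bytes.
    Returns the number of bytes transmitted for each source.
    """
    pending = {name: deque(lengths) for name, lengths in queues.items()}
    sent = {name: 0 for name in queues}
    for _ in range(rounds):
        for name, q in pending.items():
            if q:
                sent[name] += q.popleft()
    return sent


# Hypothetical workload: both sources stay backlogged for all five rounds.
queues = {"A": [1500] * 10, "B": [100] * 10}
share = round_robin_bytes(queues, rounds=5)
print(share)
```

Each source is served the same number of packets, yet source A, with 1500-byte packets, receives fifteen times the bandwidth of source B.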
It is therefore seen that a major design challenge of Asynchronous Transfer Mode (ATM) networks is to be able to efficiently provide the quality of service (QoS) specified by the customers, while avoiding bottlenecks and maintaining fairness (of course, the definition of "fair" may differ from implementation to implementation). To achieve this goal, a well designed scheduling discipline and connection admission control (CAC) algorithm must be implemented in the network switches. Such a scheduling discipline should preferably account for transmission length and immediacy.
For example, certain applications require that the packet be serviced within a given time, or it will be useless to the receiver. Such applications include transmission of voice or video packets over packet-switched networks. Notably, for proper voice communication, the packet transmission delay should be no longer than about 300 ms. Accordingly, for an efficient and responsive system, the traffic control should account for the immediacy of the transmission.
To account for immediacy of transmission, an available class of prior art scheduling disciplines handles queues with customers that have deadlines. It is shown by S. Panwar, D. Towsley, and J. Wolf, Optimal Scheduling Policies for a Class of Queues with Customer Deadlines to the Beginning of Services, Journal of ACM, Vol. 35, No. 4, pp. 832-844, 1988, that the shortest time to extinction (STE) policy is optimal for scheduling customers with deadlines. According to the STE policy, the customer closest to its deadline is given priority. Similar scheduling is known as Earliest Due Date (EDD), although the STE policy differs in that it never schedules tasks that are past their due date. However, this class of scheduling algorithms has considered neither the users' QoS nor the CAC algorithms. Moreover, the STE and EDD algorithms are very complicated to implement and require much processing time. Consequently, these scheduling methods are not suitable for fast ATM networks, especially if the network is to guarantee the requested QoS.
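The difference between the STE and EDD policies described above can be sketched as follows. This is a simplified illustration, assuming that all customers are present at time zero, that service times are equal, and that a deadline refers to the latest time at which service may begin. The function and variable names are hypothetical.

```python
import heapq


def schedule(customers, service_time, discard_expired):
    """Serve customers in order of earliest deadline.

    customers is a list of (name, deadline) pairs, all present at time 0.
    With discard_expired=True the policy behaves like STE (expired customers
    are dropped without service); with False it behaves like EDD (expired
    customers are still served and consume service time).
    Returns (names served, names whose service began by their deadline).
    """
    heap = [(deadline, name) for name, deadline in customers]
    heapq.heapify(heap)
    now = 0
    served, on_time = [], []
    while heap:
        deadline, name = heapq.heappop(heap)
        expired = deadline < now
        if expired and discard_expired:
            continue
        served.append(name)
        if not expired:
            on_time.append(name)
        now += service_time
    return served, on_time


customers = [("a", 0), ("b", 1), ("c", 1), ("d", 4)]
ste_served, ste_on_time = schedule(customers, service_time=2, discard_expired=True)
edd_served, edd_on_time = schedule(customers, service_time=2, discard_expired=False)
print(ste_on_time, edd_on_time)
```

In this example the STE variant skips the two customers whose deadlines have already passed and still serves customer "d" on time, whereas the EDD variant spends service time on the expired customers and causes "d" to miss its deadline as well.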
Recently, rate based scheduling disciplines, such as generalized processor sharing and weighted fair queuing, have received a lot of attention. See, for example, A. Demers, S. Keshav, and S. Shenker, Analysis and simulation of a fair queuing algorithm, J. Internetworking: Res. Exper., Vol. 1, pp. 3-26, 1990; A. Parekh and R. Gallager, A generalized processor sharing approach to flow control in integrated services networks: The single-node case, IEEE Trans. on Networking, Vol. 1, No. 3, pp. 344-357, 1993; O. Yaron and M. Sidi, Generalized processor sharing networks with exponentially bounded burstiness arrivals, in IEEE Infocom '94, pp. 628-634, 1994; Z.-L. Zhang, D. Towsley, and J. Kurose, Statistical analysis of the generalized processor sharing scheduling discipline, IEEE Journal on Selected Areas in Communications, Vol. 13, No. 6, pp. 1071-1080, 1995; S. Golestani, A self-clocked fair queuing scheme for broadband applications, in IEEE Infocom '94, pp. 5c.1.1-5c.1.11, 1994; L. Zhang, A new traffic control algorithm for packet switched networks, ACM Transactions on Computer Systems, Vol. 9, No. 2, pp. 101-124, May 1991.
In the rate based schemes, each traffic stream has its own buffer and is assigned a nominal service rate. The assignment of the nominal rates is static. The actual service rate that buffer i receives is greater than or equal to its nominal rate, depending on the occupancy of the other buffers in this queue. If all the other buffers are backlogged, the actual service rate of buffer i equals its nominal rate, so as to ensure the specified performance guarantee. Otherwise, the actual service rate can be higher than the nominal rate.
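The relationship between nominal and actual service rates can be illustrated by the following minimal sketch. It assumes, in the manner of generalized processor sharing, that the link capacity is divided among the backlogged buffers in proportion to their nominal rates and that the nominal rates sum to no more than the capacity. The function name and the numerical values are hypothetical.

```python
def actual_rates(capacity, nominal, backlogged):
    """Split capacity among backlogged buffers in proportion to nominal rates.

    nominal maps a buffer name to its (static) nominal service rate.
    backlogged is the set of buffer names that currently hold traffic.
    Buffers that are not backlogged receive no service.
    """
    total = sum(nominal[name] for name in backlogged)
    return {
        name: (capacity * nominal[name] / total if name in backlogged else 0.0)
        for name in nominal
    }


nominal = {"a": 60, "b": 20, "c": 20}
all_busy = actual_rates(100, nominal, {"a", "b", "c"})
c_idle = actual_rates(100, nominal, {"a", "b"})
print(all_busy, c_idle)
```

When every buffer is backlogged, each receives exactly its nominal rate. When buffer "c" is empty, its share is redistributed and buffers "a" and "b" are served above their nominal rates.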
However, with the rate based scheduling disciplines, it may not be easy to determine the nominal service rates that can provide the specified quality of service. If the sources are regulated by leaky buckets or have exponentially bounded burstiness, a bound for end-to-end network delay can be derived. (For a discussion of sources regulated by leaky buckets see Zhang et al. cited above and A. Parekh and R. Gallager, A generalized processor sharing approach to flow control in integrated services networks: The multiple node case, IEEE Trans. on Networking, Vol. 2, No. 2, pp. 137-150, 1994; for exponentially bounded burstiness see Yaron et al. cited above.)
With this bound, it is possible to determine the appropriate service rates to satisfy the end-to-end delay requirement of the sources. A CAC algorithm based on this approach can be designed to provide the required end-to-end delay. However, this approach may result in low utilization of the network, leading to inadequate service during temporary congestion.
In such prior art systems, it is possible to carry out a queuing analysis to determine the suitable nominal service rates to be set in the system. However, since the assignment of the nominal service rates is static in the rate based schemes, it is possible that during a short interval of time, a particular buffer has a much higher arrival rate than its nominal service rate. During this temporary congestion, the performance of the overloaded buffer can be very poor. On the other hand, it is difficult to dynamically adjust the nominal service rates because the rates are determined through a complicated queuing analysis. Consequently, the temporarily congested buffer may receive inadequate service, while other, possibly lightly loaded, buffers may receive unnecessary service. This scenario is quite likely and, in fact, it has been demonstrated that well-behaved traffic streams at the edge of the network can become very bursty inside the network, leading to temporary congestion. See, e.g., R. Cruz, A calculus for network delay, part I: Network element in isolation, IEEE Trans. on Info. Theory, Vol. 37, No. 1, pp. 114-131, 1991; R. Cruz, A calculus for network delay, part II: Network analysis, IEEE Trans. on Info. Theory, Vol. 37, No. 1, pp. 131-141, 1991; S. Golestani, Congestion-free communication in high-speed packet networks, IEEE Transactions on Communications, Vol. 39, No. 12, pp. 1802-1812, 1991.
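The effect of a static nominal rate during temporary congestion can be sketched as follows. The sketch assumes the worst case in which the overloaded buffer is served at exactly its nominal rate because the other buffers are also backlogged. The arrival pattern and the rate are hypothetical values chosen for illustration.

```python
def backlog_trace(arrivals, nominal_rate):
    """Return the backlog after each time slot for a buffer served at a fixed rate.

    arrivals lists the number of cells arriving in each slot;
    nominal_rate is the number of cells that can be served per slot.
    """
    backlog = 0
    trace = []
    for cells in arrivals:
        backlog = max(0, backlog + cells - nominal_rate)
        trace.append(backlog)
    return trace


# A four-slot burst above the nominal rate, followed by light load.
arrivals = [15, 15, 15, 15, 5, 5, 5, 5]
trace = backlog_trace(arrivals, nominal_rate=10)
print(trace)
```

Although the average arrival rate over the eight slots equals the nominal rate, the backlog peaks at 20 cells and takes as long to drain as the burst took to build, since the fixed rate cannot borrow capacity during the overload.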
Several longer queue first disciplines have also been previously investigated. The continuous-time two-queue longer queue first priority model is analyzed by J. Cohen, A two-queue, one-server model with priority for the longer queue, Queuing Systems, Vol. 2, pp. 261-283, 1987. The bounds on the buffer size for the longest queue first discipline, assuming fluid flow arrival streams, have been investigated by H. Gail, G. Grover, R. Guerin, S. Hantler, Z. Rosberg, and M. Sidi, Buffer size requirements under longest queue first, in IFIP '92, 1992. It is shown by Gail et al. that the longest queue first discipline requires less buffer space to prevent cell losses than the FIFO and round-robin disciplines.
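As a toy illustration of why serving the longest queue first can reduce the buffer space needed per queue, the following sketch compares the peak per-queue occupancy under longest queue first and round-robin service in a slotted system with one cell served per slot. It is a discrete-time simplification and not the fluid flow model analyzed by Gail et al.; the function name, the policy labels, and the arrival pattern are hypothetical.

```python
def peak_occupancy(arrivals, policy):
    """Return the largest single-queue occupancy seen over the run.

    arrivals is a list of per-slot tuples giving the cells arriving at each queue.
    policy is "lqf" (serve the longest queue) or "rr" (round-robin over
    non-empty queues). One cell is served per slot, after arrivals.
    """
    n = len(arrivals[0])
    queues = [0] * n
    peak = 0
    rr_next = 0
    for slot in arrivals:
        for i, cells in enumerate(slot):
            queues[i] += cells
        peak = max(peak, max(queues))
        if policy == "lqf":
            target = max(range(n), key=lambda i: queues[i])
        else:
            order = [(rr_next + k) % n for k in range(n)]
            target = next((i for i in order if queues[i] > 0), None)
        if target is not None and queues[target] > 0:
            queues[target] -= 1
            rr_next = (target + 1) % n
    return peak


# A short burst in which queue 0 receives twice the traffic of queue 1.
burst = [(2, 1), (2, 1), (2, 1)]
lqf_peak = peak_occupancy(burst, "lqf")
rr_peak = peak_occupancy(burst, "rr")
print(lqf_peak, rr_peak)
```

For this burst the peak occupancy of any single queue is 4 cells under longest queue first and 5 cells under round-robin, because round-robin spends a service slot on the shorter queue while the longer one continues to grow.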
A problem with the prior art rate based schemes is that they consider the input streams in relative isolation. Consequently, a temporarily overloaded traffic stream may not be able to obtain enough bandwidth to remove the backlog quickly, while other streams are hardly affected during the temporary overload. Such imbalance was accepted in the prior art and, in fact, was sometimes even promoted. For example, Parekh et al. and Zhang et al. argued that the rate based schemes provide fairness to the traffic streams, in that the misbehavior of one class cannot degrade the service to other classes. However, such a philosophy can lead to unnecessary degradation in service in situations where a certain class is temporarily congested while other classes do not require a high level of service.
Even if one assumes that all traffic streams are leaky-bucket policed and shaped when they enter the network, it has been suggested (see Cruz and Golestani articles cited above) that they can still be very bursty inside the network. This added burstiness can cause short term overload to some of the buffers to the extent that the arrival rates to the overloaded buffers are higher than their nominal service rates. The prior art rate based schemes cannot respond quickly enough to this temporary overload because they employ isolation among the traffic classes. However, this added burstiness is caused by the multiplexing and demultiplexing operations of the network and is beyond the control of the users. From this point of view, the present inventors believe that it is unfair to penalize well-behaved users that temporarily exhibit burstiness because of the operation of the network.
Consequently, in addition to being very complicated and computationally intensive, weighted fair queuing algorithms cannot dynamically adapt to varying load conditions. That is, the weights in weighted fair queuing are determined based upon the source characteristics, as indicated in the received call. However, the characteristics may change as the source's transmission interacts with other users. This change cannot be accounted for in the prior art weighted fair queuing.
Accordingly, the present invention has been developed to solve the above problems exhibited by the prior art systems.