The present invention relates to a method and apparatus for rate-based scheduling and weighted fair sharing of a common resource. The problem of rate-based scheduling and weighted fair sharing arises in many different contexts and relates, for example, to the field of computer networks or to processor design. In general, the present invention relates to any problem of scheduling jobs according to some rates in a broad range of environments and applications.
The problem of scheduling different jobs sharing a common resource occurs in many different contexts. In the most general terms it can be formulated as follows:
A single resource of some kind is shared by several entities indexed by integers i=1,2, . . . n. Every entity has a rate R(i) associated with it. The rates are assigned in such a way that the sum of all R(i) does not exceed the capacity of the resource. For example, in computer networks the entity is an individual flow, and the shared resource may be a bottleneck communications link or a server capacity. The entities can be served in some service increments, one at a time. For example, the service increment for a network flow is one packet (or cell, in the ATM terminology). A device, called the Scheduler, needs to determine the order of service for different entities so that the average service rate received by an entity is its assigned rate R(i). Aside from guaranteeing the long-term average rate, an important goal is to bound the discrepancy between the ideal and the actual service times of each individual service increment, i.e., each packet of each flow.
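The formulation above can be illustrated with a minimal sketch. It assumes, for illustration only, that the ideal service time of the k-th increment of an entity with rate R(i) is k/R(i), measured from time zero; the function names `is_feasible`, `ideal_service_times`, and `max_discrepancy` are hypothetical and do not appear in this application.

```python
def is_feasible(rates, capacity):
    """True if the sum of the assigned rates does not exceed the capacity."""
    return sum(rates) <= capacity


def ideal_service_times(rate, count):
    """Assumed ideal time of the k-th service increment at a constant rate."""
    return [k / rate for k in range(1, count + 1)]


def max_discrepancy(actual_times, rate):
    """Largest gap between actual and assumed ideal service times."""
    ideal = ideal_service_times(rate, len(actual_times))
    return max(abs(a - t) for a, t in zip(actual_times, ideal))
```

A schedule that keeps `max_discrepancy` bounded for every entity, regardless of how many increments have been served, meets the second goal stated above in addition to the long-term average rate.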
An example of an environment where such a problem occurs is a processor which must schedule jobs competing for its cycles. If all jobs are of equal importance, then it is desirable to provide all jobs an equal share of the processor capacity. If, however, the jobs have different importance, a possible strategy is to assign weights to all jobs corresponding to their importance, and provide each job a share of processor capacity proportional to the weight assigned to that job. In this case the desired service rates are determined by the weights of the jobs. An alternative approach might be to assign rates to jobs according to some other rule, which is specific to a particular policy and environment of the problem. For example, a rule might be to give some fixed allocation to high priority jobs and then share the remaining capacity among low priority jobs.
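The two rate-assignment policies just described can be sketched as follows. This is an illustrative sketch only; the function names, the dictionary representation of jobs, and the choice to share the remaining capacity in proportion to weights are assumptions and are not taken from this application.

```python
def rates_from_weights(weights, capacity):
    """Give each job a share of capacity proportional to its weight."""
    total = sum(weights.values())
    return {job: capacity * w / total for job, w in weights.items()}


def rates_with_priority(fixed, low_weights, capacity):
    """Fixed allocations for high priority jobs; the rest is shared by weight."""
    reserved = sum(fixed.values())
    if reserved > capacity:
        raise ValueError("fixed allocations exceed capacity")
    rates = dict(fixed)
    # Low priority jobs split whatever capacity remains.
    rates.update(rates_from_weights(low_weights, capacity - reserved))
    return rates
```

For instance, with a capacity of 100, `rates_from_weights({"a": 1, "b": 3}, 100)` assigns 25 to job `a` and 75 to job `b`.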
As mentioned earlier, a similar problem occurs in computer networks. For example, in ATM networks there is usually some rate associated with every flow traversing the network. This rate can be either the result of negotiation with the network at setup time, as for Constant Bit Rate (CBR) traffic, or the result of a traffic management feedback control scheme, as is the case for Available Bit Rate (ABR) traffic. The set of rates can be either relatively static, as for long-term CBR flows, or may change quickly in response to congestion, as in the case of ABR flows.
Even if the rates are not assigned explicitly, which is the case, for example, in many packet-switching networks, different flows may be of different importance. For example, one flow may be a compound flow of data from 1000 users, while another flow may represent a single user. It may be reasonable in such a case to assign weights to different flows according to their relative importance. If the total demand of the flows exceeds the capacity of the bottleneck resource, typically a communication link, then a possible policy is for the congested switch to service all flows in proportion to their weights, just as described earlier in the example of processor sharing. This effectively assigns rates to the flows.
In recent years, rate-based scheduling disciplines at the switching points in computer networks have received a lot of attention. A comprehensive review of such schemes can be found in Hui Zhang, Service Disciplines for Guaranteed Performance in Packet-Switching Networks, Proceedings IEEE, October 1995. These schemes generally are applicable at network switches and can guarantee rates assigned to the flows.
The problem of scheduling different flows in computer networks exists not only for the switches in the network, but for host adapters as well. For example, an adapter in an ATM network must schedule different flows, each having a rate associated with it. Typically, the CBR flows are serviced at a higher priority according to a pre-computed schedule. One disadvantage of pre-computing the CBR schedule is that, because it is computed without taking any non-CBR flows into account, the service of non-CBR flows may be unnecessarily adversely affected by the CBR bursts. Pre-computing the schedule also has the disadvantage that it is computationally expensive and is usually done in software on a slow time scale. While this may be acceptable for CBR flows, for which the schedule needs to be computed only once, when a new connection is established, it is not feasible if many flows with frequently changing rates need to be scheduled.
Another known scheme for rate-based scheduling is the so-called Leaky Bucket, described for example in The ATM Forum Traffic Management Specification Version 4.0. The scheme requires a large amount of per-flow state and is therefore prohibitively expensive for a large number of flows.
Also frequently used is the so-called "time-wheel" or "calendar queue" approach. An example of the calendar queue approach may be found in Brown, R., Calendar Queue: A fast O(1) priority queue implementation for the simulation event set problem, Communications of the ACM, vol. 31, pp. 1220-1227. Unlike the Leaky Bucket scheme, the calendar queues are simple to implement. Unfortunately, in general the calendar queue approach cannot guarantee that the long-term average rate achieved by the flow is equal to its assigned rate.
Therefore, it is desirable to design a scheme that can be used for rate-based scheduling of flows with dynamically changing rates at network adapters and can guarantee the assigned rate of the flow.
It is also desirable that this scheme can be used for CBR-type traffic (also known as guaranteed service in packet switching networks) and ABR-type traffic (also known as adaptive traffic) simultaneously, as well as VBR traffic (variable bit rate) in ATM networks (also known as predictive traffic in packet switching networks). Finally it is desirable that this scheme can be used in a more general context of rate-based scheduling as described earlier.
The approaches described in the paper by Hui Zhang for switch scheduling are not easily applicable to adapters. One reason is that most of the scheduling schemes for switches rely on packet arrival times at the switch to determine the scheduling order of packets from different flows. The notion of arrival time is not always well-specified for the adapter, since typically the adapter requests data from the application when it is ready to transmit its data.
What is needed is a general approach to rate scheduling that will work in many different environments. In particular, the new approach should work well for network adapters as well as for network switches.
The new scheme described in this application, referred to as the Relative Error (RE) Scheme, can be used in particular for network adapters as well as for network switches to provide stringent rate guarantees. However, as mentioned earlier, its use is not limited to the field of computer networks. Unlike most of the schemes described above, which operate in the time domain, the RE scheme operates in the frequency domain. One advantage of this is that the need for rate-to-time conversion (which involves finding an inverse of a rate) is eliminated.
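To illustrate what scheduling without rate-to-time conversion can look like, the following is a minimal sketch of a generic credit-based scheduler that works only with rates normalized by the link capacity. It is not the RE scheme as specified in this application; the function name `rate_domain_schedule`, the credit accumulator, and the idle pseudo-flow that absorbs unallocated capacity are all assumptions made for illustration.

```python
from fractions import Fraction


def rate_domain_schedule(rates, capacity, num_slots):
    """Return the owner of each slot (flow index, or None for an idle slot)."""
    shares = [Fraction(r) / Fraction(capacity) for r in rates]
    if sum(shares) > 1:
        raise ValueError("rates exceed capacity")
    # Hypothetical idle pseudo-flow takes any capacity left unallocated.
    shares.append(1 - sum(shares))
    credit = [Fraction(0)] * len(shares)
    order = []
    for _ in range(num_slots):
        # Each flow earns its share of one slot; no 1/rate is ever computed.
        credit = [c + s for c, s in zip(credit, shares)]
        # Serve the flow that is furthest behind its assigned share.
        winner = max(range(len(credit)), key=lambda i: credit[i])
        credit[winner] -= 1
        order.append(winner if winner < len(rates) else None)
    return order
```

In this sketch each slot costs only additions and a comparison per flow, and a change in a flow's rate would take effect by updating its share, without recomputing any inter-cell interval.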
A method of scheduling a plurality of data flows in a shared resource in a computer system, each of the data flows containing a plurality of data cells, is provided including the steps of providing a scheduler in the shared resource, the scheduler having a plurality of link cell slots, initializing the scheduler to receive the plurality of data flows, receiving each of the plurality of data flows in the scheduler, each of the data flows containing a requested flow rate, scheduling, by the scheduler, each of the plurality of data flows such that a sum of the requested flow rates of the plurality of data flows is less than an available bandwidth in the shared resource and a relative error is minimized between an actual scheduling time and an ideal scheduling time on a per cell basis, and repeating the steps of receiving and scheduling.