Resources are often shared by a number of users. For example, telecommunication service providers have a certain limited number of resources that are critical to serve their customers' requests. A fair and efficient system for allocating the resources is desirable to prevent heavy users from degrading the performance of the system while allowing unused resources to be fully utilized.
The issue of guaranteeing a fair share of resources in a shared resource setting pervades our current business environment. It is widely recognized that sharing can in principle lead to economies of scale, and that non-simultaneity of demand (e.g., business vs. residential traffic) can be exploited. However, there is no integrated framework that guarantees users real-time access to their contracted "slices" of a resource even in the face of unanticipated large demands from other customers. Protection can be achieved by complete partitioning of the resources. Such partitioning is effective in that users are protected from each other, but it is not efficient in that resources assigned to a particular user and not used by that user are wasted.
In a typical resource sharing situation, as exemplified by communications and computer networks, there is a need to address three major problems: controlling access to a system or subsystem, scheduling a particular processor, and allocating discrete resources.
To help visualize the problems and solutions, we will refer to an AT&T, Inc. 5ESS® switch. FIG. 1 depicts a 5ESS® subject to the demand of multi-class traffic, represented by streams a to z. Typically, a stream bundles the requests of a particular service, which can include, for example, call setups. For instance, a stream associated with wireless service may include handover requests, location updates, etc. A given activity places demands on a number of resources such as links, processors, databases, trunks, etc.
The 5ESS® is engineered and provisioned so that it can provide a certain grade of service subject to a certain sustained traffic mix. Basic access control mechanisms could be implemented to guarantee that no class exceeds its contractual value. Such an approach would be effective, but hardly efficient. After all, a class that exceeds its allocated rate should not be denied access when the system is operating below capacity because other classes are below their contractual rates. Traffic variability makes this a common occurrence: it is not unusual for a class to exceed its contractual rate, and this does not necessarily happen when all classes are demanding their full share.
The second problem, i.e., the sharing of a processor, is local in nature. It may be necessary to control the sharing of a processor in addition to the global access control, for two reasons. First, a particular processor may play a critical role as a global resource. Second, different classes tax various resources differently. By its very nature, a global access mechanism has to be based on estimates of the demand of each class on every resource. Moreover, a single class may itself contain a number of activities. Therefore, the estimate assumes a certain mix of activities, but there is no guarantee that, once a class has obtained access, it will have such a mix.
Finally, we face the problem of sharing discrete resources. An example is the contention for trunks in the 5ESS®.
U.S. Pat. No. 5,274,644 (the "'644 patent"; incorporated herein by reference) discloses several examples of methods for controlling access to a common resource, including a scheme that utilizes "tokens" to regulate access to the common resource. The '644 patent system includes a "throttle" associated with each user that regulates that user's access to the resource based on that user's access to tokens, as described below.
FIG. 2 depicts a rate control throttle for a single stream. The system can provide a desirable grade of service as long as key parameters of the traffic (arrival rate, peakedness) are kept within certain bounds. The rate control throttle guarantees that, in the long run, the admission rate in the system does not exceed a predetermined value, and shapes the traffic by limiting its peakedness. This predetermined value is used to establish a contractual admission rate "r_i" for each user (i = 1 to N, N being the total number of users), and a peakedness related factor "L_i". The mechanism consists of a bank B_i for each user, of finite capacity L_i, and a source (not shown) that generates tokens at a predetermined rate r_i. The admission mechanism works as follows:
i) Tokens are generated at rate r_i.
ii) A freshly generated token is deposited in bank B_i if the number of tokens already in the bank is less than L_i. Otherwise, the token is destroyed.
iii) An arrival from user i that finds a token in bank B_i is admitted into service (and the token destroyed). Otherwise, the arrival is either rejected or queued.
As shown above, the rate control throttle is an effective mechanism that can be generalized to the case of N classes of users by providing a separate bank for each class. This system is effective, but is not completely efficient. When several classes contend for a given resource, it is likely that some of them will be operating below their contractual levels. The rate at which tokens are destroyed for finding a full bank is an indication of resource underutilization (provided, of course, that the provisioning of resources and the assigning of token rates are sound).
One proposal to solve the problem of underutilization, described in the '644 patent, is to add an extra bank "B_0", called the spare or overflow bank. Like any regular bank, the overflow bank has a limited capacity. The mechanism operates as follows (see FIG. 3):
i) Tokens corresponding to class i are generated at the contractual rate r_i.
ii) A freshly generated class i token is deposited in bank B_i if the number of tokens already in B_i is less than L_i. Otherwise, the token is stored in the overflow bank B_0. If the overflow bank happens to be full, the token is destroyed.
iii) A class i arrival that finds a token in bank B_i is admitted into service (and the token destroyed). If no token is available in bank B_i, but the overflow bank is not empty, the class i arrival is also admitted into service (and a token is destroyed in the overflow bank). Otherwise, the class i arrival is either rejected or queued.
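The token rules of the throttle with an overflow bank (deposit a fresh class i token in B_i, else in B_0, else destroy it; admit an arrival on a token from B_i, else from B_0) can be illustrated with a minimal Python sketch. The class name `OverflowThrottle`, its method names, and the `destroyed` counter are hypothetical and do not come from the '644 patent. Token generation at rate r_i is assumed to be driven by the caller (for example, a timer), and arrivals that are not admitted are simply rejected rather than queued.

```python
class OverflowThrottle:
    """Illustrative multi-class token throttle with a shared overflow bank."""

    def __init__(self, capacities, overflow_capacity):
        # capacities maps class id i -> L_i, the capacity of bank B_i.
        self.capacities = dict(capacities)
        self.banks = {i: 0 for i in self.capacities}   # tokens held in each B_i
        self.overflow_capacity = overflow_capacity     # capacity of B_0
        self.overflow = 0                              # tokens held in B_0
        self.destroyed = 0                             # tokens lost to full banks

    def generate_token(self, i):
        """Rule ii: deposit in B_i, else in B_0, else destroy the token."""
        if self.banks[i] < self.capacities[i]:
            self.banks[i] += 1
        elif self.overflow < self.overflow_capacity:
            self.overflow += 1
        else:
            self.destroyed += 1

    def admit(self, i):
        """Rule iii: consume a token from B_i, else from B_0, else reject."""
        if self.banks[i] > 0:
            self.banks[i] -= 1
            return True
        if self.overflow > 0:
            self.overflow -= 1
            return True
        return False


# Class 1 is idle while its tokens accumulate; class 2 then sends a burst.
throttle = OverflowThrottle(capacities={1: 2, 2: 2}, overflow_capacity=3)
for _ in range(4):
    throttle.generate_token(1)      # 2 tokens fill B_1, 2 spill into B_0
throttle.generate_token(2)          # 1 token in B_2
results = [throttle.admit(2) for _ in range(4)]
print(results)                      # [True, True, True, False]
```

Setting `overflow_capacity=0` reduces the sketch to the plain per-class throttle, in which every token that finds its bank full is destroyed; the `destroyed` count then plays the role of the underutilization indicator mentioned in the text.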
An article authored by Berger and Whitt, entitled "A Multi-Class Input Regulation Throttle" (Proceedings of the 29th IEEE Conference on Decision and Control, 1990, pp. 2106-2111), examines the case in which arrivals that are not admitted are lost from the system. This article discusses exact blocking and throughput for the case of two Poisson streams, and approximate models for N > 2 classes. (This article is incorporated herein by reference.)
If arrivals that are not immediately admitted are queued and the number of classes exceeds two, access rules to the overflow bank by queued jobs need to be specified.
Yet another efficient multiclass admission control algorithm, homomorphic to the rate control throttle, is the leaky bucket mechanism, also described in the '644 patent. Instead of focusing on the availability of a certain amount of a material (e.g., a token) that flows in at a fixed rate, the leaky bucket mechanism considers the availability of space in a bucket that drains at a fixed rate.
In addition to N real buffers (queues Q_1, Q_2, ..., Q_N), the leaky bucket access mechanism consists of N+1 virtual buffers (leaky buckets LB_0, LB_1, ..., LB_N). Each class is assigned its own queue and its own leaky bucket. While the size of the queues is infinite, the leaky buckets have finite capacity. We denote as L_i the capacity of LB_i, i = 0, ..., N. The additional leaky bucket (LB_0) is common to all classes. To be admitted into the system, a class i arrival has to deposit b_i quanta of fluid in its own LB_i, or in the common LB_0 if the former cannot contain the full quanta. Should the sum of the spare capacity in LB_i and LB_0 be less than b_i, the arrival waits in Q_i. Fluid originated by class i traffic is continuously deposited in bucket LB_i, and possibly in LB_0, as space is made by the leaks, as long as Q_i is not empty. Access within a class is granted according to the FIFO (i.e., first in, first out) discipline as soon as the required quanta for the head-of-the-line arrival has been deposited. LB_i, i ≠ 0, leaks at the contractual rate r_i. When a bucket, say LB_j, j ≠ 0, empties, its discharge capability at rate r_j is immediately transferred to LB_0. As soon as fluid is deposited in LB_j, the leak is returned to it. That is, LB_0 does not have a leak of its own.
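The leaky bucket rules above can be approximated with a short discrete-time Python sketch. It is a simplification under stated assumptions, not the mechanism of the '644 patent: time advances in caller-chosen steps, a queued arrival deposits its whole quantum b_i at once at the end of a step instead of continuously, and the names `LeakyBucketAccess`, `arrive`, and `advance` are hypothetical. Within a step, each class bucket drains at r_i until empty, and any leak capacity it cannot use is applied to the common bucket LB_0.

```python
from collections import deque


class LeakyBucketAccess:
    """Illustrative discrete-time model of N class buckets plus a shared LB_0."""

    def __init__(self, rates, capacities, quanta, shared_capacity):
        self.rates = dict(rates)              # r_i: leak rate of LB_i
        self.capacities = dict(capacities)    # L_i: capacity of LB_i
        self.quanta = dict(quanta)            # b_i: fluid deposited per arrival
        self.shared_capacity = shared_capacity            # L_0
        self.level = {i: 0.0 for i in self.rates}         # fluid in each LB_i
        self.shared_level = 0.0                           # fluid in LB_0
        self.queues = {i: deque() for i in self.rates}    # Q_i (unbounded)

    def _try_deposit(self, i):
        """Deposit b_i in LB_i first, the remainder in LB_0, if it all fits."""
        need = self.quanta[i]
        own_space = self.capacities[i] - self.level[i]
        shared_space = self.shared_capacity - self.shared_level
        if own_space + shared_space < need:
            return False
        into_own = min(need, own_space)
        self.level[i] += into_own
        self.shared_level += need - into_own
        return True

    def arrive(self, i, job):
        """Admit the job at once if possible; otherwise it waits in Q_i."""
        # FIFO within a class: a new arrival never overtakes queued jobs.
        if not self.queues[i] and self._try_deposit(i):
            return True
        self.queues[i].append(job)
        return False

    def advance(self, dt):
        """Leak for dt time units, then admit queued jobs that now fit."""
        spare_leak = 0.0
        for i, rate in self.rates.items():
            drained = min(self.level[i], rate * dt)
            self.level[i] -= drained
            spare_leak += rate * dt - drained   # leak lent to LB_0 once LB_i is empty
        self.shared_level = max(0.0, self.shared_level - spare_leak)
        admitted = []
        for i, queue in self.queues.items():
            while queue and self._try_deposit(i):
                admitted.append(queue.popleft())
        return admitted


lb = LeakyBucketAccess(rates={1: 1.0, 2: 1.0}, capacities={1: 2.0, 2: 2.0},
                       quanta={1: 1.0, 2: 1.0}, shared_capacity=2.0)
burst = [lb.arrive(1, f"job{k}") for k in range(5)]
print(burst)             # [True, True, True, True, False]
print(lb.advance(1.0))   # ['job4']
```

In the example, class 2 is idle, so its bucket is empty and its leak drains LB_0 during the step, while the leak of LB_1 frees enough space for the queued class 1 job.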