The following computer program listing, including Unified Modeling Language (UML) diagrams showing instructions, regulation of work flow, the detailed design of the scheduler, and relationships between objects in the computer program, is submitted on a compact disc and is incorporated herein by reference:
1. Field of the Invention
The field of the invention relates in general to a scheduler for allocating digital device bandwidth resources. More particularly, the field of the invention relates to a system and method for allocating and scheduling digital device bandwidth resources among users and groups of users by using an active feedback estimator which guarantees a system bandwidth requirement among competing users and aggregate user groups.
2. Background
Computer operating systems are growing in complexity. Most new computers and other computational devices perform many different tasks simultaneously. This sharing of resources affects machine performance and user satisfaction. Typically, the operating system (OS) is responsible for scheduling when each process runs. When a process has reached its allotted time slice, or blocks for some reason, the OS saves its state and runs the next runnable process in the process queue. This gives the appearance of many different programs running simultaneously on a single processor. This approach now extends to multiple processors, which handle even more processes simultaneously, which in turn means more programs demanding more system resources running at the same time.
The two main process scheduling classes are real-time and time-sharing. Current process schedulers have a basic overall goal: to make the “system” more productive. The achievement of that goal requires balancing the following different requirements:
allocate a fair share of CPU time to each process;
keep the CPU busy 100 percent of the time;
minimize the response time for time sensitive processes;
minimize total job completion time;
maximize total number of jobs per time unit.
To achieve this balance, schedulers make some assumptions about the types of processes that are running. The default system scheduler may frequently not be adequate for a given application. Many scheduling algorithms and implementations have been developed over the years in attempts to shore up such deficiencies and build better schedulers.
Some operating systems use a two-level thread implementation, in which threads within a process are first scheduled onto virtual processors, which are in turn scheduled onto the physical processors. Thus, the threads are bound to a particular virtual processor, and their scheduling characteristics result from the scheduling of that virtual processor.
The OS community has developed many optimal scheduling algorithms, but all algorithms are optimal only for certain workloads. There is currently no single solution that fits every system need. General-purpose operating systems face the problem of developing schedulers that are general enough to solve most needs, yet extensible enough to handle specific workloads.
There are many different algorithms for choosing the next process to run. Two of the most common scheduler algorithms are First-In-First-Out (FIFO) and Round Robin (RR).
FIFO runs each process to completion and then loads the next process in the queue. FIFO can have negative effects on a system when a process runs for an extended time, squeezes critical resources that otherwise keep the system stable past the point of no return, and then crashes the system or initiates a sequence of events that crashes it.
Round Robin schedulers allow each process at a given priority to run for a predetermined amount of time called a quantum. When the process has run for the allotted time quantum, or if a higher priority process becomes runnable, the scheduler halts the process and saves its state. The process is then placed at the end of the process queue and the next process is started. This may be the optimal solution if the time quantum is longer than the average runtime of a process and the only resource needed is CPU bandwidth. These conditions are very unlikely to hold in the typical process load mix, hence the need for a better algorithm to fill all requirements and to smooth out the workload over the available resources. Also, context switching between different processes has a price, and we must weigh the context-switching overhead of changing processes against the benefits of more complex scheduling algorithms.
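The Round Robin behavior described above can be sketched as follows. This is an illustrative simulation only, not code from the listing on the compact disc; the function and parameter names are assumptions, and context-switch overhead is deliberately ignored for clarity.

```python
from collections import deque

def round_robin(processes, quantum):
    """Simulate round-robin scheduling.

    processes: list of (name, burst_time) pairs in arrival order.
    Returns a dict mapping each process name to its completion time,
    assuming a fixed quantum and zero context-switch cost.
    """
    queue = deque(processes)           # ready queue
    clock = 0
    completion = {}
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)  # run for a quantum or until done
        clock += run
        if remaining > run:
            queue.append((name, remaining - run))  # back of the queue
        else:
            completion[name] = clock   # process finished
    return completion
```

Note that with a quantum longer than every burst time, this degenerates into FIFO, which matches the observation in the text that Round Robin is only distinctive when the quantum is shorter than the average runtime.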
OS developers have created multilevel queue scheduling to meet the need for different algorithms at different times. Multilevel queue systems use several algorithms simultaneously. Each queue is assigned a priority over the next queue. The scheduler starts at the highest priority queue, implements that queue's algorithm until no runnable processes remain, and then proceeds to the next priority queue. One queue could use FIFO while another uses RR.
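The queue-scanning step of a multilevel scheduler can be sketched as below. The representation (a list of queues ordered highest priority first) and the function name are illustrative assumptions, not taken from the text.

```python
def multilevel_next(queues):
    """Pick the next process to dispatch from a multilevel queue system.

    queues: list of ready queues, index 0 being the highest priority.
    A higher-priority queue is exhausted before any lower one is
    examined. Returns (queue_index, process), or (None, None) when
    every queue is empty. Each queue applies its own algorithm
    (FIFO, RR, ...) to order its members; here we simply take the front.
    """
    for i, q in enumerate(queues):
        if q:                  # first non-empty queue wins
            return i, q[0]
    return None, None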
When implementing a conventional real-time scheduler, a problem known as priority inversion frequently arises. This occurs when a higher priority process is blocked waiting for a resource locked by a lower priority process. This problem has only partial solutions. One is to implement a priority inheritance protocol: a process that “owns” a resource runs at the priority of the highest priority process that is awaiting that resource. When the lock is released, the higher priority process becomes runnable and pre-empts the current process. This means that the algorithm selected can affect the performance of the application, and choosing the right algorithm can improve system performance. What is needed is a system for setting priorities and quanta which are more likely to be the ones needed for the types of job mixes, users and applications which are presently emerging, and which will provide higher performance and more consistent service.
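The priority inheritance rule just described reduces to a simple computation: the lock holder runs at the best priority among its own base priority and those of all its waiters. The sketch below assumes higher numbers mean higher priority and uses hypothetical names; it is not drawn from any particular OS implementation.

```python
def effective_priority(holder, waiters, base_priority):
    """Priority-inheritance sketch.

    holder: name of the process owning the lock.
    waiters: names of processes blocked on that lock.
    base_priority: dict mapping process name -> base priority
                   (larger number = higher priority).
    The holder inherits the highest priority among itself and its
    waiters, so a low-priority lock holder cannot indefinitely delay
    a high-priority waiter (the priority inversion in the text).
    """
    return max([base_priority[holder]] +
               [base_priority[w] for w in waiters])
```

When the last high-priority waiter is released, the holder would drop back to `base_priority[holder]`, which is exactly what the empty-waiters case returns.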
Whatever algorithm is used, changing the scheduling class of a server application can guarantee that the server application will run before other applications and thus improve responsiveness for that particular application. However, this does nothing to guarantee that a particular application will consume only a specified portion of the CPU load, only that the application will be treated preferentially when resources are allocated. What is needed is a way to track the historic usage of an application or group of applications and guarantee that, over a certain time period, the use of bandwidth resources is precisely that which was set initially, unless there is an available abundance of bandwidth and no such constraints need be applied.
A general-purpose OS will not function well if its scheduler is neither configurable nor modifiable. Most OS vendors have compensated for this problem in several ways. Some provide utilities to view the default scheduling classes and to change process priorities and quanta. In UNIX, these utilities have names like “nice”, “sched_setscheduler” or “priocntl”. Special privileges are generally needed to run a process at a higher priority or to change a process's scheduling class to or from the real-time class or the time-sharing class. These utilities generally manipulate the real-time dispatcher parameter table or the time-sharing dispatcher parameter table. Meanwhile, servers and computers in general have grown bigger and more powerful.
Conventional solutions have thus grown disadvantageously coarse in granularity, and they lack the precision to handle the large resources that they command. The granularity of the changes can very often make precise goal achievement impossible with these solutions. What is needed are methods that allow users increased access to, and more precise control over, the allocation of CPU resources than is provided by the conventional “nice” and “priocntl” mechanisms. What is also needed are CPU and resource bandwidth scheduling utilities that extend the concept of share allocation down to threads and processes, thereby giving users more precise control over the allocation of CPU access to these entities than the conventional “nice” and “priocntl” mechanisms provide.
Rescheduling is done to optimize resources and to increase performance for a particular job load mix, or to improve the performance of a particular application. The changes often come at the expense of other jobs in the mix, as they often take whatever resources are not being used by the preferentially treated application. As stated above, selecting the right scheduling algorithm for the particular situation is critical in satisfying competing needs and objectives. Managing resources fairly, without overly constricting requirements, is a dynamic function currently performed with static methods. What is needed is a way to improve control of the scheduler to manage bandwidth resources and to provide precise mechanisms capable of more closely tracking load requirements.
Fair Share schedulers emerged in the 1980s to provide a more equitable approach to dispensing system resources to a mix of users and groups of users. This led to several changes in the approach to Time Sharing (TS) schedulers.
While it is the norm for schedulers to use decayed CPU usage, Fair Share's application of decayed resource usage in charging was a departure from traditional approaches. When a machine is solely for in-house use, the only need for a raw (undecayed) resource consumption tally is in monitoring machine performance and throughput and in observing patterns of user behaviour. The decayed usage is also normalized by the user's shares. One might view this as making the machine relatively cheaper to users with more shares. In essence, Fair Share schedulers attempt to keep the actual machine share, defined by normalized usage, the same as the machine entitlement, defined by shares. From the user's point of view, Fair Share gives decreased response to those who have utilized more than their fair share of resources. Thus, users see that as their normalized usage increases, their response becomes worse. (This assumes allowance is made for machine load.)

This approach contrasts with conventional charging and scheduling systems that schedule processes equally. In the fixed-budget model, users who consume their fair share, by emptying their budgets, get no resources, even if resources are available. In the extreme case, there may be no users at all, because everyone who wants to use the machine has an empty budget. For an in-house machine, this does not make sense. Worse still, this conventional method can generate substantial administrative overhead as users seek extra allocations. The number of shares allocated to a user is, essentially, an administrative decision. However, in a situation where independent organizations share a machine, the shares that should be allocated to individual users depend both upon the entitlement of their organization and upon the individual's entitlement within the organization.
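The contrast between a raw tally and Fair Share's decayed, share-normalized usage can be made concrete with a short sketch. The half-life parameterization and function names below are illustrative assumptions, not Fair Share's actual formulas.

```python
def decayed_usage(events, half_life):
    """Exponentially decayed resource-usage tally.

    events: list of (time, cost) pairs of resource charges.
    half_life: time over which an old charge loses half its weight
               (an assumed parameterization for illustration).
    Returns the decayed total as seen at the time of the last event,
    so recent consumption counts more than old consumption.
    """
    if not events:
        return 0.0
    now = max(t for t, _ in events)
    return sum(cost * 0.5 ** ((now - t) / half_life)
               for t, cost in events)

def normalized_usage(usage, shares):
    """Fair Share normalizes decayed usage by the user's shares,
    making the machine 'relatively cheaper' to users with more shares."""
    return usage / shares
```

A raw tally of the same events would simply sum the costs; the decayed version discounts old charges, and dividing by shares means two users with equal decayed usage but different share holdings see different normalized usage, and hence different response.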
Additionally, charges in Fair Share (FS) are defined by the relative costs of different resources. For example, FS associates one charge with memory occupancy, another with system calls, another with CPU use, and so on. This is another difference between FS and conventional schedulers, which define a process's scheduling priority based only on the process's consumption of CPU time. In FS, CPU scheduling priority is affected by total resource consumption.
There are three types of activity at the process level:
activity associated with the activation of a new process;
the regular and frequent adjustment of the priority of the current process; and
the regular, but less frequent, decaying of the priorities of all processes.
The first activity occurs when a process relinquishes control of the CPU, when the active process is interrupted for some reason, and at the regular times that the scheduler pre-empts the currently active process to hand control to the highest priority process that is ready to run. Next is the adjustment to the priority of the current process, which defines the resolution of the scheduler. This ensures that the current process's CPU use decreases (worsens) its priority. Finally, there is the regular decaying of all process priorities, which must be done frequently compared to the user-level scheduler, but at a larger time interval than the scheduler's resolution.
At the finest resolution of the scheduler, the current process has its priority increased by the usage and active process count of the user who owns the process. Typically, schedulers increase the priority by a constant. Intuitively, one might view the difference between FS and typical schedulers as follows:
A typical scheduler adjusts the priority of the current process by pushing it down the queue of processes by a constant amount. In contrast, FS pushes the current process down the queue by an amount proportional to the usage and number of active processes of the process's owner, and inversely proportional to the square of that user's shares. Thus, processes belonging to higher usage (more active) users are pushed further down the queue than processes belonging to lower usage (less active) users. This means that a process belonging to a user with high usage takes longer to drift back up to the front of the queue: its priority needs longer to decay to the point at which it is the lowest.
FS also needed users to be able to work at a rate proportional to their shares. This means that the charges they incur must be allowed to increase in proportion to the square of the shares (which gives a derivative, or rate of work done, proportional to the shares). This static formula approach also takes account of the number of active processes (processes on the priority queue) of the user who owns the current process. This is necessary because a priority increment involving just usage and shares would push a single process down the queue far enough to ensure that the user gets no more than their fair share. If the user has more than one active process, however, FS needs to penalize each of those processes to ensure that the user's share is spread between them, and it does this by multiplying the priority increment by the active process count. This is the crux of the Share mechanism for making long term usage, over all resources that attract charges, affect the user's response and rate of work.
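The increment described in the preceding two paragraphs — proportional to usage and active process count, inversely proportional to the square of the shares — can be written directly. The function name and the tuning constant `k` are illustrative assumptions.

```python
def fs_priority_increment(usage, active_processes, shares, k=1.0):
    """Fair Share style priority increment for the current process.

    usage: the owning user's (decayed) resource usage.
    active_processes: that user's count of processes on the priority
        queue; multiplying by it spreads the user's share across all
        of the user's processes.
    shares: the user's share allocation; dividing by shares**2 lets
        charges grow with shares**2 while the rate of work done grows
        with shares, as the text explains.
    k: illustrative tuning constant.
    A larger increment pushes the process further down the queue.
    """
    return k * usage * active_processes / (shares ** 2)
```

A heavy user's process is pushed further down the queue than a light user's with the same shares, and doubling a user's shares quarters the increment for the same usage.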
Depending on the priority queue length, FS process priorities can be small integers and so cannot be used directly. Such process priorities need to be normalized into a priority range that is appropriate for real process priorities. In addition, where the range of priority values is quite small, FS must ensure that the normalization procedure does not allow a single very large Share priority value to reduce all other normalized priorities to zero. To avoid this, FS defines a bound on the Share priority, which is calculated in the process-level scheduler. The FS priority bound does, somewhat unfairly, favor very heavy users. However, heavy users still suffer the effects of their slowly decaying large usage, and they are treated more severely than everyone else.
Fair Share schedulers try to apply “fairness” by assigning priorities based on the historical usage of a process. The problem is that these schedulers employ less than robust methods in using the measured historical usage and therefore err considerably in setting process priorities. What is therefore needed is a resource usage rate measurement model that can estimate future usage and compare it directly with entitlements based on reservations and shares.
Recent market changes, advances in general computational devices and growth in market applications have also created the following needs which are not adequately met by conventional schedulers:
Guarantees must be provided to consumers when it is desired to ensure that resource use is fairly shared among clients according to the level of service that they have purchased.
New wireless applications and increased use of time based and bandwidth resources demand a greater level of system control over bandwidth resource allocation.
Therefore, what is needed is an improved system and method for balancing the foregoing additional needs in a scheduling process for managing resource usage. Preferably, such a scheduling system would allow the sharing of system bandwidth type resources on modern computer servers in new market applications which demand dedicated resources without robbing other system users. What is also needed is an increased level of system control over resource allocation such that the increased granularity of control can provide a more equitable distribution of designated bandwidth resources to processes across users and groups of users. Preferably, such implementations will be well defined and have a well behaved mathematical basis.
Although computing devices are becoming more powerful and need to be managed differently between user groups, as with most operating systems and schedulers, bandwidth resource waste occurs if the job mix is not in tune with the scheduling algorithm. What is needed is a way to set priorities to satisfy demand in a manner that will not starve other processes and will not waste CPU cycles or other bandwidth resources.
With the increase in computing power also come shrinking hardware footprints along with the need to support larger, more complex applications and application combinations: for example, the merging of PDAs with cell phones, GPS locators, music, radio, wireless applications, TDMA/CDMA, or the like. These trends demand more efficient usage of sophisticated applications working in tandem. What is needed is a method to generate more usage out of a computing system with various bandwidth resources, thereby providing a way for a smaller, more compact computing system to meet the demand equivalent of a larger, less compact computing system employing a less efficient scheduler or loosely coupled schedulers.
An aspect of the invention provides an implementation of a digital device resource scheduler which determines the relative priority to be used when allocating resource service time to resource using entities.
The priorities produced by the scheduler can be used to increase the usage of entities that are receiving less than their entitlement and to decrease the usage of entities that are getting more than their entitlement. Provided that entities have sufficient demand to use their entitlement, the scheduler distributes resources fairly among entities according to their entitlement. An entity's usage may be calculated using Kalman filter techniques. Separate evaluations may be made depending on whether the entity has been receiving units of the resource; in this case, the decay and growth terms are totally independent of each other. The only information that may need to be retained for each entity is the value of the usage metric for that entity and the time at which it was last updated. Shares may be allocated to owners of the entities as well as to the entities themselves. Thus, an entity's effective shares are adjusted by the ratio of the number of shares held by the owner that are allocated to that entity, to the total number of shares held by that owner. The shares allocated to each individual entity are then effectively drawn from a separate pool of shares for each owner. Shares may be allocated to groups of owners, with or without the allocation of shares to individual owners.
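The per-entity state described above — only a usage metric and a last-update time, with independent decay and growth paths — can be sketched as a small estimator. This is a simplified exponential-smoothing illustration with assumed names and rate constants; the text itself refers to Kalman filter techniques, of which this is not a full implementation.

```python
class UsageEstimator:
    """Per-entity usage estimate with independent decay and growth.

    Retains exactly the state the text calls for: the usage metric
    and the time of the last update. Decay applies while the entity
    receives no units of the resource; growth applies when it does.
    The rate constants are illustrative assumptions.
    """

    def __init__(self, decay_rate=0.1, growth_rate=0.5):
        self.decay_rate = decay_rate    # fraction lost per idle time unit
        self.growth_rate = growth_rate  # usage added per unit received
        self.usage = 0.0
        self.last_update = 0.0

    def idle(self, now):
        """No units received since last update: usage decays with time."""
        dt = now - self.last_update
        self.usage *= (1 - self.decay_rate) ** dt
        self.last_update = now

    def received(self, now, units):
        """Entity received `units` of the resource: usage grows."""
        self.usage += self.growth_rate * units
        self.last_update = now
```

The scheduler could then compare `usage` against the entity's entitlement (from reservations and shares) to raise or lower its priority, giving the active feedback loop the invention describes.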
An aspect of the invention also provides the ability to dynamically monitor and control a computer's critical system bandwidth resources such as CPU, Real and Virtual Memory, Bandwidth Allocation, or the like. Unlike passive systems, the invention's scheduler continually and automatically compensates for changes in system usage or misuse in real time. An aspect of the invention directly alters the behavior of the Operating System to actively manage and change, rather than passively monitor and report, the precise allocation of CPU and other resources to defined users, collections of users, and applications.