1. Technical Field of the Invention
This invention pertains to a system scheduler. More particularly, it relates to scheduling to users the resources to which they are entitled in a dispatch driven multiprocessor system.
2. Background Art
Schedulers are provided in operating systems to control the amount of system resources consumed by users. Co-pending U.S. patent application, Ser. No. 08/252,864 filed Jun. 2, 1994 by G. A. Davison describes such a system.
Often in a computer system, many programs or users require execution concurrently. There may be one or more processors available to execute the multiple programs. Any one processor can only execute one program or transaction at any one time. If two or more programs request execution at the same time, and there is only one processor or fewer processors than the number of programs requiring execution, the operating system must determine which program can use the one or each processor at any one time. Typically this decision is based on a relative priority of each of the programs. Typically also, programs which are in a wait state, such as those waiting for a slow I/O operation to complete, are xe2x80x9csuspendedxe2x80x9d and therefore not eligible to compete for the processor regardless of their priority.
In addition to deciding which program can utilize the one or each processor at a particular time, the operating system must also determine how long the selected program can utilize the respective processor after it begins to execute so that other programs of equal (or perhaps lower) priority can get their turn. There are several known techniques for determining how long each program should be permitted to utilize a processor after it begins to execute. In the simplest technique, the highest priority program maintains its use of the processor until completion (or suspension). However, this may permit this program to xe2x80x9chogxe2x80x9d the processor, and xe2x80x9cstarvexe2x80x9d the other programs. In another known technique, each program which is selected for execution is given a share or xe2x80x9cslicexe2x80x9d of the total processor time. The shares of the programs can be equal or preferably weighted based on various factors. One known technique to weight the shares is as follows. Some of programs are defined to command an xe2x80x9cabsolutexe2x80x9d or definite share of the total processor time, and others are defined to command a xe2x80x9crelativexe2x80x9d share of the remaining processor time. The absolute shares are xe2x80x9ctaken off the topxe2x80x9d and the relative shares divide the remainder in proportion to their relative shares. For example, a first program is defined with a forty percent absolute share, a second program is defined with a one hundred relative share and a third program is defined with a three hundred relative share. In this scenario, the first program will command forty percent of the total processor time, the second program will command fifteen percent of the total processor time and the third program will command forty five percent of the total processor time.
A program may not need its entire absolute or relative share. A known technique to allocate the excess in this absolute/relative share environment is to give all the excess to the program with the highest absolute or relative share. If the excess were a small percentage of the total processor time, this known technique would suffice. However, in many cases, the excess is substantial, such as one half of the total processor time, and it may not be optimum to allocate all the excess to one program.
The relative share of the system that a user is entitled to is based on a proportion of the priorities of all the relative share users who are presently competing for system resources. However, all of the system resources are not necessarily available to the relative share users. If there are absolute share users competing for resources, those absolute share users are given their part of the system first before the remaining share of the system is divided proportionally among the relative share users. These relationships may be represented, generally, as follows:
remaining share=100% of the systemxe2x88x92total of all absolute shares
Then the remaining share is divided proportionally among the relative share users:
relative user""s share=remaining share*(user""s relative share/total share of all relative users)
A minor time slice is the amount of processor time that a given user is allowed to have without interruption for reexamination by the scheduler. The size of a minor time slice is dependent on machine speed and possibly customer setting. When a user is dispatched, a CPU timer is set with the amount of time in a minor time slice. This timer decrements and presents an interrupt when it runs out, thus signaling to the scheduler that the user""s minor time slice has ended.
A limited user, i.e., xe2x80x98limithardxe2x80x99 user, is one for which a limit value is defined representing the maximum amount of processor resource to which it is entitled. This limit value represents a maximum usage share, and is used to calculate the dispatch priority for the user. Typically, a limit list is provided on which are placed those limithard users which exceed or are about to exceed their maximum allowed share of the processor resource. While a user is on the limit list (or otherwise characterized by a limit status), resources are not dispatched for its use.
An artificial time-of-day (ATOD) clock is used in dispatching user work. The ATOD value increases at the same rate as the machine""s time-of-day (TOD) clock but only during the time that user work (as distinguished from non-user work, such as operating system work) is being done by the machine. In current systems which rely, at least in part, on ATOD to schedule systems resources, a limited share user may not be allowed to achieve its maximum usage share even when CPU resource is available. Apparently, in a low load situation with idle processor resource time, the artificial time-of-day (ATOD) value does not move fast enough during the idle time. Therefore, dispatching priorities of those limithard users that are running move faster than they should relative to ATOD and they enter the limit list (that is, given limit status) prematurely. Furthermore, users may be unnecessarily retained on the limit list because only active (non-idle) processors monitor the list. In a low load situation (a time when there is not enough work to keep all the processors busy), fewer processors are active so the limit list is monitored less frequently. This creates an increase in the time between when a user should leave the limit list and when a processor detects that it should leave. Heretofore, there has been no mechanism to remove this user from the limit list during the time that a processor is in the active wait state. In this situation, there is time when processors have no user work to do and are therefore looping in the active wait mechanism, and a user on the limit list remains there until something happens on another processor that would cause the limit list to be checked.
Thus, when a current system such as that described in Davison, supra, has very few users on it, or when it is not busy, its scheduler gives users less CPU time than their hard limits when they should be able to have usage up to the hard limit. In a case where there are very few users on the computer system, oddly enough the user gets noticeably less CPU time, up to 20% less, than entitled. This may result, particularly where CPU time is sold, in complaints from users that they are not getting all of the CPU time for which they are paying. Conversely, CPU time may not be charged when it could be.
It is an object of the invention to provide a scheduler system and method to facilitate allowing a user access to system resources closely approaching or equal to its hard limit in a low load situation.
It is a further object of the invention to provide an improved scheduler which facilitates optimal charging for system resources and reduces user discomfort over the amount of the charges or the availability of such resources.
It is a further object of the invention to provide a system and method whereby a monitor built into an active wait routine is used by system scheduling algorithms to determine a user""s priority and to determine which user should be allowed to run next.
It is a further object of the invention to provide a system and method for scheduling system resources to users having limited access by placing such users on a limit list not too soon, and removing them from the limit list not too late.
It is a further object of the invention to provide a system and method for scheduling system resources to users having limited access by enabling scheduling of resources not too late and inhibiting scheduling of resources not too soon.
System resources are scheduled for use by a user by maintaining a first clock; maintaining a second clock; during a dispatch process, if there is work to do, scheduling use of system resources to the user in dispatch priority order with other users only when the user is not in wait status; otherwise advancing the first clock by user time; and advancing the second clock by user time and wait time; and during a wait process, if there is work to do, calling the dispatch process; otherwise, timing wait time; and if there is a user in wait status upon the wait time exceeding a predetermined value, incrementing the first clock by the wait time and calling the dispatch process.
Other features and advantages of this invention will become apparent from the following detailed description of the presently preferred embodiment of the invention, taken in conjunction with the accompanying drawings.