The present invention relates generally to computer systems, and in particular to time-share scheduling performed by computer operating systems.
A variety of time-share scheduling schemes have been employed in computer systems to schedule time-share jobs competing for execution time on one or more central processing units (CPUs). Typically, these time-share scheduling schemes alter or adjust priorities in an attempt to achieve fairness. For example, in one such time-share scheduling scheme, referred to as a priority aging algorithm, a particular job starts with a priority in a middle range of priority values. As a selected job executes on a CPU, the selected job's priority is decreased to make the job less attractive for future execution. Conversely, as a non-serviced job waits for CPU execution time, the priority of the non-serviced job improves to make the non-serviced job more attractive for future execution.
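The priority aging behavior described above can be illustrated with a minimal sketch. The function names, the priority range of 0 to 100, the step size of 1, and the convention that a larger number means a more attractive job are all assumptions made for illustration; the text does not specify them.

```python
def age_priorities(priorities, running, step=1, lowest=0, highest=100):
    """Return priorities after one interval (hypothetical range and step).

    The job that ran becomes less attractive; every waiting job becomes
    more attractive. Larger numbers are assumed to mean better priority.
    """
    aged = {}
    for job, priority in priorities.items():
        if job == running:
            aged[job] = max(lowest, priority - step)   # ran: priority decays
        else:
            aged[job] = min(highest, priority + step)  # waited: priority improves
    return aged


def pick_next(priorities):
    """Select the job with the best (largest) priority value."""
    return max(priorities, key=priorities.get)
```

Starting two jobs at a mid-range value and letting one of them run for an interval makes the other job the next one selected.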
The priority aging algorithm operates well so long as the algorithm is confined to being based strictly on the priorities. Operating systems, however, typically take advantage of conditions such as cache affinity, memory affinity, and interactive response of certain jobs. These other conditions are most likely not related to the priorities assigned by the aging algorithm. Thus, to respond to these other conditions, the aging algorithm is violated, for example, to take advantage of a warm cache being used by a low priority job, which would otherwise cool down if the low priority job stayed away from the warm cache.
In order to take advantage of conditions such as cache affinity, memory affinity, and interactive response, operating systems operating with an aging algorithm typically reset a job's priority to a relatively high priority value to assure that the job gets to run on the CPU. Because of the high priority assigned to the job, the job typically ends up with an overabundance or unfair amount of CPU time.
Another previous time-share scheduling scheme is based on a market model, where each job earns some amount of CPU time which is placed in a bank account. An auction is held and each job pays so much for CPU time or I/O time from the bank account. In this system, the job which bids the most gets to run on the CPU. This economic model has proved to be far too complex to practically implement in a commercial computer system. Such complexities include needing to calculate how much a particular job wants to spend, how much the job should reserve for future bidding, and other such complex factors needed to resolve the auction issues.
Another previous time-share scheduling scheme is a credit time-share algorithm, which involves a credit for CPU time based on a multiprocessor CPU system. A job is determined to be entitled to a selected number of CPUs of the multiprocessor system based on the job's priority. If the job uses only a portion of the selected number of CPUs entitled to the job, the job acquires the unused portion as credits it can spend at a later time to get additional CPUs for execution.
Yet another previous time-share scheduling scheme is referred to as lottery scheduling. Lottery scheduling represents a randomized resource allocation mechanism. Resource rights of jobs are represented by lottery tickets. Each allocation of CPU (or I/O) resources is determined by holding a lottery. The CPU resource is granted to the job with the winning ticket. In this way, CPU resources should be allocated to competing jobs in proportion to the number of tickets held by each job. A job is awarded some number of tickets based on the job's priority. Nevertheless, just as with the time-share aging algorithm, lottery scheduling relies on a mathematical model which is necessarily broken when the operating system schedules jobs for execution based on factors or conditions other than those associated with the statistical model, such as cache affinity, memory affinity, and interactive response.
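The lottery mechanism described above can be sketched as a weighted random draw. This is an illustration only: the function name, the ticket counts, and the use of a seeded random number generator are assumptions, not details taken from any particular lottery scheduler.

```python
import random


def hold_lottery(tickets, rng):
    """Pick one job at random, weighted by the number of tickets it holds.

    `tickets` maps a job name to its ticket count (hypothetical values);
    `rng` is a random.Random instance so that runs can be reproduced.
    """
    names = list(tickets)
    counts = [tickets[name] for name in names]
    return rng.choices(names, weights=counts, k=1)[0]


def run_lotteries(tickets, draws, seed=0):
    """Hold `draws` lotteries and count how often each job wins."""
    rng = random.Random(seed)
    wins = {name: 0 for name in tickets}
    for _ in range(draws):
        wins[hold_lottery(tickets, rng)] += 1
    return wins
```

Over many draws, a job holding three times as many tickets as another should win roughly three times as often, which is the proportionality the statistical model depends on.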
Therefore, for the reasons stated above, and for other reasons presented in greater detail in the Description of the Preferred Embodiments section of the present specification, there is a need for a time-share scheduling scheme which permits factors or conditions such as cache affinity, memory affinity, and interactive response to be used for scheduling without breaking the basic mathematical model of the scheduling scheme. In addition, there is a need for an improved time-share scheduling scheme which provides a precise method of achieving an accurate and fair assignment of time-share jobs over a long-term period, and at the same time, provides significant leeway in short-term scheduling decisions.
The present invention provides a method and a computer system having a time-share scheduling mechanism for scheduling multiple jobs in a computer system. Earnings are apportioned to each of certain jobs of the multiple jobs based on time each of the certain jobs spent in a queue requesting execution on a processor in the computer system and time each of the certain jobs ran on the processor. At the end of a time slice, a job is selected for execution on the processor based on earnings apportioned to each of the certain jobs.
In one embodiment of the present invention, earnings are allocated at times defined as scheduler ticks. At scheduler ticks, earnings are allocated to each of the certain jobs based on time each of the certain jobs spent in a queue requesting execution on a processor in the computer system. Earnings are subtracted from at least one of the certain jobs as the at least one of the certain jobs runs on the processor. Earnings are preferably allocated, at scheduler ticks, to each of the certain jobs proportional to priority weights assigned to each of the certain jobs.
In one embodiment of the present invention, a balance is calculated, at the start of each scheduler tick. The balance represents an amount of time available for running jobs in the processor over a previous period between scheduler ticks. At the end of scheduler ticks, earnings are allocated to each of the certain jobs based on the calculated balance and time each of the certain jobs spent in a queue requesting execution on a processor in the computer system over the previous period between scheduler ticks. The balance is preferably calculated, at the start of scheduler ticks, by summing all total earnings of the certain jobs. After the earnings are allocated, at the end of scheduler ticks, based on the calculated balance, the summation of all total earnings of the certain jobs is preferably substantially equal to zero.
In one embodiment of the present invention, the earnings allocated to a given job of the certain jobs are calculated to be substantially equal to the negative of the balance, multiplied by the given job's time spent in a queue requesting execution on a processor in the computer system over the previous period between scheduler ticks, multiplied by the given job's priority weight, and divided by the summation, over all of the certain jobs, of each job's time spent in a queue requesting execution on a processor in the computer system over the previous period between scheduler ticks multiplied by that job's corresponding priority weight.
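The balance calculation and the allocation formula of the two preceding embodiments can be sketched as follows. The `Job` class, its field names, and the function name are hypothetical; the sketch assumes that each job's earnings have already been reduced by the time it ran, so that the balance (the sum of all earnings) is zero or negative at the start of the tick.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    weight: float        # priority weight
    queued_time: float   # time spent queued for the processor since the last tick
    earnings: float = 0.0


def allocate_earnings(jobs):
    """Distribute the balance among jobs at a scheduler tick.

    Each job receives -balance * (queued_time * weight) divided by the sum of
    queued_time * weight over all jobs. Returns the balance that was found.
    """
    balance = sum(job.earnings for job in jobs)
    weighted_total = sum(job.queued_time * job.weight for job in jobs)
    if weighted_total == 0:
        return balance  # no job was queued, so there is nothing to apportion
    for job in jobs:
        share = job.queued_time * job.weight / weighted_total
        job.earnings += -balance * share
    return balance
```

For example, with job A at earnings -6 and weight 1, job B at earnings -4 and weight 3, and both queued for 10 time units, the balance is -10; A is allocated 2.5 and B is allocated 7.5, leaving earnings of -3.5 and 3.5, which sum to zero.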
In one embodiment of the present invention, earnings are preferably apportioned only to certain jobs which are runnable on the processor. In another embodiment of the present invention, earnings are apportioned to all jobs in the computer system. In addition, the earnings of each of the certain jobs are preferably limited to a maximum earnings value to avoid unbounded accumulation of time.
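The limit on accumulated earnings can be sketched as a simple clamp applied after each allocation. The function name and the example limit are assumptions made for illustration; the text does not state how the maximum earnings value is chosen.

```python
def cap_earnings(earnings, max_earnings):
    """Return a copy of `earnings` with each value limited to `max_earnings`.

    `earnings` maps a job name to its accumulated earnings. Negative values
    (jobs that have run more than their share) are left unchanged.
    """
    return {job: min(value, max_earnings) for job, value in earnings.items()}
```

A job that has waited for a long time therefore cannot bank more than the chosen limit, while jobs below the limit are unaffected.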
The time-share scheduling mechanism of the present invention preferably schedules jobs for execution on the processor based on conditions other than the selected job's earnings, such as cache affinity of the selected job, the memory affinity of the selected job, and the interactive response of the selected job. The earnings are preferably not altered to schedule the selected job, but are decreased as the job runs on the processor.
The earning-based time-share scheduler of the present invention provides long-term stability by assuring that each job gets its fixed share of the CPU over time. Many conditions other than earnings of a job are taken into account when selecting a process to run on the CPU, such as cache affinity, memory affinity, and interactive response. While such short-term decisions may prevent a job from being run, the short-term decisions do not affect a job's time-accumulation rate, which eventually dominates and forces the scheduler to give the job its fair share of the CPU. Therefore, the earning-based time-share scheduling mechanism of the present invention permits considerable flexibility in short-term scheduling decisions in order to improve throughput or response time, while maintaining fairness over the long term.
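The interplay between short-term decisions and long-term fairness can be illustrated with a small simulation. Everything in it is a simplifying assumption: a single processor, one time unit per scheduler tick, every job queued for the whole tick (so queued time cancels out of the allocation), and a hypothetical `slack` parameter standing in for an affinity preference that keeps the last-run job on the processor unless another job's earnings exceed its own by more than `slack`.

```python
def simulate(weights, ticks, slack=0.0):
    """Simulate an earnings-based scheduler and return time run per job.

    `weights` maps a job name to its priority weight. `slack` is a
    hypothetical stand-in for a short-term affinity preference.
    """
    earnings = {name: 0.0 for name in weights}
    run_time = {name: 0 for name in weights}
    total_weight = sum(weights.values())
    last = None
    for _ in range(ticks):
        best = max(earnings, key=earnings.get)
        # Short-term decision: stay on the last job while it is close enough.
        if last is not None and earnings[best] - earnings[last] <= slack:
            chosen = last
        else:
            chosen = best
        # Running costs earnings; selection itself never alters them.
        earnings[chosen] -= 1.0
        run_time[chosen] += 1
        # Scheduler tick: redistribute the balance in proportion to weight.
        balance = sum(earnings.values())
        for name, weight in weights.items():
            earnings[name] += -balance * weight / total_weight
        last = chosen
    return run_time
```

With weights of 1 and 3, the lighter job receives about a quarter of the processor time whether `slack` is 0 or 2. A larger `slack` lengthens the runs of consecutive ticks given to one job, but each job's earnings stay bounded, so the long-term shares still follow the weights.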