This invention relates to a method and apparatus for scheduling access to resources by activities.
There have been many scheduling mechanisms proposed that are based on an economic paradigm [3, 4]. Some of these incorporate the idea of using an auction to determine the next activity to be granted access to a resource. However, these proposals have dealt only with the high-level problem of allocating access to resources, and not with the low-level details of how those allocation decisions are carried out. In particular, these scheduling mechanisms have not been coupled with an efficient mechanism, i.e., one that does not incur significant overheads due to frequent communication and transfer of control between the activities contending for the resource and the arbitrator of the resource, for maintaining the relationships between activities"" bids and for determining which activity is the highest bidder.
Many of the previous mechanisms that have tried to incorporate an economic model have been concerned with distributed systems in which the high cost of communication dwarf any over head involved in scheduling unlike in parallel, tightly coupled systems, where scheduling overhead can be relatively more important [5,6,7].
Processor scheduling is a central and well-studied problem in computer system design. Today, multithreaded applications are common. A thread is an independently schedulable sequence of operations. A process is a collection of one or more threads of control executing within a single address space. Threads provide an abstraction that allows a programmer to express parallelism that can be exploited to enhance application performance. Most current operating systems support multithreaded applications in such a fashion that the operating system treats each thread as a separately scheduled entity. When threads are supported in this fashion, it is simple to enforce global scheduling policies to ensure that when a single thread blocks on some request to the operating system (e.g. a system call or page fault), the rest of the application is unaffected. Treating each thread as a separately scheduled activity will also allow different threads of the same process to have different priorities. Unfortunately, such threads are inefficient because they require frequent transfer of control between the application and operating system, and because they require that the operating system maintain a state for each thread.
User-level thread schedulers manage the threads of a particular process in a fashion that is local to the process; that is, the operating system is not involved in most scheduling decisions and maintains no per-thread state. This has the advantage of greatly reducing the overhead of supporting multiple threads. However, most user-level thread packages provide no interaction between the threads and thread scheduler and the operating system. As a result, any communication with the operating system (e.g., a system call or page-fault) blocks the entire process, even though there may be threads in the process that could make use of the processor while one thread blocks on a system call. Also, since the operating system is unaware of the user-level threads, from the operating system perspective all threads within a process have the same priority, and it is not possible to manage priorities across different applications.
Two user-level thread scheduling mechanisms that do provide operating system interaction, and in which a thread that blocks in the kernel does not block the entire process containing the thread, are that of the Psyche operating system [1] and the Scheduler Activations mechanism [2]. Psyche employs shared memory for communication between the kernel and user-level processes, but this memory is not used in Psyche to coordinate the scheduling of threads belonging to different user level processes. Psyche allows a user-level process to yield a processor when it has no more ready threads. Psyche arbitrates priorities among user-level processes, but Psyche cannot arbitrate priorities among the threads of different processes.
As does Psyche, Scheduler Activations arbitrate priorities within a user-level process. The mechanism allows threads to migrate between processors in a multiprocessor system to ensure the execution of the highest priority threads of a user-level process spanning multiple processors. The authors comment parenthetically that their system is capable of arbitrating between threads with associated globally meaningful priorities, but do not discuss a mechanism for achieving this.
What is needed in a user-level thread scheduling system is a model and mechanism that enable a user-level scheduler to make an efficient local scheduling decision that is globally the correct decision, that is, the same decision would be made if there was access to all priority information.
An exclusive resource such as a computer""s central processing unit is a resource that can be used by only one activity at a time. In a system with multiple concurrent activities, access to an exclusive resource must be arbitrated between activities that require access to the resource. This invention applies to systems in which activities place potentially time-varying numeric values (these values, whether time varying or constant, shall be referred to as bids) on the importance of accessing the resource, and in which at any given time, some policy is used to arbitrate between the competing bids to assign access to the resource. In addition, this invention applies to systems in which activities are to be charged for access to the resource, where the amount an activity is charged depends on both the importance it places on accessing the resource, and on the demand on that resource by other activities. These charges do not necessarily correspond to any real monetary value (although they may); in general, they are a means of enforcing a policy for allocation of rights to access the resource.
This invention is a mechanism by which activities (possibly including a designated scheduler activity) are able to cooperate to make scheduling decisions while exchanging a minimal amount of information in an efficient manner. Most decisions made by an activity are based only on local information, that is, information available to the activity without having to initiate communication with any other activity. The key piece of information, the xe2x80x9cnext-bidxe2x80x9d, identifies the highest bid for a resource by any activity other than the activity currently using the resource. The latter activity, the one using the resource, will also be referred to as the xe2x80x9ccurrently active activity.xe2x80x9d When the need for the resource changes, the next-bid is used by a currently active activity to determine whether it should release the resource to another activity. Also, in a system that charges activities for their resource usage, the activity that accesses the resource is charged according to the next bid (as in a second-price auction) for each unit of time that it accesses the resource; that is, the value of the time used is assessed according to the bid of the activity waiting for the resource. The next bid is posted in memory available to the activity using the resource, the facility charging for the resource, and the entity scheduling the resource, allowing this information to be accessed without incurring any communication costs.
When an activity""s valuation of the resource changes, the activity compares its new bid to the next bid. This can be done locally, without incurring the overhead of communicating with any other activity or scheduler. As long as the bid of the activity that has current access to the resource is at least as great as the next bid, no rescheduling action is needed. The currently scheduled activity continues to use the resource, and in a system that charges activities for their resource usage, is charged as described. For example, in a computer system, where high priority processes are assigned higher numbers, then each of the processes is charged based on the priority of the second highest bid multiplied by the time the process has had access to the resource. Only when the activity""s bid falls below the next bid does the activity need to relinquish the resource. Unless it does so, it will be charged according to the next bid in a system that employs a charging scheme, which may limit its ability to access the resource at a later time, when its need to access the resource truly is greater than of any other activity. Similarly, when an activity that is not currently accessing the resource changes its bid (or its bid is changed on its behalf by some other entity), the new bid can be compared to the bid that is posted by the currently scheduled activity (currently active). Only when the new bid exceeds that of the activity accessing the resource does that activity need to be preempted.