Processor set (PSET) arrangements have been employed to manage processor resources in a multi-processor computer system. In such a system, the processors may be partitioned into various processor sets (PSETs), each of which may have any number of processors. Applications executing on the system are then assigned to specific PSETs. Since processors in a PSET do not share their processing resources with processors in another PSET, the use of PSETs makes it possible to guarantee an application or a set of applications a given level of processor resources.
To facilitate discussion, FIG. 1A shows a plurality of processors 102, 104, 106, 108, 110, 112, 114 and 116. In the example of FIG. 1A, processors 102, 104, and 106 are partitioned in a PSET 120, processors 108 and 110 are partitioned in a PSET 122, processor 112 is partitioned in a PSET 124, and processors 114 and 116 are partitioned in a PSET 126. An application 140 assigned to execute in PSET 120 may employ the processing resources of processors 102, 104, and 106 but would not be able to have its threads executed on processor 112 of PSET 124. In this manner, an application 142 assigned to execute in PSET 124 can be assured that the processing resources of processor 112 therein would not be taken up by applications assigned to execute in other PSETs.
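By way of illustration only, the partitioning of FIG. 1A may be modeled as a mapping from each PSET to its processors. The following is a minimal sketch, not an actual implementation; the names `PSETS`, `APP_PSET`, and `may_run_on` are hypothetical and are introduced solely for this example.

```python
# Hypothetical model of the FIG. 1A partitioning: PSET number -> processors.
PSETS = {
    120: {102, 104, 106},
    122: {108, 110},
    124: {112},
    126: {114, 116},
}

# Application reference number -> PSET it is assigned to (per the example).
APP_PSET = {140: 120, 142: 124}


def may_run_on(app: int, processor: int) -> bool:
    """Return True if threads of the application may execute on the processor."""
    return processor in PSETS[APP_PSET[app]]
```

Under this model, application 140 may use processors 102, 104, and 106, while a query for processor 112 returns False because processor 112 belongs to PSET 124.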
However, when it comes to scheduling, the policies of the thread launcher, the thread balancer, and the thread stealer are still applied on a system-wide basis, i.e., across PSET boundaries. To elaborate, in a computer system, a scheduler subsystem is often employed to schedule threads for execution on the various processors. One major function of the scheduler subsystem is to ensure an even distribution of work among the processors so that one processor is not overloaded while others are idle.
In a modern operating system, such as the HP-UX® operating system by the Hewlett-Packard Company of Palo Alto, Calif., as well as in many modern Unix and Linux operating systems, the scheduler subsystem may include three components: the thread launcher, the thread balancer, and the thread stealer.
With reference to FIG. 1B, kernel 152 may include, in addition to other subsystems such as virtual memory subsystem 154, I/O subsystem 156, file subsystem 158, networking subsystem 160, and a process management subsystem 162, a scheduler subsystem 164. As shown, scheduler subsystem 164 includes three components: a thread launcher 170, a thread balancer 172, and a thread stealer 174. These three components are coupled to a thread dispatcher 188, which is responsible for placing threads onto the processors' per-processor run queues as will be discussed herein.
Thread launcher 170 represents the mechanism for launching a thread on a designated processor, e.g., when the thread is started or when the thread is restarted after having been blocked and put on a per-processor run queue (PPRQ). As is known, a per-processor run queue (PPRQ) is a priority-based queue associated with a processor. FIG. 1B shows four example PPRQs 176a, 176b, 176c, and 176d corresponding to CPUs 178a, 178b, 178c, and 178d as shown.
In the PPRQ, threads are queued up for execution by the associated processor according to the priority value of each thread. In an implementation, for example, threads are put into a priority band in the PPRQ, with threads in the same priority band being queued up on a first-come-first-serve basis. For each PPRQ, the kernel then schedules the threads therein for execution based on the priority band value.
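By way of illustration only, a PPRQ with priority bands may be sketched as one first-come-first-serve queue per band. The class name `PerProcessorRunQueue`, the number of bands, and the convention that a lower band index is served first are assumptions made for this example and are not taken from any particular kernel.

```python
from collections import deque


class PerProcessorRunQueue:
    """Toy PPRQ: one FIFO per priority band (band 0 is served first, by assumption)."""

    def __init__(self, num_bands: int = 4):
        self.bands = [deque() for _ in range(num_bands)]

    def enqueue(self, thread_id: str, band: int) -> None:
        # Threads in the same band are queued in arrival order.
        self.bands[band].append(thread_id)

    def pick_next(self):
        # Scan bands from highest priority to lowest; take the oldest thread found.
        for band in self.bands:
            if band:
                return band.popleft()
        return None  # the queue is empty, i.e., the processor is idle

    def __len__(self) -> int:
        return sum(len(band) for band in self.bands)
```

In this sketch, a thread queued in band 0 is picked ahead of earlier arrivals in band 2, while two threads in band 2 are picked in the order in which they arrived.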
To maximize performance, thread launcher 170 typically launches a thread on the least-loaded CPU. That is, thread launcher 170 instructs thread dispatcher 188 to place the thread into the PPRQ of the least-loaded CPU that it identifies. Thus, at least one piece of data calculated by thread launcher 170 relates to the least-loaded CPU ID, as shown by reference number 180.
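By way of illustration only, the launching step may be sketched as follows. The sketch assumes that load is measured by run-queue length, which is a simplification chosen for this example; the function name `launch` and the dictionary layout are hypothetical.

```python
def launch(thread_id, run_queues):
    """Place the thread on the PPRQ of the least-loaded CPU and return that CPU ID.

    run_queues maps a CPU ID to the list of threads queued on its PPRQ.
    Load is approximated by queue length (an assumption of this sketch).
    """
    least_loaded = min(run_queues, key=lambda cpu: len(run_queues[cpu]))
    run_queues[least_loaded].append(thread_id)
    return least_loaded
```

When several CPUs are equally loaded, this sketch simply picks the first one encountered; an actual launcher may break ties differently.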
Thread balancer 172 represents the mechanism for shifting threads among PPRQs of various processors. Typically, thread balancer 172 calculates the most loaded processor and the least loaded processor among the processors, and shifts one or more threads from the most loaded processor to the least loaded processor each time thread balancer 172 executes. Accordingly, at least two pieces of data calculated by thread balancer 172 relate to the most loaded CPU ID 182 and the least loaded CPU ID 184.
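By way of illustration only, one pass of the balancing step may be sketched as follows. As before, load is assumed to be run-queue length, and the function name `balance`, the choice to move a single thread per pass, and the rule that a move occurs only when the queues differ by two or more threads are assumptions of this example.

```python
def balance(run_queues):
    """Move one thread from the most-loaded PPRQ to the least-loaded PPRQ.

    Returns (thread, most_loaded_cpu, least_loaded_cpu), or None when a move
    would not narrow the gap between the two queues.
    """
    most = max(run_queues, key=lambda cpu: len(run_queues[cpu]))
    least = min(run_queues, key=lambda cpu: len(run_queues[cpu]))
    if len(run_queues[most]) - len(run_queues[least]) < 2:
        return None  # already balanced to within one thread
    thread = run_queues[most].pop()  # take the most recently queued thread
    run_queues[least].append(thread)
    return thread, most, least
```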
Thread stealer 174 represents the mechanism that allows an idle CPU (i.e., one without a thread to be executed in its own PPRQ) to “steal” a thread from another CPU. Thread stealer 174 accomplishes this by calculating the most loaded CPU and shifting a thread from the PPRQ of the most loaded CPU that it identifies to the PPRQ of the idle CPU. Thus, at least one piece of data calculated by thread stealer 174 relates to the most-loaded CPU ID. The thread stealer performs this calculation among the CPUs of the system, whose CPU IDs are kept in a CPU ID list 186.
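By way of illustration only, the stealing step may be sketched as follows. The sketch assumes that load is run-queue length and that the candidate victims are the CPUs named in a CPU ID list; the function name `steal` and its parameters are hypothetical.

```python
def steal(idle_cpu, run_queues, cpu_id_list):
    """Let an idle CPU take one thread from the most-loaded CPU in cpu_id_list.

    Returns the stolen thread, or None if the CPU is not idle or nothing
    can be stolen.
    """
    if run_queues[idle_cpu]:
        return None  # the CPU has work of its own, so it does not steal
    candidates = [cpu for cpu in cpu_id_list if cpu != idle_cpu]
    if not candidates:
        return None
    victim = max(candidates, key=lambda cpu: len(run_queues[cpu]))
    if not run_queues[victim]:
        return None  # every other CPU is idle as well
    thread = run_queues[victim].pop()
    run_queues[idle_cpu].append(thread)
    return thread
```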
In a typical operating system, thread launcher 170, thread balancer 172, and thread stealer 174 represent independently operating components. Since each may execute its own algorithm for calculating the needed data (e.g., least-loaded CPU ID 180, most-loaded CPU ID 182, least-loaded CPU ID 184, the most-loaded CPU ID among the CPUs in CPU ID list 186), and the algorithm may be executed based on data gathered at different times, each component may have a different idea about the CPUs at the time it performs its respective task. For example, thread launcher 170 may gather data at a time t1 and execute its algorithm, which results in the conclusion that the least loaded CPU 180 is CPU 178c. Thread balancer 172 may gather data at a time t2 and execute its algorithm, which results in the conclusion that the least loaded CPU 184 is a different CPU 178a. In this case, both thread launcher 170 and thread balancer 172 may operate correctly, each according to its own algorithm. Yet, by failing to coordinate (i.e., by executing their own algorithms and/or gathering system data at different times), they arrive at different calculated values.
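By way of illustration only, the discrepancy may be reproduced with two load snapshots taken at different times. The load figures below are invented for this example; only the resulting CPU IDs (178c at time t1 and 178a at time t2) follow the scenario described above.

```python
def least_loaded(snapshot):
    """Return the CPU ID with the smallest load in a snapshot."""
    return min(snapshot, key=snapshot.get)


# Hypothetical load figures (e.g., queued threads) sampled at two different times.
load_at_t1 = {"178a": 3, "178b": 2, "178c": 1, "178d": 4}  # seen by the launcher
load_at_t2 = {"178a": 0, "178b": 2, "178c": 2, "178d": 4}  # seen by the balancer

launcher_view = least_loaded(load_at_t1)
balancer_view = least_loaded(load_at_t2)
```

Both calls run the same correct function, yet `launcher_view` is "178c" and `balancer_view` is "178a" because the inputs were gathered at different times.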
The risk is increased for an installed OS that has been through a few update cycles. If the algorithm in one of the components (e.g., in thread launcher 170) is updated but there is no corresponding update in another component (e.g., in thread balancer 172), there is a substantial risk that these two components will fail to arrive at the same calculated value for the same scheduling parameter (e.g., the most loaded CPU ID).
The net effect is rather chaotic and unpredictable scheduling by scheduler subsystem 164. For example, it is possible for thread launcher 170 to believe that CPU 178a is the least loaded and would therefore place a thread A on PPRQ 176a associated with CPU 178a for execution. If thread stealer 174 is not coordinating its effort with thread launcher 170, it is possible for thread stealer 174 to believe, based on the data it obtained at some given time and based on its own algorithm, that CPU 178a is the most loaded. Accordingly, as soon as thread A is placed on the PPRQ 176a for execution on CPU 178a, thread stealer 174 immediately steals thread A and places it on PPRQ 176d associated with CPU 178d. 
Further, if thread balancer 172 is not coordinating its effort with thread launcher 170 and thread stealer 174, it is possible for thread balancer 172 to believe, based on the data it obtained at some given time and based on its own algorithm, that CPU 178d is the most loaded and CPU 178a is the least loaded. Accordingly, as soon as thread A is placed on the PPRQ 176d for execution on CPU 178d, thread balancer 172 immediately moves thread A from PPRQ 176d back to PPRQ 176a, where it all started.
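By way of illustration only, the round trip of thread A may be traced with a short simulation. Each component's belief about the CPUs is hard-coded to match the scenario described above; the helper `move` and the variable names are hypothetical.

```python
def move(thread, src, dst, queues, trace):
    """Shift a thread from one PPRQ to another and record the destination."""
    queues[src].remove(thread)
    queues[dst].append(thread)
    trace.append(dst)


# PPRQ of each CPU, keyed by CPU ID for brevity.
queues = {"178a": [], "178b": [], "178c": [], "178d": []}
trace = []

# Launcher: believes 178a is the least loaded, so thread A starts there.
queues["178a"].append("A")
trace.append("178a")

# Stealer (acting for idle 178d): believes 178a is the most loaded.
move("A", "178a", "178d", queues, trace)

# Balancer: believes 178d is the most loaded and 178a is the least loaded.
move("A", "178d", "178a", queues, trace)

migrations = len(trace) - 1
```

After two migrations, thread A is back on the queue where it started without having executed.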
During this needless shifting of thread A among the PPRQs, the execution of thread A is delayed. Further, overhead associated with context switching is borne by the system. Furthermore, such shifting of threads among PPRQs may cause cache misses, which waste memory bandwidth. The effect on the overall performance of the computer system may be quite noticeable.
Furthermore, since the scheduling policies are the same for all PSETs, there may be instances when scheduling decisions regarding thread evacuation, load balancing, or thread stealing involve processors from different PSETs.
In other words, a single thread launching policy is applied across all processors irrespective of which PSET a particular processor is associated with. Likewise, a single thread balancing policy is applied across all processors and a single thread stealing policy is applied across all processors.
As can be appreciated from FIG. 1C, certain scheduling instructions from thread launcher 192, thread balancer 194, and thread stealer 196, such as those involving processors associated with different PSETs 198a, 198b, and 198c, must be disregarded by the dispatchers 199a, 199b, and 199c in the PSETs if processor partitioning integrity is to be observed. When such scheduling instructions are disregarded in order to maintain processor partition integrity within the PSETs, the threads are not scheduled in the most efficient manner, and the system processor bandwidth is also not utilized in the most efficient manner.
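By way of illustration only, the filtering performed to preserve partition integrity may be sketched as follows. FIG. 1C identifies only the PSETs (198a, 198b, and 198c), so the CPU IDs, the instruction tuples, and the function name `filter_instructions` are hypothetical and are introduced solely for this example.

```python
# Hypothetical CPU IDs mapped to the PSETs of FIG. 1C.
PSET_OF_CPU = {
    "cpu0": "198a", "cpu1": "198a",
    "cpu2": "198b", "cpu3": "198b",
    "cpu4": "198c",
}


def filter_instructions(instructions, pset_of_cpu):
    """Split (thread, src_cpu, dst_cpu) moves into honored and disregarded lists."""
    honored, disregarded = [], []
    for thread, src, dst in instructions:
        if pset_of_cpu[src] == pset_of_cpu[dst]:
            honored.append((thread, src, dst))
        else:
            # The move would cross a PSET boundary, so it is disregarded.
            disregarded.append((thread, src, dst))
    return honored, disregarded


system_wide_instructions = [
    ("T1", "cpu0", "cpu1"),  # stays inside PSET 198a
    ("T2", "cpu1", "cpu4"),  # crosses from PSET 198a to PSET 198c
    ("T3", "cpu3", "cpu2"),  # stays inside PSET 198b
]
honored, disregarded = filter_instructions(system_wide_instructions, PSET_OF_CPU)
```

In this sketch, the instruction for thread T2 is disregarded, so the system-wide policy spent effort computing a decision that could never take effect.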