Conventional computer systems typically share resources such as disk drives and communication ports. For a shared resource, a conventional queuing or scheduling mechanism allocates the resource to its requestors, such as processes or threads. In a conventional multiprocessing or multiprogramming system, many requestors may compete for the same shared resource. Often, a requestor must wait for a shared resource to become available before continuing execution. Accordingly, efficient allocation of shared resources to requestors is desirable in such a computer system.
In a multiprocessor computer system, there are typically multiple processing devices that are each capable of executing sequences of instructions, called threads, concurrently with other processing devices. The individual processing devices in a conventional multiprocessing computer system may be separate microprocessor chips interconnected via a data bus or other circuitry, or the processing devices may reside as respective “cores” (e.g., processing circuits) on a single “die” (i.e., one physical microchip). If each respective processing device is a separate physical chip, each chip may include instruction processing circuitry as well as an on-board memory or local cache and an associated cache controller. Examples of multiprocessing computer systems are workstations manufactured by Sun Microsystems of Palo Alto, Calif., USA that can contain as many as 256 Scalable Processor Architecture (SPARC) processors. An example of a multiprocessing computer chip containing multiple processing cores on a single processor die that share a common cache is the Intel Pentium-4 line of microprocessors containing Hyper-Threading technology. Pentium-4 and Hyper-Threading are registered U.S. trademarks of Intel Corporation.
Such conventional computer systems, therefore, typically execute a number of concurrent threads. A thread may execute on one processing device for a period of time until thread preemption occurs. When the kernel schedules that thread for execution again, the kernel may execute the thread on a different processing device than the one that formerly executed it. This is called thread migration. Thread migration may happen, for example, if, at the time the thread is scheduled for re-execution, the original processing device is busy executing another thread but a different processing device is available.
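By way of illustration only, the placement decision that leads to thread migration may be sketched as follows. The function name `pick_processor`, its parameters, and the policy of preferring the original processing device and otherwise taking the lowest-numbered idle one are assumptions made for this example; they do not describe any particular kernel.

```python
def pick_processor(last_cpu, idle_cpus):
    """Choose a processing device for a thread that is being rescheduled.

    last_cpu:  id of the device that formerly executed the thread (hypothetical).
    idle_cpus: set of device ids that are currently available.
    Returns the chosen device id, or None if every device is busy.
    """
    if last_cpu in idle_cpus:
        # The original device is free, so the thread stays where it was.
        return last_cpu
    if idle_cpus:
        # The original device is busy but another is free: the thread migrates.
        return min(idle_cpus)
    # No device is available; the thread keeps waiting.
    return None


stays = pick_processor(last_cpu=0, idle_cpus={0, 2})
migrates = pick_processor(last_cpu=0, idle_cpus={1, 3})
waits = pick_processor(last_cpu=0, idle_cpus=set())
```

In this sketch, a thread migrates only when its original processing device is occupied and a different one is available, which corresponds to the example given above.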
Conventional threads, therefore, perform tasks as defined by the sequence of instructions in each thread. By executing and preempting the threads according to a scheduling mechanism or algorithm, the kernel ensures that each of the concurrent threads receives an appropriate amount of execution time. The cycles of execution and preemption are typically referred to as scheduling, and such cycles strive to maximize the use of the processing device by ensuring that no thread remains idle.
Accordingly, the scheduling typically attempts to provide an optimal period of processor resources, sometimes referred to as a “slice” or “quantum,” to each thread in a rotating manner. When an optimal period of execution time has elapsed, the scheduler performs a context switch to suspend execution of the current thread and provide execution (processor) resources to another thread. One rudimentary scheduling algorithm simply allocates a fixed time “slice” to each thread, and performs a context switch when the time slice elapses. Typically, a system scheduler permits a thread to run on a processor until (a) its quantum is exhausted and the thread is preempted, permitting some other eligible thread to run, (b) an interrupt makes a higher priority thread ready, preempting the thread, or (c) the thread voluntarily blocks. It should be noted that (a) and (b) above are both cases of preemption: the thread is involuntarily descheduled (removed from the processor). When a thread is preempted, it remains eligible to run. Scenario (c) is voluntary, and typically involves locks, waiting for I/O, etc. In the case of (c), the thread does not remain eligible to run. Some other action, such as the I/O completing or some other thread releasing a lock, must occur, at which time the previously blocked thread is again made eligible to run. In general, therefore, schedulers identify, for example, when a thread has attained a convenient point in processing. One example of such a convenient point is a waiting or polling operation for a shared resource, such as a storage device or transmission gateway.
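The rudimentary fixed time slice algorithm described above may be sketched, under simplifying assumptions, as a single-processor simulation. The function name `round_robin`, the thread names, and the time units are hypothetical and chosen only for illustration. The sketch models only scenario (a), quantum exhaustion; it does not model interrupts, priorities, or voluntary blocking.

```python
from collections import deque


def round_robin(burst_times, quantum):
    """Simulate fixed-time-slice scheduling on one processor.

    burst_times: dict mapping a hypothetical thread name to the number of
                 time units of execution that thread still needs.
    quantum:     the fixed time slice granted on each turn.
    Returns a list of (thread, units_run) slices in execution order.
    """
    ready = deque(burst_times.items())  # ready queue, served in FIFO order
    timeline = []
    while ready:
        name, remaining = ready.popleft()
        run = min(quantum, remaining)
        timeline.append((name, run))
        remaining -= run
        if remaining > 0:
            # Quantum exhausted: the thread is preempted (context switch)
            # but remains eligible to run, so it rejoins the ready queue.
            ready.append((name, remaining))
    return timeline


timeline = round_robin({"A": 5, "B": 2, "C": 4}, quantum=2)
# timeline == [("A", 2), ("B", 2), ("C", 2), ("A", 2), ("C", 2), ("A", 1)]
```

Each entry in the returned list corresponds to one period of execution followed by a context switch. A preempted thread returns to the back of the ready queue, reflecting that preemption leaves the thread eligible to run.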
Therefore, conventional scheduling algorithms typically identify when a thread is waiting for the release of a mutual exclusion lock held by another thread with respect to a shared resource, resulting in a so-called “blocked” thread. Since a blocked thread cannot perform further processing until it obtains the blocking resource (e.g., when the current owner finishes a disk write), such a point denotes an optimal time at which to deschedule the thread and perform a context switch to a thread that is ready to continue processing. Therefore, conventional threads may execute until reaching a point at which the thread requires a resource that is not immediately available and must wait for a mutual exclusion lock on that shared resource, at which point the thread becomes blocked pending availability of the resource. Such requests for a mutual exclusion lock are typically queued and serviced in first-in, first-out (FIFO) order or according to another queuing discipline, thereby permitting the blocked thread to continue upon satisfaction of the shared resource request.
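The FIFO queuing of lock requests described above may be illustrated by the following toy model. The class name `FifoLock` and its methods are hypothetical, and threads are represented by plain strings rather than real operating system threads, so the sketch shows only the ordering of ownership and not actual blocking or context switching.

```python
from collections import deque


class FifoLock:
    """Toy model of a mutual exclusion lock whose waiters are served FIFO."""

    def __init__(self):
        self.owner = None        # thread currently holding the lock, if any
        self.waiters = deque()   # blocked threads, in arrival order

    def acquire(self, thread):
        """Return True if `thread` now owns the lock, False if it blocks."""
        if self.owner is None:
            self.owner = thread
            return True
        # The lock is held: the requester becomes blocked and is queued.
        self.waiters.append(thread)
        return False

    def release(self, thread):
        """Release the lock and return the thread made eligible to run."""
        if self.owner != thread:
            raise RuntimeError("only the owner may release the lock")
        # Hand ownership to the longest-waiting thread, if there is one.
        self.owner = self.waiters.popleft() if self.waiters else None
        return self.owner


lock = FifoLock()
got_t1 = lock.acquire("T1")   # lock is free, so T1 becomes the owner
got_t2 = lock.acquire("T2")   # T2 blocks behind T1
got_t3 = lock.acquire("T3")   # T3 blocks behind T2
woken_first = lock.release("T1")
woken_second = lock.release("T2")
```

In this model, a thread that fails to acquire the lock is the kind of blocked thread a scheduler would deschedule, and each release makes the longest-waiting requester eligible to run again.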