Most modern computer systems are multi-tasking systems. That is, they allow more than one "job" or "process" to be active at a given time. Since there is often only one processor in the system (or a number of processors less than the number of jobs/processes), it becomes necessary for the jobs/processes to share the processor resources. In a shared processor system, a processor spends some time executing part of a first job/process before switching to execute part of another job/process. Thus, to a human user, it appears that more than one job/process is being executed at a given time.
Some computer systems execute "multi-threaded" computer programs in which multiple "threads" of a process can be executing at the same time. Multi-threading adds an extra note of complexity to the operating system and to processor sharing.
In at least one implementation of the Solaris operating system (available from Sun Microsystems, Inc.), a highest priority job will run for a period of time (called a "time quantum") and then its priority is redetermined. There are currently four different scheduling classes that define the priorities of the applications in a conventional Solaris system: Real Time (RT); System (SYS); Interactive (IA), and Timesharing (TS). If, after its execution, a job still has the highest priority in the system, it is allowed to run for another period of time (e.g., between 20 to 200 milliseconds), after which the priority is redetermined again. If it is no longer the highest priority job in the system after the redetermination, then a job that has a higher priority gets to run. Unfortunately, if a job maintains a highest priority, other applications do not always get a chance to execute.
Many conventional operating systems use time-sharing processor-scheduling strategies that schedule threads independently. In such a system, each thread is assigned a priority and a time quantum and is put on a run queue to wait for its turn to use the processor. When a processor is available, it picks the thread with the highest priority from the run queue and executes it. The thread is allowed to use the processor for the duration of the time quantum. When the thread uses up its assigned quantum, the thread's priority is recalculated, a new time quantum is assigned and the thread is put back onto the run queue. The processor then picks the next highest priority thread from the run queue and begins to execute it again.
During its execution on a processor, a thread often requires an event to occur or some data or resource to be available before the thread can continue its computation. An example of an event might be a key stroke from a user or availability of a record from a file. Similarly, a thread may have to wait for availability of data, such as access to a lock that is shared with another thread. A thread that needs a resource is called a "consumer thread." When the resource or data is not available or when the event has not occurred, and the consumer thread needs to wait, the consumer thread is called a "blocked consumer thread." If the event or data is not immediately available, the processor does not simply allow the thread to wait idly while occupying the processor. Instead, even though the blocked consumer thread is not done with its time quantum, the processor puts the blocked thread on a sleep queue and picks another thread to execute. This new thread has its own preassigned time quantum. Use of a sleep queue allows a more efficient use of the processor resources. When the awaited event occurs or the awaited data or resource becomes available, the processor wakes the blocked consumer thread and puts it on the run queue to wait for its turn to use the processor.
Although the practice of allowing a thread to "sleep" while blocked allows a more efficient use of the processor, it incurs extra overhead because the processor has to take time to pick the next highest thread to execute. In addition, this approach can limit the performance of certain multi-threaded applications. Assume, for example, that a multi-threaded application has two threads: thread A and thread B, which share a common lock executing in a system with one processor. Thread A holds the lock and is waiting on the run queue for the processor. Thread B is executing on the processor, but requires the lock to continue. The lock is not available because thread A is holding it. Thus, thread B is blocked, to wait for the lock. When thread B is blocked, the processor picks a thread having a highest priority from the run queue.
Unfortunately, the thread that the processor picks may not be thread A. The thread on the run queue having the highest priority may not even be for the same process as threads A and B. If the processor picks a thread from a different process, the processor will have to perform an inefficient and costly context switch (to execute threads of a new process). When thread A finally does run, another context switch will be required (to execute threads of the process of threads A and B). Even worse, by the time thread A (which has the lock) executes, thread B (which needs the lock) may not be a highest priority process on the run queue and may not execute right away. This type of scenario demonstrates one reason why multi-threaded applications can perform poorly in a shared execution system.