Most existing operating systems running on multi-processor platforms use a priority-driven, preemptive scheduler in which the highest-priority ready thread runs on any available processor permitted by the thread's affinity. Unless a higher-priority thread preempts it, a thread will typically run for a quantum (the length of time, or time slice, that a thread is allowed to run before another thread of the same priority runs). Some threads voluntarily leave the running state before their quantum ends in order to wait for another operation to complete. This scheme is typically meant to optimize throughput.
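The selection rule described above can be illustrated with a minimal sketch. It is not the scheduler of any particular operating system. The names `Thread` and `assign`, the convention that a larger number means a higher priority, and the greedy choice of the lowest-numbered allowed processor are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thread:
    name: str
    priority: int        # assumption: larger number = higher priority
    affinity: frozenset  # processor ids this thread is allowed to run on


def assign(ready, idle_cpus):
    """Greedily place ready threads on idle processors.

    Threads are considered from highest to lowest priority, and each one
    takes the lowest-numbered idle processor that its affinity allows.
    Threads with no allowed idle processor stay in the ready state.
    """
    free = set(idle_cpus)
    placement = {}
    # sorted() is stable, so threads of equal priority keep their order.
    for thread in sorted(ready, key=lambda t: -t.priority):
        allowed = sorted(free & thread.affinity)
        if allowed:
            cpu = allowed[0]
            placement[cpu] = thread.name
            free.remove(cpu)
    return placement


ready = [
    Thread("A", priority=10, affinity=frozenset({0, 1})),
    Thread("B", priority=5, affinity=frozenset({0})),
    Thread("C", priority=8, affinity=frozenset({1})),
]
print(assign(ready, idle_cpus=[0, 1]))  # {0: 'A', 1: 'C'}
```

In this hypothetical example thread B stays ready: its only allowed processor has already been taken by the higher-priority thread A.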
The operating system selects the thread to run. Various operations may occur based on how long a thread has been running. In existing systems, for example, the priority of a thread may be dynamically reduced after it has run for a predetermined amount of time. A higher-priority thread that becomes ready may trigger an interrupt that causes the other threads to be swapped out so that the higher-priority thread can run.
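The interaction between the quantum and dynamic priority reduction can be sketched as a small single-processor simulation. This is a simplified illustration under stated assumptions, not the behavior of any specific system. The function name `simulate`, the parameters `quantum`, `decay_after`, and `floor`, and the rule of lowering the priority by one level are hypothetical. New thread arrivals and interrupt-driven preemption are not modeled.

```python
def simulate(threads, quantum=2, decay_after=4, floor=1):
    """Run threads to completion on one processor and return the timeline.

    threads maps a name to (priority, work), with a larger number meaning
    a higher priority (assumption). After a thread has accumulated
    decay_after time units of running, its priority drops by one level,
    but never below floor.
    """
    state = {n: {"prio": p, "left": w, "ran": 0} for n, (p, w) in threads.items()}
    order = list(threads)  # ready-queue order, used to break priority ties
    timeline = []
    while any(s["left"] > 0 for s in state.values()):
        ready = [n for n in order if state[n]["left"] > 0]
        # max() returns the first maximal item, so ties go to the thread
        # that has waited longest in the queue.
        name = max(ready, key=lambda n: state[n]["prio"])
        s = state[name]
        ran_now = min(quantum, s["left"])
        s["left"] -= ran_now
        s["ran"] += ran_now
        timeline.append((name, ran_now))
        if s["ran"] >= decay_after and s["prio"] > floor:
            s["prio"] -= 1  # dynamic priority reduction
            s["ran"] = 0
        # Move the thread that just ran to the back of the queue.
        order.remove(name)
        order.append(name)
    return timeline


print(simulate({"cpu_bound": (3, 6), "short": (2, 2)}))
# [('cpu_bound', 2), ('cpu_bound', 2), ('short', 2), ('cpu_bound', 2)]
```

In this hypothetical run the processor-bound thread is lowered to the priority of the short thread after four time units, at which point the short thread gets its turn ahead of it.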
If a long-running thread, also known as a processor-bound thread, is cache intensive, preempting it may cause the thread to be moved to another core that uses a different cache, so the data the thread had built up in the original cache must be loaded again. The thread may also be preempted by a short-running thread that runs only briefly. It may therefore not be desirable to preempt a long-running thread that is cache intensive.