The broad availability of multi-core microprocessors has made multi-threaded software applications commonplace. In such applications, a program spawns multiple execution “threads” that run in parallel on different central processing units (CPUs) or different “cores” of a microprocessor in order to accelerate computation. An operating system (OS) typically manages the threads executed on the microprocessor. In particular, the OS can decide how threads are bound to processor cores and for how long a thread may execute on a given core.
In general, threads may be “sleeping” or “active”, depending on whether or not they are executing on a processor at a given moment. Multi-threaded applications typically place their execution threads into a sleeping state when the threads are not required to do work. The OS can wake up sleeping threads when work is available. On modern microprocessors, it can take between 20 and 60 microseconds, on average, for the OS to respond to a wake-up request and bring a sleeping thread into the active state. Thus, there must be a sufficient amount of work for the thread to perform in order to justify this wake-up overhead. Otherwise, the overhead contributes to inefficiency, potentially eliminating any gain achieved by multi-threading.
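The sleep and wake-up cycle described above can be illustrated with a minimal Python sketch. The helper name `measure_wake_latency` and the 50 millisecond settling delay are assumptions made for illustration only. The value it reports includes interpreter overhead, so it should not be expected to match the 20 to 60 microsecond range cited above.

```python
import threading
import time


def measure_wake_latency():
    """Return the seconds between a wake-up request and the worker resuming.

    Hypothetical helper for illustration; not taken from the source text.
    """
    work_ready = threading.Event()  # the worker sleeps on this until work arrives
    resumed_at = []                 # filled in by the worker once it is active again

    def worker():
        work_ready.wait()                       # sleeping state: blocked, not executing
        resumed_at.append(time.perf_counter())  # first action after becoming active

    thread = threading.Thread(target=worker)
    thread.start()
    time.sleep(0.05)                      # assumed delay so the worker is really blocked
    requested_at = time.perf_counter()    # timestamp taken just before the request
    work_ready.set()                      # the wake-up request
    thread.join()
    return resumed_at[0] - requested_at


latency = measure_wake_latency()
print(f"wake-up latency: {latency * 1e6:.0f} microseconds")
```

If the work handed to the thread takes less time than the measured latency, waking the thread costs more than it saves, which is the trade-off the paragraph above describes.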
In some cases, a thread does not transition to a sleep state when there is no work to perform, but instead enters a loop to stay active (referred to as “spinning”). The thread “spins” until there is work to be performed, which in general requires it to repeatedly check whether work is available. Such spinning is undesirable from the perspective of efficient processor usage, since processor cycles are consumed in maintaining the spinning thread. Thus, a processor maintaining a spinning thread at best has a reduced capacity to handle more meaningful tasks, and at worst is unavailable to perform such tasks while the thread is spinning.
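Spinning can be sketched in the same way. In the Python example below, the helper name `spin_for_work` and the 50 millisecond wait are assumptions for illustration. The worker never sleeps; it repeatedly checks a flag, and the number of checks it performs gives a rough indication of the processor time spent doing nothing useful.

```python
import threading
import time


def spin_for_work(wait_seconds=0.05):
    """Spin until work is flagged; return how many times the flag was checked.

    Hypothetical helper for illustration; not taken from the source text.
    """
    work_available = False
    checks = 0

    def worker():
        nonlocal checks
        # Spinning: the thread stays active and polls the flag in a tight loop.
        while not work_available:
            checks += 1

    thread = threading.Thread(target=worker)
    thread.start()
    time.sleep(wait_seconds)   # assumed period during which no work exists
    work_available = True      # work arrives; the spinning thread sees it promptly
    thread.join()
    return checks


checks = spin_for_work()
print(f"flag checked {checks} times while waiting for work")
```

The spinning worker reacts without any wake-up delay, but every check counted here is processor time that could otherwise have gone to other tasks.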
Accordingly, there exists a need in the art for an improved method and apparatus for providing a thread synchronization model that overcomes the aforementioned disadvantages.