In computer system design, a “thread” is a sequence of instructions being executed by a processor. In many computer systems, multiple threads may be processed in parallel. Such “multithreading” may be supported in (i) a system having several processors (a “multiprocessor system”), each capable of processing one or more threads, or (ii) a system having a single processor (a “uniprocessor system”) capable of processing multiple threads.
When designing a multithreaded system, it is important to implement mechanisms for handling situations in which two or more threads attempt to concurrently access a particular shared resource (e.g., a shared memory location). Such implementations often involve the use of “locks,” which are constructs that may be used to protect particular shared resources. In such cases, a thread needing access to a shared resource must “acquire” or “obtain” the lock protecting the shared resource prior to being able to access the shared resource. If the lock is already acquired by another thread, the requesting thread will have to wait for the lock to be “released.”
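By way of illustration only, the acquire/access/release pattern described above may be sketched as follows. The sketch assumes Python's standard `threading.Lock`; the names `counter`, `counter_lock`, `increment`, and `run` are hypothetical and chosen for this example.

```python
import threading

counter = 0                        # the shared resource (illustrative)
counter_lock = threading.Lock()    # the lock protecting the shared resource


def increment(times):
    """Add 1 to the shared counter `times` times, holding the lock for each update."""
    global counter
    for _ in range(times):
        with counter_lock:         # acquire; waits while another thread holds the lock
            counter += 1           # access the shared resource
        # leaving the with-block releases the lock


def run(num_threads=4, times=10_000):
    """Start `num_threads` threads that all update the same counter; return the total."""
    global counter
    counter = 0
    threads = [threading.Thread(target=increment, args=(times,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

Because every update occurs while the lock is held, `run(4, 10_000)` is expected to return exactly `40000`, regardless of how the threads are interleaved.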
In some lock implementations, should a thread fail to acquire a particular lock, the thread may continue to repeat attempts to acquire that lock until the lock is acquired. In other lock implementations, a thread may specify a time period indicating how long the thread is willing to wait to acquire a particular lock. If the lock is not acquired within the specified time period, the thread returns to its owning process with an indication that the thread failed to acquire the lock. Certain types of such “abortable” locks may require that an aborting thread wait for action by at least one other thread on the lock. Other types of “abortable” locks are “non-blocking” in that they allow an aborting thread to leave without having to wait for action by another thread.
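The caller-visible behavior of a lock acquired with a specified time period may be sketched as follows. The sketch assumes the `timeout` argument of Python's standard `threading.Lock.acquire`; the helper names `attempt_with_patience` and `demo` are hypothetical. It shows only the contract seen by the requesting thread and is not intended to show how any particular abortable lock is built internally.

```python
import threading


def attempt_with_patience(lock, patience_seconds, results):
    """Try to acquire `lock`, waiting at most `patience_seconds`; record the outcome."""
    acquired = lock.acquire(timeout=patience_seconds)
    results.append(acquired)       # False is the "failed to acquire" indication
    if acquired:
        lock.release()


def demo():
    lock = threading.Lock()
    results = []

    lock.acquire()                 # the main thread owns the lock
    t = threading.Thread(target=attempt_with_patience, args=(lock, 0.05, results))
    t.start()
    t.join()                       # the waiter gives up after about 50 ms

    lock.release()                 # the lock is now free
    t = threading.Thread(target=attempt_with_patience, args=(lock, 0.05, results))
    t.start()
    t.join()                       # this attempt succeeds
    return results
```

In this sketch, `demo()` returns `[False, True]`: the first attempt times out while the lock is owned, and the second succeeds once the lock has been released.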
A relatively simple type of lock implementation that may be abortable involves a thread first attempting to atomically (i.e., in a single step) change a status of a lock from “free” to “owned” using, for example, a test-and-set operation. If the status of the lock is successfully changed, the thread has acquired the lock, thereby allowing the thread to access and modify the shared resource protected by the lock. If the status of the lock is not successfully changed, the thread may repeat the atomic operation or, if the lock is abortable, may cease attempts to acquire the lock after a specified period of time has elapsed.
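A minimal sketch of such a test-and-set lock follows. Python exposes no user-level atomic instruction, so the sketch assumes a hypothetical helper class, `AtomicFlag`, whose private mutex makes `test_and_set` behave as a single indivisible step. The names `TasLock`, `acquire`, and `release` are likewise illustrative.

```python
import threading
import time


class AtomicFlag:
    """Hypothetical stand-in for a hardware test-and-set word."""

    def __init__(self):
        self._owned = False
        self._guard = threading.Lock()   # models atomicity; not part of the technique

    def test_and_set(self):
        """Atomically set the status to "owned" and return the previous status."""
        with self._guard:
            previous = self._owned
            self._owned = True
            return previous

    def clear(self):
        """Set the status back to "free"."""
        with self._guard:
            self._owned = False


class TasLock:
    """Test-and-set lock with an optional abort after `timeout` seconds."""

    def __init__(self):
        self._flag = AtomicFlag()

    def acquire(self, timeout=None):
        deadline = None if timeout is None else time.monotonic() + timeout
        # A previous status of True means another thread owns the lock.
        while self._flag.test_and_set():
            if deadline is not None and time.monotonic() >= deadline:
                return False             # abort: nothing to undo, no one to wait for
            time.sleep(0)                # yield so that the owner can make progress
        return True                      # status changed from "free" to "owned"

    def release(self):
        self._flag.clear()
```

A failed `test_and_set` leaves the status as “owned,” which is what it already was, so an aborting thread in this sketch simply returns without any cleanup.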
Those skilled in the art will note that the test-and-set type lock discussed above has a constant per-lock space requirement, i.e., the amount of memory needed for operation of each lock is fixed. Particularly, such locks have a space requirement of O(L) for T threads using L locks (i.e., the space required for a particular number of locks is not dependent on the number of threads). Further, those skilled in the art will note that non-blocking aborts of such locks come at a relatively low cost.
In order to avoid adverse increases in memory traffic resulting from repeated atomic operations on the same lock, test-and-set type locks may be provided with “backoff.” In these types of lock implementations, if a thread fails to acquire a particular lock, the thread may delay a subsequent attempt to acquire the lock for some time, thereby reducing contention for that lock. Those skilled in the art will note that such “backoff” locks may be implemented with exponential increases in the amount of time between successive failed attempts to acquire a lock.
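Exponential backoff of this kind may be sketched as follows. The sketch assumes that a non-blocking `threading.Lock.acquire(blocking=False)` stands in for the single atomic attempt. The function names, the default delay bounds, and the random choice of each pause within the current limit are assumptions made for illustration and are not taken from any particular implementation.

```python
import random
import threading
import time


def backoff_limits(min_delay, max_delay):
    """Yield an upper bound for each successive pause, doubling up to `max_delay`."""
    delay = min_delay
    while True:
        yield delay
        delay = min(2 * delay, max_delay)


def acquire_with_backoff(lock, min_delay=1e-5, max_delay=1e-2, timeout=None):
    """Attempt to acquire `lock`, pausing longer after each failed attempt.

    Returns True once the lock is acquired, or False if `timeout` seconds
    elapse first (the abortable case).
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    for limit in backoff_limits(min_delay, max_delay):
        if lock.acquire(blocking=False):         # one atomic attempt
            return True
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(random.uniform(0, limit))     # back off before the next attempt
```

With bounds of 1 and 8, for example, `backoff_limits` yields 1, 2, 4, 8, 8, and so on. Choosing suitable bounds is the tuning difficulty: too short a pause does little to reduce contention, and too long a pause leaves a freed lock idle.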
Typical backoff locks, while simple and relatively effective in low-scale multiprocessor systems, are not well suited for high-scale multiprocessor systems (e.g., systems having hundreds of processors). Further, those skilled in the art will note that it is difficult to make a thread back off by just the right amount, and, as a result, handing a lock from one thread to another may take significantly longer than necessary, thereby decreasing throughput.
Further, those skilled in the art will note that typical backoff locks are not capable of imposing an ordering between threads attempting to acquire a particular lock. Thus, because a thread that has just released a particular lock has that lock cached, that thread may acquire that lock again without allowing other threads to acquire the lock. This results in thread “starvation” (i.e., when certain threads are unable to acquire a particular lock for long periods of time).
A certain type of lock implementation that overcomes at least some of the shortcomings associated with typical backoff locks involves the formation of a queue of threads waiting for a lock. In such “queue” locks, a thread “spins” (i.e., waits) on some data that is part of the lock implementation. A thread in a “wait-queue” may spin, for example, on (i) a node the thread has inserted into the wait-queue or (ii) a predecessor node in the wait-queue. As one thread in the wait-queue releases the lock, another thread proceeds to enter a “critical section” of code for accessing the shared resource protected by the lock. In such a manner, handing a lock from one thread to another may occur relatively quickly and with minimal or no contention for the lock.
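One well-known queue lock of variant (ii), in which each thread spins on its predecessor's node, is commonly called a CLH lock, and a minimal non-abortable sketch in that style follows. The sketch is illustrative only. Because Python exposes no atomic swap, a private mutex in the hypothetical method `_swap_tail` models one; the class and attribute names are likewise assumptions.

```python
import threading
import time


class Node:
    """Wait-queue node; `locked` is True while its thread holds or awaits the lock."""

    def __init__(self, locked):
        self.locked = locked


class ClhLock:
    def __init__(self):
        self._tail = Node(locked=False)          # dummy node: the lock starts free
        self._swap_guard = threading.Lock()      # models an atomic swap; illustrative
        self._local = threading.local()          # remembers each thread's own node

    def _swap_tail(self, node):
        """Atomically install `node` as the tail and return the previous tail."""
        with self._swap_guard:
            previous, self._tail = self._tail, node
            return previous

    def acquire(self):
        node = Node(locked=True)
        predecessor = self._swap_tail(node)      # join the end of the wait-queue
        while predecessor.locked:                # spin on the predecessor's node only
            time.sleep(0)                        # yield to other threads
        self._local.node = node                  # now inside the critical section

    def release(self):
        self._local.node.locked = False          # hands the lock to the successor
```

Each waiting thread in this sketch polls a different node, so a release is observed by exactly one successor and threads are served in the order in which they joined the wait-queue. The sketch allocates a fresh node on every acquisition and relies on garbage collection to reclaim it; an implementation without garbage collection would need to manage node memory explicitly.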
Those skilled in the art will note that queue locks, while more scalable than typical backoff locks, are generally not abortable. Certain types of abortable queue locks, although not commonly implemented, have been shown to require O(L*T) space for L locks and T threads (i.e., the space needed for a particular number of locks increases as the number of threads increases). Moreover, typical abortable queue locks generally require extra memory management for the nodes in the wait-queue.