In software, data structures called queues are frequently used to temporarily hold data for use by software threads. When multiple threads operate on a single queue at the same time, multiple operations can execute simultaneously. This generally leads to errors in the queue state: one thread reads the state and begins modifying it while another thread changes the state to something incompatible with what the first thread is about to write back. The problem occurs because implementations of queue operations generally assume that nothing else modifies the queue while an operation is executing, i.e., that operations execute atomically.
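The interleaving described above can be sketched deterministically, without real threads, by splitting an enqueue into its read step and its write-back step. The names `UnsafeQueue`, `read_tail`, `write_back`, and `interleaved_enqueue` are hypothetical and exist only for this illustration; they are not taken from any particular queue implementation.

```python
class UnsafeQueue:
    """Array-backed queue whose enqueue is split into two non-atomic steps."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.tail = 0  # index of the next free slot

    def read_tail(self):
        # Step 1: a thread reads the shared queue state.
        return self.tail

    def write_back(self, observed_tail, item):
        # Step 2: the thread writes based on what it read earlier,
        # assuming nothing changed in between.
        self.slots[observed_tail] = item
        self.tail = observed_tail + 1


def interleaved_enqueue():
    """Replay one problematic ordering of two enqueues by hand."""
    q = UnsafeQueue(capacity=4)
    seen_by_a = q.read_tail()     # thread A reads tail == 0
    seen_by_b = q.read_tail()     # thread B also reads tail == 0
    q.write_back(seen_by_a, "a")  # A stores "a" in slot 0; tail becomes 1
    q.write_back(seen_by_b, "b")  # B overwrites slot 0; tail is still 1
    return q


q = interleaved_enqueue()
print(q.slots, q.tail)  # ['b', None, None, None] 1
```

Two enqueues were issued, yet the queue holds a single item: the item written by thread A is lost because thread B wrote back state derived from a stale read.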
To preserve this assumption, the threads all “take turns” executing operations on a shared queue, i.e., access from multiple threads is serialized. This discipline is enforced within the operations by mutual exclusion locks (“mutexes”), which block all threads but one from executing an operation. When that thread finishes its operation, the next waiting thread is allowed to execute.
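A minimal sketch of such a mutex-protected queue follows, using `threading.Lock` and `collections.deque` from the Python standard library. The class `LockedQueue` and the helper `run_producers` are hypothetical names chosen for this illustration. In CPython a `deque` append is already thread-safe, so the explicit lock here serves only to show the serialization discipline.

```python
import threading
from collections import deque


class LockedQueue:
    """Queue whose operations are serialized by a single mutex."""

    def __init__(self):
        self._items = deque()
        self._mutex = threading.Lock()

    def enqueue(self, item):
        # Only one thread at a time may hold the mutex; others block here.
        with self._mutex:
            self._items.append(item)

    def dequeue(self):
        # Returns None when the queue is empty.
        with self._mutex:
            return self._items.popleft() if self._items else None

    def __len__(self):
        with self._mutex:
            return len(self._items)


def run_producers(queue, producers=4, per_producer=1000):
    """Start several producer threads and return the final queue length."""

    def produce(producer_id):
        for i in range(per_producer):
            queue.enqueue((producer_id, i))

    threads = [threading.Thread(target=produce, args=(p,))
               for p in range(producers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(queue)
```

Calling `run_producers(LockedQueue())` returns 4000 with the default arguments, since every enqueue completes under the mutex and none is lost. The cost is that the producers never execute an enqueue at the same time.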
This is not a problem when the threads are all executing on a single processor (except perhaps for the extra overhead incurred from the mutex implementation) since only one thread can execute at a time. However, when the process is executing on a multiprocessor system, this serialization of operation execution reduces the gain in throughput that would have otherwise occurred by having the threads executing on the multiple processors simultaneously.
The problem is exacerbated because a thread can acquire a mutex and then be preempted by the operating system scheduler before it releases the mutex. Such preemption blocks all other threads that need to execute a queue operation until the preempted thread resumes execution and releases the mutex.
Conventional lock-free algorithms, which avoid mutexes, introduce design issues of their own that are not normally encountered with other algorithms. One is known as the ABA problem, which occurs when an instruction, typically a compare-and-swap, cannot distinguish between a memory location that has never been changed and one that was changed and then changed back to the expected value. In the latter case, the assumptions the thread associated with the expected value may no longer hold. A common approach to eliminating the ABA problem is to attach an update counter to the memory location being updated. The counter is incremented on each update, so even if the same value is assigned to the location, the update counter will be different.
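The update-counter approach can be sketched as follows. Python exposes no hardware compare-and-swap, so this sketch assumes an internal lock as a stand-in for the atomicity of the instruction, and it replays the interleaving by hand rather than with real threads. The names `VersionedCell`, `cas_value_only`, `cas_versioned`, and `aba_demo` are hypothetical.

```python
import threading


class VersionedCell:
    """Memory location paired with an update counter.

    The internal lock only simulates the atomicity of a hardware
    compare-and-swap; it is an assumption of this sketch.
    """

    def __init__(self, value):
        self._guard = threading.Lock()
        self._value = value
        self._version = 0  # incremented on every successful update

    def load(self):
        with self._guard:
            return self._value, self._version

    def cas_value_only(self, expected_value, new_value):
        # Naive compare-and-swap: compares the value alone, so it
        # cannot tell "never changed" from "changed and changed back".
        with self._guard:
            if self._value != expected_value:
                return False
            self._value = new_value
            self._version += 1
            return True

    def cas_versioned(self, expected_value, expected_version, new_value):
        # Counter-based compare-and-swap: value and counter must both match.
        with self._guard:
            if (self._value, self._version) != (expected_value, expected_version):
                return False
            self._value = new_value
            self._version += 1
            return True


def aba_demo():
    """Replay A -> B -> A and report which compare-and-swap notices it."""
    cell = VersionedCell("A")
    value, version = cell.load()   # thread 1 observes ("A", 0)
    cell.cas_value_only("A", "B")  # thread 2 changes A to B
    cell.cas_value_only("B", "A")  # thread 2 changes B back to A
    # Thread 1 resumes with its stale observation.
    versioned_ok = cell.cas_versioned(value, version, "C")  # counter is now 2
    naive_ok = cell.cas_value_only(value, "C")              # value still "A"
    return naive_ok, versioned_ok
```

Here `aba_demo()` returns `(True, False)`: the value-only comparison succeeds even though the location was modified twice in the meantime, while the comparison that includes the counter detects the intervening updates and fails.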
Another problem associated with conventional lock-free designs is memory reclamation. Given that multiple threads can be executing operations simultaneously, even though one thread has determined that a shared object is no longer needed, it is sometimes difficult to be certain that no other thread is attempting to access that shared object. Returning the object to the memory allocator could result in runtime errors if other threads are attempting to access the object.
As such, there are a number of challenges and inefficiencies created in traditional systems that have multiple threads operating on a single queue at the same time. For example, multiple threads sharing a single queue are unable to concurrently access the queue. Thus, it can be difficult to adequately utilize all of the resources available. It is with respect to these and other problems that embodiments of the present invention have been made.