It is common in multiprocessing and multithreaded computing environments for various executable units running on a computer system to concurrently execute multiple jobs scheduled in a queue, which is accessed by multiple threads and/or multiple executable units.
A common problem associated with using data structures in shared memory is managing multiple simultaneous requests to access the data structures and ensuring that accesses to the data are atomic. Guaranteeing atomic access is important because it ensures that multiple simultaneous attempts to update data do not conflict and leave the data in an inconsistent state.
Atomic access to a data structure can be guaranteed by the hardware when the data structure meets size and alignment restrictions imposed by the particular hardware (typically the size of a machine word or floating point number). Atomic access cannot be guaranteed by the hardware for data structures that do not meet these restrictions.
In order to guarantee consistent reads and writes of data structures larger than those supported by the hardware, previous systems have provided software mechanisms to guarantee atomic reads and writes of such data structures. One such system involves the use of a lock mechanism. In systems using a lock, a thread that requires access to a shared data structure first acquires a lock on the data structure, typically using a function provided by the operating system. The thread then updates the data structure. After the data structure is updated, the thread releases the lock. Other threads that require access to the data structure may also attempt to acquire a lock on the data structure. If an attempt occurs while another thread has the data structure locked, the attempt will fail, and the requesting thread will either continue to retry acquisition of the lock or wait, with operating system support, until the lock becomes available. In either approach, the thread of execution is blocked until the lock has been acquired.
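By way of illustration only, the lock-based approach described above can be sketched in Python as follows. The sketch assumes the standard `threading.Lock` as the lock mechanism. The names `SharedRecord`, `update_blocking`, `update_spinning`, and `run_workers` are hypothetical and are introduced solely for this example. The record holds two fields that must change together, standing in for a data structure too large for the hardware to update atomically.

```python
import threading


class SharedRecord:
    """Hypothetical multi-field record guarded by a single lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.count = 0   # number of updates applied
        self.total = 0   # sum of all values applied


def update_blocking(record, value):
    # Wait until the lock becomes available; the calling thread is
    # suspended by the runtime while another thread holds the lock.
    record.lock.acquire()
    try:
        record.count += 1
        record.total += value
    finally:
        record.lock.release()


def update_spinning(record, value):
    # Retry a non-blocking acquisition until it succeeds; every failed
    # attempt consumes CPU time without making progress.
    failed_attempts = 0
    while not record.lock.acquire(blocking=False):
        failed_attempts += 1
    try:
        record.count += 1
        record.total += value
    finally:
        record.lock.release()
    return failed_attempts


def run_workers(update, n_threads=4, n_updates=1000):
    # Start several threads that all update one shared record.
    record = SharedRecord()

    def worker():
        for _ in range(n_updates):
            update(record, 2)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return record
```

In both variants the two fields are only ever modified while the lock is held, so they remain mutually consistent. In both variants, however, a thread that cannot obtain the lock makes no progress until the lock holder releases it.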
While software locks allow exclusive and therefore consistent access to data structures, the blocking behavior of locks is expensive in terms of either CPU (central processing unit) or memory utilization. There has been a lack of an efficient way to implement a FIFO (first-in first-out) queue in a non-blocking fashion using atomic operations.