Many applications in which multiple threads of execution share data in, or communicate through, a common memory employ some type of mechanism to ensure data consistency. One type of mechanism provides for mutual exclusion; that is, it uses a “mutual exclusion lock” to ensure that only one of the execution threads at a time can enter a critical region in which the shared data is manipulated. If two or more threads of execution are sharing a location, multiple threads may attempt to obtain ownership of the mutual exclusion lock simultaneously. Under such conditions, the order in which the threads obtain ownership of the mutual exclusion lock is not guaranteed.
When multiple threads of execution on one or more processors are sharing data, a mutual exclusion lock (“mutex”) is used to provide ownership of the shared data to only one agent at a time. The use of a mutex allows the thread that holds the mutex to make one or more modifications to the contents of a shared record, or a read-modify-write to update the contents, while maintaining consistency within that record.
In general, a mutex is implemented as a location in memory, which is used to signal both intent to own, and ownership of, another region protected by the mutex. There are many techniques, using software or a combination of software and hardware, to implement the acts of obtaining (entering) and releasing (exiting) the mutex. A critical region of code in which a thread of execution modifies shared data is preceded by a mutex_enter( ) operation and followed by a mutex_exit( ) operation. Techniques for implementing mutex_enter( ) and mutex_exit( ) operations are well known.
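As one illustration of such a well-known technique, the following is a minimal sketch of mutex_enter( ) and mutex_exit( ) built on a single memory location, assuming a C11 compiler with `<stdatomic.h>` and POSIX threads. The names `mutex_t`, `record_lock`, `record_count`, `worker`, and `run_workers` are hypothetical and are not taken from any particular implementation. Because waiters simply spin on the flag, the sketch also exhibits the property noted above: the order in which contending threads obtain the lock is not guaranteed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical mutex: one memory location signals ownership. */
typedef struct {
    atomic_flag held;
} mutex_t;

#define MUTEX_INIT { ATOMIC_FLAG_INIT }

static void mutex_enter(mutex_t *m)
{
    /* Spin until test-and-set finds the flag previously clear.
     * Waiters are admitted in no particular order. */
    while (atomic_flag_test_and_set_explicit(&m->held, memory_order_acquire)) {
        /* busy-wait */
    }
}

static void mutex_exit(mutex_t *m)
{
    /* Release ownership so that one waiting thread may enter. */
    atomic_flag_clear_explicit(&m->held, memory_order_release);
}

/* Shared record protected by the mutex (illustrative only). */
static mutex_t record_lock = MUTEX_INIT;
static long record_count = 0;

enum { NUM_THREADS = 4, UPDATES_PER_THREAD = 50000 };

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < UPDATES_PER_THREAD; i++) {
        mutex_enter(&record_lock);
        record_count = record_count + 1;   /* read-modify-write */
        mutex_exit(&record_lock);
    }
    return NULL;
}

/* Starts the workers, waits for them, and returns the final count. */
static long run_workers(void)
{
    pthread_t tid[NUM_THREADS];

    record_count = 0;
    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(tid[i], NULL);
    return record_count;
}
```

With the critical region bracketed this way, every increment is applied and the final count equals the number of threads multiplied by the updates per thread. A production mutex would typically block or back off rather than spin; the spin loop is used here only to keep the sketch short.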
In some applications, the use of such competitive mutual exclusion locks is insufficient to control access to the shared data due to serialization constraints. Thus, other techniques to enforce order are used.
For example, in some networking applications, order is maintained by requiring that all cells or packets traveling between a particular pair of endpoints be handled by the same process or thread, or through the use of hardware pipelining. The former limits the throughput for any pair of endpoints to the performance of the single thread. Parallelism can therefore be achieved only among flows from unrelated pairs of endpoints, while strict order is maintained within any end-to-end flow. The latter can achieve higher throughput but is limited in flexibility and performance by how well tasks partition into fixed pipeline stages, as well as by the degree to which communication between those stages can be minimized. Other network applications maintain order through the use of sequence numbers associated with each cell or packet, but some network protocols do not support the use of sequence numbers.
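The sequence-number approach mentioned above can be sketched as a small reorder buffer: packets may finish processing in any order, but they are released strictly in sequence. This is a minimal sketch under stated assumptions, not a description of any particular product. The names `reorder_t`, `reorder_init`, `reorder_put`, and `reorder_drain` are hypothetical, the window size of 8 is an arbitrary choice, and an `int` stands in for the packet contents.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed maximum number of packets in flight. A power of two keeps
 * the slot mapping consistent when the 32-bit sequence number wraps. */
enum { WINDOW = 8 };

typedef struct {
    uint32_t next_seq;          /* next sequence number to release */
    bool     present[WINDOW];   /* slot holds a processed packet */
    int      payload[WINDOW];   /* stand-in for packet contents */
} reorder_t;

static void reorder_init(reorder_t *r, uint32_t first_seq)
{
    assert(r != NULL);
    r->next_seq = first_seq;
    for (size_t i = 0; i < WINDOW; i++)
        r->present[i] = false;
}

/* Stores a processed packet. Returns false if the sequence number is
 * stale, beyond the window, or a duplicate of a buffered packet. */
static bool reorder_put(reorder_t *r, uint32_t seq, int payload)
{
    /* Unsigned subtraction gives the distance ahead of next_seq,
     * and a very large value for stale sequence numbers. */
    if ((uint32_t)(seq - r->next_seq) >= WINDOW)
        return false;

    size_t slot = seq % WINDOW;
    if (r->present[slot])
        return false;

    r->present[slot] = true;
    r->payload[slot] = payload;
    return true;
}

/* Releases buffered packets in sequence order, stopping at the first
 * gap. Writes at most max payloads to out and returns the count. */
static size_t reorder_drain(reorder_t *r, int *out, size_t max)
{
    size_t n = 0;

    while (n < max) {
        size_t slot = r->next_seq % WINDOW;
        if (!r->present[slot])
            break;              /* next packet in order not ready yet */
        out[n++] = r->payload[slot];
        r->present[slot] = false;
        r->next_seq++;
    }
    return n;
}
```

If packets 2 and 1 complete before packet 0, nothing is released until packet 0 arrives, at which point all three are released in order. The sketch is single-threaded; in a multithreaded setting the buffer itself would be shared data and would need protection such as a mutex. It also presumes that every packet carries a sequence number, which, as noted above, some protocols do not provide.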