Traditionally, synchronization between threads accessing shared memory has been realized using locks that protect the shared data from simultaneous access. However, locks often serialize access to the shared data even when serialization is not necessary at run-time, and whether it is necessary is often challenging, or sometimes impossible, to determine when the code is written. For example, a lock may protect access to an entire hash table even though threads access distinct buckets in the hash table at run-time. In this case, the lock unnecessarily limits parallelism, potentially reducing program performance. A better use of parallel resources would be to allow all threads accessing distinct data to run in parallel. Fine-grained locking could be used to achieve this, but it is often more challenging for the programmer to design and verify.
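The hash table example can be made concrete with a minimal sketch of fine-grained locking, assuming one lock per bucket. The class name `StripedTable`, its methods, and the helper `fill_concurrently` are hypothetical names introduced for illustration only; the sketch shows the general idea and is not a production design.

```python
import threading


class StripedTable:
    """Toy hash table with one lock per bucket (illustrative assumption)."""

    def __init__(self, num_buckets=16):
        self.buckets = [dict() for _ in range(num_buckets)]
        self.locks = [threading.Lock() for _ in range(num_buckets)]

    def _index(self, key):
        # Map a key to the bucket (and lock) responsible for it.
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        i = self._index(key)
        with self.locks[i]:  # serializes only threads that hit bucket i
            self.buckets[i][key] = value

    def get(self, key, default=None):
        i = self._index(key)
        with self.locks[i]:
            return self.buckets[i].get(key, default)


def fill_concurrently(table, bases, count):
    """Start one thread per base; each stores `count` consecutive keys."""

    def worker(base):
        for k in range(base, base + count):
            table.put(k, k * 2)

    threads = [threading.Thread(target=worker, args=(b,)) for b in bases]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

With a single table-wide lock, every `put` would wait on every other `put`. Here two threads contend only when their keys map to the same bucket, at the cost of more locks for the programmer to reason about.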
Transactional memory has been proposed as an alternative solution, allowing threads to speculatively execute critical sections, called “transactions,” in parallel. If a conflict occurs at run-time, then threads stall or roll back their transactions and execute them again to resolve the conflict. Although this approach is promising, many experts believe that software transactional memory (STM) alone, in which transactions are implemented entirely in software to synchronize shared memory in multithreaded programs, incurs too much overhead to be used as a general solution.
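The execute, detect conflict, and re-execute cycle described above can be sketched for a single shared value. This is a deliberately simplified model and not a real STM: the names `VersionedCell`, `try_commit`, and `atomically` are hypothetical, and conflict detection is reduced to one version counter.

```python
import threading


class VersionedCell:
    """One shared value plus a version number used to detect conflicts."""

    def __init__(self, value):
        self.value = value
        self.version = 0
        self._commit_lock = threading.Lock()  # held only for brief read/commit steps

    def read(self):
        with self._commit_lock:
            return self.value, self.version

    def try_commit(self, expected_version, new_value):
        with self._commit_lock:
            if self.version != expected_version:
                return False  # conflict: another transaction committed first
            self.value = new_value
            self.version += 1
            return True


def atomically(cell, update):
    """Run `update` speculatively and retry until it commits.

    Returns the number of attempts that were needed.
    """
    attempts = 0
    while True:
        attempts += 1
        snapshot, version = cell.read()
        new_value = update(snapshot)  # speculative work, no lock held
        if cell.try_commit(version, new_value):
            return attempts
        # The speculative result is discarded here (the "roll back") and
        # the transaction is executed again.
```

Even in this toy form, every transaction pays for version bookkeeping and may repeat its work, which hints at the overhead attributed to software-only schemes.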
Moreover, transactional memory requires the application programmer to rewrite the application using transactional annotations that mark the beginning and end of transactions and, in some cases, shared memory accesses as well. This effort may not be substantial when developing new software, but for legacy applications with potentially thousands of lines of code (or more), it could be a significant undertaking.
Speculative Lock Elision (SLE) provides a method to execute critical sections in parallel as speculative transactions for code that was written to use locks. The programmer needs to add specific prefixes to the lock-acquire and lock-release operations to indicate where the critical section begins and ends. If the transaction aborts, the hardware rolls back execution to the beginning of the critical section, acquires the lock, and continues in non-speculative mode. Unfortunately, this implementation restricts execution to only one non-speculative critical section at a time: only the thread that acquired the lock can execute non-speculatively. Moreover, in order to guarantee correctness, no other speculative critical section can execute concurrently with the non-speculative one.
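The control flow described above can be approximated in software as a rough sketch. Real SLE is performed by the hardware. Here a version counter and an internal mutex (`_meta`) stand in for hardware conflict detection, and the class `ElidedCounter` and all of its members are hypothetical names. The model allows one speculative attempt and then falls back to the lock, and it refuses any speculative commit while the lock is held.

```python
import threading


class ElidedCounter:
    """Toy software model of lock elision over one shared integer."""

    def __init__(self):
        self.value = 0
        self.version = 0
        self.lock = threading.Lock()   # the lock the original code would acquire
        self._meta = threading.Lock()  # stand-in for hardware conflict detection
        self.speculative_commits = 0
        self.fallbacks = 0

    def _try_speculative(self, update):
        with self._meta:
            snapshot, version = self.value, self.version
        result = update(snapshot)  # critical-section body, lock not acquired
        with self._meta:
            # Abort if the data changed or a non-speculative section is active.
            if self.lock.locked() or self.version != version:
                return False
            self.value = result
            self.version += 1
            self.speculative_commits += 1
            return True

    def critical_section(self, update):
        if self._try_speculative(update):
            return
        # Abort path: drop the speculative result, acquire the lock, and
        # re-execute the body non-speculatively.
        with self.lock:
            with self._meta:
                snapshot = self.value
            result = update(snapshot)
            with self._meta:
                self.value = result
                self.version += 1
                self.fallbacks += 1
```

In this model at most one thread runs inside `with self.lock` at a time, and every speculative attempt that overlaps with it is forced to abort, which mirrors the restriction noted above.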