As many-core computer platforms become more common, support for practical parallel programming becomes more critical. However, as core performance increases, the time spent synchronizing between threads (e.g., tasks, code sequences, processes), i.e., the synchronization overhead, becomes a greater portion of execution time. In parallel programming, synchronization between threads is typically accomplished with locks. A lock is commonly understood to be a synchronization mechanism that enforces limits on access to a resource shared among multiple threads: a thread acquires the lock before accessing the corresponding resource and releases the lock after it has finished using the resource.
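The acquire/release discipline described above can be illustrated with a minimal sketch. It assumes Python's standard `threading` module purely for illustration; the names `counter`, `counter_lock`, `add_many`, and `run` are hypothetical and do not come from any particular system.

```python
import threading

counter = 0                      # shared resource (hypothetical example)
counter_lock = threading.Lock()  # lock protecting `counter`

def add_many(n):
    """Increment the shared counter n times, holding the lock for each access."""
    global counter
    for _ in range(n):
        with counter_lock:       # acquire before accessing the resource
            counter += 1         # access the resource
        # the lock is released when the `with` block exits

def run(num_threads=4, n=10_000):
    """Start several threads that all update the same counter."""
    global counter
    counter = 0
    threads = [threading.Thread(target=add_many, args=(n,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

Because every update happens while the lock is held, no increment is lost, at the cost of threads waiting on one another whenever they contend for the lock.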
On these many-core computer platforms, executing programs with traditional lock-based synchronization can create well-known problems that depend on lock granularity, which is a measure of the amount of data each lock protects. Coarse-grained locking uses a smaller number of locks, each of which protects a relatively large segment of data. However, coarse-grained locks increase overhead because there is a higher likelihood that a lock will stop a parallel thread from proceeding. Fine-grained locking instead uses a larger number of locks, each of which protects a relatively small amount of data, but it introduces additional complexity in avoiding deadlock and guaranteeing correct execution. A deadlock commonly refers to a state in which threads are blocked forever because of cyclic dependencies, for example when each of two threads holds a lock that the other needs.
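The extra care that fine-grained locking demands can be sketched as follows. This is an illustrative example only, assuming Python's `threading` module and hypothetical names (`Account`, `transfer`, `run_opposing_transfers`). Each account has its own lock; if two threads acquired the two locks in opposite orders, each could hold one lock while waiting for the other, which is the cyclic dependency described above. One common remedy, shown here, is to acquire locks in a fixed global order.

```python
import threading

class Account:
    """Hypothetical resource protected by its own fine-grained lock."""
    def __init__(self, ident, balance):
        self.ident = ident
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src, dst, amount):
    """Move `amount` from src to dst while holding both account locks."""
    if src is dst:
        return  # nothing to do; also avoids acquiring the same lock twice
    # Always lock the account with the smaller id first, so that two
    # opposing transfers can never wait on each other in a cycle.
    first, second = sorted((src, dst), key=lambda acct: acct.ident)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount

def run_opposing_transfers(rounds=1000):
    """Run a->b and b->a transfers concurrently and return final balances."""
    a, b = Account(1, 100), Account(2, 100)

    def repeat(src, dst):
        for _ in range(rounds):
            transfer(src, dst, 1)

    t1 = threading.Thread(target=repeat, args=(a, b))
    t2 = threading.Thread(target=repeat, args=(b, a))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return a.balance, b.balance
```

Without the sorting step, acquiring `src.lock` and then `dst.lock` directly could leave both threads blocked forever; the ordering rule is exactly the kind of added complexity that fine-grained locking places on the programmer.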
To address the above-mentioned difficulties with a lock-based instruction construct, a method of transactional memory has been proposed to simplify concurrency management by supporting parallel tasks as transactions, which appear to execute atomically and in isolation. Using transactional memory, programmers can achieve increased parallel performance with identified, coarse-grained transactions. Furthermore, transactions address other challenges of lock-based parallel execution such as deadlocks and robustness to failures.
With transactional memory, programmers define atomic code sequences, or transactions, which may include unstructured flow control and any number of memory references. The transactional memory system executes transactions correctly by generally providing: (1) atomicity: either the whole transaction executes or none of it does; (2) isolation: partial memory updates are not visible to other transactions; and (3) consistency: there appears to be a single transaction completion order across the whole system. If these conditions hold at the end of a transaction's execution, the transaction commits its writes to shared memory. If not, the transaction violates and its writes are rolled back.
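The commit-or-roll-back behavior described above can be sketched with a toy software transactional memory. This is a simplified illustration under stated assumptions, not the mechanism of any particular transactional memory system: the names `TVar`, `Transaction`, `Conflict`, `atomically`, and `run_increments` are hypothetical, writes are buffered privately until commit, and conflicts are detected by comparing per-variable version numbers.

```python
import threading

_commit_lock = threading.Lock()  # serializes snapshots and commits only

class Conflict(Exception):
    """Raised when a transaction observes data changed by another commit."""

class TVar:
    """Hypothetical transactional variable with a version counter."""
    def __init__(self, value):
        self.value = value
        self.version = 0

class Transaction:
    def __init__(self):
        self.reads = {}   # TVar -> version observed
        self.writes = {}  # TVar -> buffered value, invisible to other threads

    def read(self, tvar):
        if tvar in self.writes:              # read our own pending write
            return self.writes[tvar]
        with _commit_lock:                   # consistent (value, version) pair
            value, version = tvar.value, tvar.version
        if self.reads.setdefault(tvar, version) != version:
            raise Conflict()                 # changed since our first read
        return value

    def write(self, tvar, value):
        self.writes[tvar] = value            # isolation: buffered until commit

    def commit(self):
        with _commit_lock:
            for tvar, seen in self.reads.items():
                if tvar.version != seen:     # another transaction committed
                    raise Conflict()
            for tvar, value in self.writes.items():
                tvar.value = value           # atomicity: all writes publish
                tvar.version += 1

def atomically(fn):
    """Run fn(tx) as a transaction, retrying until it commits."""
    while True:
        tx = Transaction()
        try:
            result = fn(tx)
            tx.commit()
            return result
        except Conflict:
            continue  # buffered writes are discarded, i.e., rolled back

def run_increments(num_threads=4, n=500):
    """Increment one TVar from several threads using transactions."""
    total = TVar(0)

    def worker():
        for _ in range(n):
            atomically(lambda tx: tx.write(total, tx.read(total) + 1))

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total.value
```

In this sketch a transaction that violates simply drops its private write buffer and re-executes, so shared memory only ever reflects the writes of committed transactions.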
For legacy programs that are already implemented with a lock-based instruction construct, there are opportunities to expose more concurrency by converting the lock-based instruction construct to a transactional instruction construct. Therefore, it is desirable to convert the lock-based instruction construct into a transactional instruction construct using a translator. However, converting these lock-based sections to transactions correctly is a challenge. Even after conversion to transactional memory, there is still a chance of deadlock among the transactions converted from lock-based sections.