Transactional Memory (TM) is a programming model intended to simplify multithreaded programming by grouping accesses to data shared between several threads into transactions. Each transaction executes atomically: it either succeeds and commits its changes to the data store, or aborts and restarts. In addition, transactions are isolated from one another, so that each transaction sees a consistent view of memory. In other words, TM enables a series of read and write operations to complete atomically, similar to an atomic compare-and-swap operation. A transaction should be aborted if completing it could leave the system in an inconsistent state because of concurrent reads or writes by other transactions.
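The commit-or-abort-and-restart behavior described above can be illustrated with a minimal software sketch. It is not any particular TM system: the names ToyTM, Tx, Abort, atomically, transfer and run_demo are hypothetical, and the version-counter validation scheme is an assumption chosen for brevity. Each transaction buffers its writes, records the version of every word it reads, and commits only if none of those words has changed in the meantime.

```python
import threading


class Abort(Exception):
    """Raised when a transaction observes a conflicting concurrent commit."""


class ToyTM:
    """Hypothetical word-addressed memory; each word carries a version counter."""

    def __init__(self, size):
        self.data = [0] * size
        self.version = [0] * size
        self.lock = threading.Lock()  # serializes validation and commits


class Tx:
    def __init__(self, tm):
        self.tm = tm
        self.reads = {}   # address -> version observed at first read
        self.writes = {}  # address -> buffered new value

    def _valid(self):
        # Caller holds tm.lock. True if nothing this transaction read has changed.
        return all(self.tm.version[a] == v for a, v in self.reads.items())

    def read(self, addr):
        if addr in self.writes:          # read-your-own-writes
            return self.writes[addr]
        with self.tm.lock:
            if not self._valid():        # keep the view consistent on every read
                raise Abort()
            self.reads.setdefault(addr, self.tm.version[addr])
            return self.tm.data[addr]

    def write(self, addr, value):
        self.writes[addr] = value        # invisible to other threads until commit

    def commit(self):
        with self.tm.lock:
            if not self._valid():
                raise Abort()
            for addr, value in self.writes.items():
                self.tm.data[addr] = value
                self.tm.version[addr] += 1


def atomically(tm, body):
    """Run body(tx) until it commits; an aborted attempt leaves no trace."""
    while True:
        tx = Tx(tm)
        try:
            result = body(tx)
            tx.commit()
            return result
        except Abort:
            continue  # restart the transaction from scratch


def transfer(tm, src, dst, amount):
    def body(tx):
        tx.write(src, tx.read(src) - amount)
        tx.write(dst, tx.read(dst) + amount)
    atomically(tm, body)


def run_demo(threads=4, iterations=200):
    tm = ToyTM(2)
    atomically(tm, lambda tx: tx.write(0, 1000))

    def worker():
        for _ in range(iterations):
            transfer(tm, 0, 1, 1)

    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return tm.data[0], tm.data[1]
```

With the default arguments, run_demo() performs 800 transfers of one unit each and returns (200, 800): no update is lost, because any transaction whose reads were overtaken by another commit is aborted and re-executed. A single global lock is used here only to keep the sketch short; it says nothing about how a scalable TM system would be built.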
TM system proposals range from hardware to software and hardware-software co-designs. When multi-core and many-core processors emerged, TM innovation began to focus on the scalability of TM systems and on interoperation between different TM systems. TM has been implemented in consumer products such as the Haswell processors and their successors, from Intel Corporation of Santa Clara, Calif., United States.
A Graphics Processing Unit (GPU) is a throughput-oriented computing device characterized by large arithmetic density, high memory bandwidth and a high degree of parallelism. GPU design is evolving towards a general-purpose computing device, with growing support for irregular workloads and data structures that are traditionally non-GPU oriented.
Recently, hardware-based TM systems for GPUs have been proposed, offering performance comparable to fine-grained locking (synchronization between threads in thread blocks) while being as easy to use as coarse-grained locking (synchronization between threads), making TM a competitive tool for exploiting the full potential of GPUs. Most existing TM systems implement a two-phase locking (2PL) concurrency control mechanism, which aborts transactions on both write-read conflicts and write-write conflicts.
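The two kinds of conflict that 2PL aborts on can be made concrete with a small lock-table sketch. This is an illustration under stated assumptions, not the mechanism of any specific TM system: LockTable, Conflict and attempt are hypothetical names, the table is single-threaded for clarity, and a conflicting request aborts immediately rather than waiting (a "no-wait" variant). Any reader/writer pair on the same address is reported here as a write-read conflict.

```python
class Conflict(Exception):
    """Signals that the requesting transaction must abort."""

    def __init__(self, kind, addr):
        super().__init__(f"{kind} conflict on address {addr}")
        self.kind = kind


class LockTable:
    """Hypothetical per-address lock table for no-wait two-phase locking."""

    def __init__(self):
        self.readers = {}  # address -> set of transactions holding a shared lock
        self.writer = {}   # address -> transaction holding the exclusive lock

    def acquire_read(self, tx, addr):
        owner = self.writer.get(addr)
        if owner is not None and owner != tx:
            raise Conflict("write-read", addr)   # another tx wrote this address
        self.readers.setdefault(addr, set()).add(tx)

    def acquire_write(self, tx, addr):
        owner = self.writer.get(addr)
        if owner is not None and owner != tx:
            raise Conflict("write-write", addr)  # another tx wrote this address
        if self.readers.get(addr, set()) - {tx}:
            raise Conflict("write-read", addr)   # another tx read this address
        self.writer[addr] = tx

    def release_all(self, tx):
        # Shrinking phase: every lock is dropped together at commit or abort.
        for holders in self.readers.values():
            holders.discard(tx)
        for addr in [a for a, w in self.writer.items() if w == tx]:
            del self.writer[addr]


def attempt(action, tx, addr):
    """Return "ok", or the kind of conflict that would abort tx."""
    try:
        action(tx, addr)
        return "ok"
    except Conflict as conflict:
        return conflict.kind
```

For example, after attempt(table.acquire_write, "T1", 0) succeeds on a fresh table, attempt(table.acquire_read, "T2", 0) returns "write-read" and attempt(table.acquire_write, "T2", 0) returns "write-write"; both requests succeed once T1 has called release_all.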