This disclosure relates generally to the field of Transactional Memory (TM) execution, and more specifically to the modification of code to optimize the processing of multiple memory transactions as a single transaction.
The number of central processing unit (CPU) cores on a chip and the number of CPU cores connected to a shared memory continues to grow significantly to support growing workload capacity demand. The increasing number of CPUs cooperating to process the same workloads puts a significant burden on software scalability; for example, shared queues or data-structures protected by traditional semaphores become hot spots and lead to sub-linear n-way scaling curves. Traditionally this has been countered by implementing finer-grained locking in software, and with lower latency/higher bandwidth interconnects in hardware. Implementing fine-grained locking to improve software scalability can be very complicated and error-prone, and at today's CPU frequencies, the latencies of hardware interconnects are limited by the physical dimension of the chips and systems, and by the speed of light.
Implementations of hardware Transactional Memory (HTM, or in this discussion, simply TM) have been introduced, wherein a group of instructions—called a transaction—operate in an atomic manner on a data structure in memory, as viewed by other central processing units (CPUs) and the I/O subsystem (atomic operation is also known as “block concurrent” or “serialized” in other literature). The transaction executes optimistically without obtaining a lock, but may need to abort and retry the transaction execution if an operation, of the executing transaction, on a memory location conflicts with another operation on the same memory location. Previously, software transactional memory implementations have been proposed to support software Transactional Memory (TM). However, hardware TM can provide improved performance aspects and ease of use over software TM.
U.S. Pat. No. 7,730,286 titled “Software Assisted Nested Hardware Transactions” filed Dec. 30, 2005, incorporated by reference herein teaches a method and apparatus for efficiently executing nested transactions. Hardware support for execution of transactions is provided. Additionally, through the use of logging previous values immediately before a current nested transaction in a local memory and storage of a stack of handlers associated with a hierarchy of transactions, nested transactions are potentially efficiently executed. Upon a failure, abort, or invalidating event/access within a nested transaction, the state of variables or memory locations written to during execution of the nested transaction are rolled-back to immediately before the nested transaction, instead of all the way back to an original state of the variables or memory locations before an enclosing transaction. As a result, nested transactions may be re-executed within enclosing transactions, without flattening the enclosing and nested transactions to re-execute everything.
US Patent Application No. 2010/0205408A1, filed Apr. 20, 2010 titled “Speculative Region: Hardware Support for Selective Transactional Memory Access Annotation Using Instruction Prefix” incorporated by reference herein teaches a computer system and method for executing selectively annotated transactional regions. The system is configured to determine whether an instruction within a plurality of instructions in a transactional region includes a given prefix. The prefix indicates that one or more memory operations performed by the processor to complete the instruction are to be executed as part of an atomic transaction. The atomic transaction can include one or more other memory operations performed by the processor to complete one or more others of the plurality of instructions in the transactional region.