The present disclosure relates generally to transactional memory systems and, more specifically, to a transactional memory system implementing nested transactions.
The number of central processing unit (CPU) cores on a chip and the number of CPU cores connected to a shared memory continue to grow significantly to support growing workload capacity demand. The increasing number of CPUs cooperating to process the same workloads puts a significant burden on software scalability. For example, shared queues or data structures protected by traditional semaphores become hot spots and lead to sub-linear n-way scaling curves. Traditionally, this has been countered by implementing finer-grained locking in software and by providing lower-latency, higher-bandwidth interconnects in hardware. Implementing fine-grained locking to improve software scalability can be very complicated and error-prone. In addition, at today's CPU frequencies, the latencies of hardware interconnects are limited by the physical dimensions of the chips and systems, and by the speed of light.
Implementations of hardware Transactional Memory (HTM, or in this discussion, simply TM) systems have been introduced to counter the burden on software scalability that has emerged from this growing workload capacity demand. Transactional memory systems utilize a group of instructions, typically referred to as a transaction, to operate in an atomic manner on a data structure in memory, as viewed by other central processing units (CPUs) and the I/O subsystem (atomic operation is also known as “block concurrent” or “serialized” in other literature). During operation, a transaction executes optimistically without obtaining a lock, but may need to abort and retry if an operation of the executing transaction on a memory location conflicts with another operation on the same memory location. Software Transactional Memory (TM) implementations have previously been proposed. However, hardware TM can provide improved performance and ease of use over software TM.
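By way of illustration only, the optimistic execute, abort, and retry behavior described above may be sketched in software as follows. This is a minimal, single-threaded model and not a description of any particular hardware TM facility. The names `VersionedMemory`, `Conflict`, and `run_transaction`, as well as the per-location version counters used to detect a conflict, are assumptions introduced solely for this sketch. The interfering access from another CPU is simulated by a direct write performed during the first attempt.

```python
class Conflict(Exception):
    """Raised when a location read by the transaction changed before commit."""


class VersionedMemory:
    """Hypothetical shared memory in which every location carries a version."""

    def __init__(self):
        self.values = {}
        self.versions = {}

    def read(self, addr):
        # Return the current value and version (both default to 0).
        return self.values.get(addr, 0), self.versions.get(addr, 0)

    def commit(self, read_set, write_set):
        # Validate: every location read must still be at the version seen.
        for addr, seen_version in read_set.items():
            if self.versions.get(addr, 0) != seen_version:
                raise Conflict(addr)
        # Publish all buffered writes together and bump their versions.
        for addr, value in write_set.items():
            self.values[addr] = value
            self.versions[addr] = self.versions.get(addr, 0) + 1


def run_transaction(memory, body, max_attempts=10):
    """Run body optimistically, without a lock; abort and retry on conflict.

    Returns the number of attempts needed to commit.
    """
    for attempt in range(1, max_attempts + 1):
        read_set, write_set = {}, {}

        def load(addr):
            if addr in write_set:  # read-your-own-writes
                return write_set[addr]
            value, version = memory.read(addr)
            read_set.setdefault(addr, version)
            return value

        def store(addr, value):
            write_set[addr] = value  # buffered until commit

        body(load, store)
        try:
            memory.commit(read_set, write_set)
            return attempt
        except Conflict:
            continue  # abort: discard the buffered writes and retry
    raise RuntimeError("transaction did not commit")


# Usage: increment a counter while another writer interferes exactly once.
memory = VersionedMemory()
memory.commit({}, {"counter": 5})
interfered = []


def increment(load, store):
    value = load("counter")
    if not interfered:
        interfered.append(True)
        memory.commit({}, {"counter": 100})  # simulated write by another CPU
    store("counter", value + 1)


attempts = run_transaction(memory, increment)
```

In this sketch the first attempt is aborted because the counter was modified after it was read, and the second attempt commits against the updated value. A hardware TM would typically detect such a conflict through its cache coherence mechanism rather than through explicit version counters.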
Multiprocessing systems have been employed to offer selective pairing of processor cores for increased processing reliability. A selective pairing facility is provided that selectively connects, i.e., pairs, multiple microprocessors or processor cores to provide a highly reliable thread (or thread group). Each pair of microprocessors or processor cores provides a highly reliable thread for high-reliability connection with system components such as a memory hierarchy, an optional system controller, an optional interrupt controller, optional I/O or peripheral devices, etc. Each processor core includes a transactional execution facility, wherein the system is configured to enable processor rollback to a previous state responsive to a transaction abort, e.g., either to recover from transaction interference, when an incorrect execution has been detected due to interference from another memory operation not corresponding to the transaction, or in response to a transaction abort instruction.
Recent trends in transactional memory architecture have led to a desire to incorporate nested transactions. Conventional implementations of nesting employ a “flattened nesting” technique, where multiple levels of nested transactions are combined into a single level, i.e., a single transaction. In such an implementation, nesting is only used to track when the “super-transaction” that subsumes all nested transactions (which are flattened into the super-transaction) ends. Consequently, an unnecessarily large rollback results when interference is detected with a nested transaction, because the entire super-transaction is rolled back rather than only the inferior transaction (i.e., the nested transaction).
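The rollback cost of flattened nesting may be illustrated with the following minimal sketch. The class `FlattenedTransaction` models the conventional technique with a single write buffer and a nesting depth counter. The class `CheckpointedTransaction` is a hypothetical contrast, introduced only for comparison, that keeps one write buffer per nesting level so that an abort discards only the innermost level. Both class names, the dictionary-based memory, and the helper `outer_then_inner` are assumptions made for this sketch.

```python
class FlattenedTransaction:
    """Flattened nesting: one write buffer shared by all nesting levels."""

    def __init__(self, memory):
        self.memory = memory
        self.depth = 0    # used only to find the end of the super-transaction
        self.buffer = {}

    def begin(self):
        self.depth += 1

    def store(self, addr, value):
        self.buffer[addr] = value

    def end(self):
        self.depth -= 1
        if self.depth == 0:  # the outermost end commits everything
            self.memory.update(self.buffer)
            self.buffer = {}

    def abort(self):
        # The entire super-transaction is rolled back, whatever the depth.
        discarded = len(self.buffer)
        self.buffer = {}
        self.depth = 0
        return discarded


class CheckpointedTransaction:
    """Hypothetical contrast: one write buffer per nesting level."""

    def __init__(self, memory):
        self.memory = memory
        self.levels = []

    def begin(self):
        self.levels.append({})

    def store(self, addr, value):
        self.levels[-1][addr] = value

    def end(self):
        finished = self.levels.pop()
        if self.levels:
            self.levels[-1].update(finished)  # merge into the parent level
        else:
            self.memory.update(finished)      # outermost end commits

    def abort(self):
        # Only the innermost (inferior) transaction is rolled back.
        return len(self.levels.pop())


def outer_then_inner(tx):
    """Three stores in the outer transaction, one in the nested one, then abort."""
    tx.begin()
    tx.store("a", 1)
    tx.store("b", 2)
    tx.store("c", 3)
    tx.begin()
    tx.store("d", 4)
    return tx.abort()  # interference detected inside the nested transaction


flat_discarded = outer_then_inner(FlattenedTransaction({}))
nested_discarded = outer_then_inner(CheckpointedTransaction({}))
```

In this scenario the flattened model discards all four buffered stores, whereas the per-level model discards only the single store made by the nested transaction and leaves the outer transaction free to continue.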