One or more aspects relate, in general, to multiprocessing computing environments, and in particular, to transactional processing within such computing environments.
The number of central processing unit (CPU) cores on a chip and the number of CPU cores connected to a shared memory continue to grow significantly to support growing workload capacity demand. The increasing number of CPUs cooperating to process the same workloads puts a significant burden on software scalability. For example, shared queues or data structures protected by traditional semaphores become hot spots and lead to sub-linear n-way scaling curves. Traditionally, this has been countered by implementing finer-grain locking in software, and with lower-latency/higher-bandwidth interconnects in hardware. Implementing fine-grain locking to improve software scalability can be very complicated and error-prone, and at today's CPU frequencies, the latencies of hardware interconnects are limited by the physical dimensions of the chips and systems, and by the speed of light.
Implementations of hardware transactional memory (TM) have been introduced, wherein a group of instructions, called a transaction, operates atomically and in isolation (sometimes called “serializability”) on a data structure in memory. The transaction executes optimistically without obtaining a lock, but may need to abort and retry if an operation of the executing transaction on a memory location conflicts with another operation on the same memory location. A conflict may occur, for instance, when one processor core writes data that another processor core is reading. A transactional core conventionally does not have any means to avoid or defer a conflict, and thus, it aborts the transaction.
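By way of illustration only, the optimistic execute, detect-conflict, abort, and retry behavior described above may be sketched in software. The sketch below is an analogy and not a hardware TM implementation: it assumes a per-location version counter for conflict detection, whereas hardware typically detects conflicts through the cache coherence protocol. The names VersionedCell, try_commit, and run_transaction are hypothetical, and the short internal lock stands in for the atomic commit step that hardware would provide.

```python
import threading


class VersionedCell:
    """Hypothetical memory location with a version counter for conflict detection."""

    def __init__(self, value):
        self.value = value
        self.version = 0
        # Assumption: this lock only models the atomicity of the commit step;
        # the body of a transaction runs without holding it.
        self._commit_lock = threading.Lock()

    def read(self):
        """Return a consistent (value, version) snapshot."""
        with self._commit_lock:
            return self.value, self.version

    def try_commit(self, expected_version, new_value):
        """Commit only if no other writer has committed since the snapshot."""
        with self._commit_lock:
            if self.version != expected_version:
                return False  # conflict detected: caller must abort
            self.value = new_value
            self.version += 1
            return True


def run_transaction(cell, update, max_retries=1000):
    """Run `update` optimistically, aborting and retrying on conflict.

    Returns the number of aborts that occurred before the commit succeeded.
    """
    aborts = 0
    for _ in range(max_retries):
        value, version = cell.read()       # begin: snapshot without a lock held
        new_value = update(value)          # speculative work
        if cell.try_commit(version, new_value):
            return aborts                  # commit succeeded
        aborts += 1                        # conflict: discard the work and retry
    raise RuntimeError("transaction did not commit within max_retries")
```

In this sketch, a transaction that loses a conflict has no way to defer or avoid it; it simply discards its speculative result and starts over, which mirrors the conventional behavior described above.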