This disclosure relates generally to transaction processing in a multi-processor computing environment with transactional memory, and more specifically to instructions executed within a transaction that alter the function of later-executed instructions within the same transaction.
The number of central processing unit (CPU) cores on a chip and the number of CPU cores connected to a shared memory continue to grow significantly to support growing workload capacity demand. The increasing number of CPUs cooperating to process the same workloads puts a significant burden on software scalability; for example, shared queues or data structures protected by traditional semaphores become hot spots and lead to sub-linear n-way scaling curves. Traditionally this has been countered by implementing finer-grained locking in software, and with lower-latency, higher-bandwidth interconnects in hardware. Implementing fine-grained locking to improve software scalability can be very complicated and error-prone. In addition, at today's CPU frequencies, the latencies of hardware interconnects are limited by the physical dimensions of the chips and systems, and by the speed of light.
Implementations of hardware Transactional Memory (HTM, or in this discussion, simply TM) have been introduced, wherein a group of instructions, called a transaction, operates in an atomic manner on a data structure in memory, as viewed by other CPUs and the I/O subsystem (atomic operation is also known as “block concurrent” or “serialized” in other literature). The transaction executes optimistically without obtaining a lock, but may need to abort and retry if one of its operations on a memory location conflicts with another operation on the same memory location. Software-only implementations of transactional memory (software TM) were proposed earlier. However, hardware TM can provide better performance and greater ease of use than software TM.
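The optimistic execute, abort, and retry behavior described above can be illustrated with a minimal software sketch. It is only an analogy and not a hardware TM implementation: the names `VersionedCell` and `run_transaction` are hypothetical, and conflict detection is assumed to use a per-location version counter that is checked at commit time.

```python
import threading


class VersionedCell:
    """A memory location plus a version counter used to detect conflicts."""

    def __init__(self, value):
        self.value = value
        self.version = 0
        # Guards only the short commit step, not the speculative work.
        self._commit_lock = threading.Lock()


def run_transaction(cell, update, max_retries=10):
    """Apply update(value) to cell atomically; return the number of aborts."""
    for aborts in range(max_retries):
        # Speculative phase: read and compute without holding a lock.
        seen_version = cell.version
        new_value = update(cell.value)
        # Commit phase: publish only if no other writer committed meanwhile.
        with cell._commit_lock:
            if cell.version == seen_version:
                cell.value = new_value
                cell.version += 1
                return aborts
        # Conflict detected: discard new_value and retry from the start.
    raise RuntimeError("transaction did not commit")


# Usage: an uncontended transaction commits on the first attempt.
counter = VersionedCell(5)
uncontended_aborts = run_transaction(counter, lambda v: v + 1)
```

In this sketch the uncontended case never waits on a lock while it computes. The cost of a conflict is the repeated work of a retry, which is the trade-off the optimistic approach accepts.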
U.S. Pat. No. 6,014,735 titled “Instruction Set Extension Using Prefixes” filed 1998 Mar. 31, incorporated herein by reference in its entirety, teaches a method and apparatus for encoding an instruction in an instruction set which uses a prefix code to qualify an existing operation code (opcode) of an existing instruction. An opcode and an escape code are selected. The escape code is selected such that it is different from the prefix code and the existing opcode. The opcode, the escape code, and the prefix code are combined to generate an instruction code which uniquely represents the operation performed by the instruction.
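The combination of codes described above can be sketched as follows. This is an illustration under stated assumptions and not the encoding defined in the referenced patent: the function name `encode_instruction`, the byte order (prefix, then escape code, then opcode), and all byte values are hypothetical. For simplicity, the selected opcode is treated as the existing opcode that the prefix qualifies.

```python
def encode_instruction(opcode, escape_code, prefix_code=None):
    """Combine the codes into the byte sequence of one instruction.

    Assumed layout: [prefix] escape opcode. The prefix is optional so that
    the unqualified and the qualified forms can be compared.
    """
    # The escape code must differ from the prefix code and from the opcode.
    if escape_code == prefix_code or escape_code == opcode:
        raise ValueError("escape code must differ from prefix code and opcode")
    parts = [escape_code, opcode]
    if prefix_code is not None:
        parts.insert(0, prefix_code)
    return bytes(parts)


# Illustrative byte values (assumptions, not taken from the patent).
ESCAPE = 0x0F
OPCODE = 0x58
PREFIX = 0xF3

base = encode_instruction(OPCODE, ESCAPE)               # unqualified form
qualified = encode_instruction(OPCODE, ESCAPE, PREFIX)  # prefix-qualified form
```

Because the prefix changes the resulting byte sequence, the same opcode can stand for two distinct operations while each instruction code remains unique.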
U.S. Patent Application Publication No. 2010/0205408 titled “Speculative Region: Hardware Support for Selective Transactional Memory Access Annotation Using Instruction Prefix” filed 2010 Apr. 20, incorporated herein by reference in its entirety, teaches a computer system and method for executing selectively annotated transactional regions. The system is configured to determine whether an instruction within a plurality of instructions in a transactional region includes a given prefix. The prefix indicates that one or more memory operations performed by the processor to complete the instruction are to be executed as part of an atomic transaction. The atomic transaction can include one or more other memory operations performed by the processor to complete one or more others of the plurality of instructions in the transactional region.
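The selective annotation described above can be sketched as a simple classification step. This is a hypothetical model and not the mechanism of the referenced publication: the `Instruction` record, its fields, and the `split_region` helper are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    mnemonic: str
    address: int      # memory location touched by the instruction
    has_prefix: bool  # True if the instruction carries the given prefix


def split_region(region):
    """Return (transactional, plain) memory addresses for a region."""
    # Prefixed instructions contribute their memory operations to the
    # atomic transaction; the others are treated as ordinary accesses.
    transactional = [i.address for i in region if i.has_prefix]
    plain = [i.address for i in region if not i.has_prefix]
    return transactional, plain


region = [
    Instruction("load", 0x1000, True),
    Instruction("store", 0x2000, False),
    Instruction("store", 0x1008, True),
]
tx_addresses, plain_addresses = split_region(region)
```

Only the addresses in `tx_addresses` would need to be tracked for conflicts in this model, which is the point of annotating instructions selectively.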