This disclosure relates generally to cache management in a multiprocessor computing environment, and more specifically to transactional memory concurrency control mechanisms.
The number of central processing unit (CPU) cores on a chip and the number of CPU cores connected to a shared memory continue to grow significantly to support growing workload capacity demand. The increasing number of CPUs cooperating to process the same workloads puts a significant burden on software scalability; for example, shared queues or data structures protected by traditional semaphores become hot spots and lead to sub-linear n-way scaling curves. Traditionally this has been countered by implementing finer-grained locking in software, and with lower-latency, higher-bandwidth interconnects in hardware. Implementing fine-grained locking to improve software scalability can be very complicated and error-prone. In addition, at today's CPU frequencies, the latencies of hardware interconnects are limited by the physical dimensions of the chips and systems, and by the speed of light.
Implementations of hardware Transactional Memory (HTM, or in this discussion simply TM) have been introduced, wherein a group of instructions, called a transaction, operates in an atomic manner on a data structure in memory as viewed by other central processing units (CPUs) and the I/O subsystem (atomic operation is also known as “block concurrent” or “serialized” in other literature). The transaction executes optimistically without obtaining a lock, but may need to abort and retry if an operation of the executing transaction on a memory location conflicts with another operation on the same memory location. Software implementations of Transactional Memory (TM) have been proposed previously. However, hardware TM can provide better performance and greater ease of use than software TM.
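The optimistic execute, abort, and retry behavior described above can be illustrated with a minimal software sketch. This is an emulation under stated assumptions, not a hardware TM implementation and not the mechanism of this disclosure: the names `VersionedMemory`, `Transaction`, and `run_transaction` are hypothetical, conflicts are detected by comparing per-location version numbers at commit time, and a single lock is taken only for the brief commit step.

```python
import threading


class VersionedMemory:
    """Toy shared memory (hypothetical): each location holds (value, version)."""

    def __init__(self):
        self._cells = {}
        self._commit_lock = threading.Lock()  # held only while committing

    def read(self, addr):
        # Unwritten locations read as value 0 at version 0.
        return self._cells.get(addr, (0, 0))

    def try_commit(self, read_versions, writes):
        """Apply buffered writes if nothing the transaction read has changed."""
        with self._commit_lock:
            for addr, seen in read_versions.items():
                if self._cells.get(addr, (0, 0))[1] != seen:
                    return False  # conflict: another commit touched addr
            for addr, value in writes.items():
                _, version = self._cells.get(addr, (0, 0))
                self._cells[addr] = (value, version + 1)
            return True


class Transaction:
    """Tracks the locations read and buffers stores until commit."""

    def __init__(self, memory):
        self.memory = memory
        self.read_versions = {}
        self.writes = {}

    def load(self, addr):
        if addr in self.writes:  # read our own buffered store
            return self.writes[addr]
        value, version = self.memory.read(addr)
        self.read_versions.setdefault(addr, version)
        return value

    def store(self, addr, value):
        self.writes[addr] = value  # invisible to others until commit


def run_transaction(memory, body, max_retries=10000):
    """Run body optimistically; on conflict, discard its effects and retry."""
    for attempt in range(1, max_retries + 1):
        tx = Transaction(memory)
        body(tx)
        if memory.try_commit(tx.read_versions, tx.writes):
            return attempt  # number of attempts needed to commit
    raise RuntimeError("transaction did not commit")
```

A caller would pass a function such as `lambda tx: tx.store("counter", tx.load("counter") + 1)` to `run_transaction`. Because stores are buffered, an aborted attempt leaves no visible trace, which is the property that lets other observers see the transaction as atomic.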
U.S. Patent Application Publication No. US20080244354 A1, titled “Apparatus and method for redundant multi-threading with recovery,” published Oct. 2, 2008 and incorporated by reference herein, teaches:
    A method and apparatus for reducing the effect of soft errors in a computer system is provided. Soft errors are detected by combining software redundant threading and instruction duplication. Upon detection of a soft error, errors are recovered through the use of software check pointing/rollback technology. Reliable regions are identified by vulnerability profiling and redundant multi-threading is applied to the identified reliable regions. U.S. Patent Application Publication No. US20080244354 A1 (published Oct. 2, 2008).
U.S. Patent Application Publication No. US20120210162 A1, titled “State recovery and lockstep execution restart in a system with multiprocessor pairing,” published Aug. 16, 2012 and incorporated by reference herein, teaches:
    System, method and computer program product for a multiprocessing system to offer selective pairing of processor cores for increased processing reliability. A selective pairing facility is provided that selectively connects, i.e., pairs, multiple microprocessor or processor cores to provide one highly reliable thread (or thread group). Each paired microprocessor or processor cores that provide one highly reliable thread for high-reliability connect with a system components such as a memory “nest” (or memory hierarchy), an optional system controller, and optional interrupt controller, optional I/O or peripheral devices, etc. The memory nest is attached to a selective pairing facility via a switch or a bus. Each selectively paired processor core is includes a transactional execution facility, wherein the system is configured to enable processor rollback to a previous state and reinitialize lockstep execution in order to recover from an incorrect execution when an incorrect execution has been detected by the selective pairing facility. U.S. Patent Application Publication No. US20120210162 A1 (published Aug. 16, 2012).
U.S. Pat. No. 8,122,195, titled “Instruction for Prefetching Data and Releasing Cache Lines,” issued Feb. 21, 2012, teaches:
    A prefetch data machine instruction having an M field which performs a function on a cache line of data specifying an address of an operand. The operation comprises either prefetching a cache line of data from memory to a cache or reducing the access ownership of store and fetch or fetch only of the cache line in the cache or a combination thereof. The address of the operand is either based on a register value or the program counter value pointing to the prefetch data machine instruction. U.S. Pat. No. 8,122,195, at Abstract (issued Feb. 21, 2012).
U.S. Pat. No. 7,966,453, titled “Method and Apparatus for Active Software Disown of Cache Line's Exclusive Rights,” issued Jun. 21, 2011, teaches:
    A method in which software indicates to hardware of a processing system that its storage modification to a particular cache line is done, and will not be doing any modification for the time being. With this indication, the processor actively releases its exclusive ownership by updating its line ownership from exclusive to read-only (or shared) in its own cache directory and in the storage controller (SC). By actively giving up the exclusive rights, another processor can immediately be given exclusive ownership to that said cache line without waiting on any processor's explicit cross invalidate acknowledgement. This invention also describes the hardware design needed to provide this support. U.S. Pat. No. 7,966,453, at Abstract (issued Jun. 21, 2011).
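The ownership downgrade described in the abstract above can be pictured with a small toy model. This is only an illustrative sketch under simplifying assumptions, not the hardware design of the cited patent: the `StorageController` class, its method names, and the returned status strings are hypothetical, and the model tracks a single owner per cache line.

```python
from enum import Enum


class Ownership(Enum):
    EXCLUSIVE = "exclusive"
    READ_ONLY = "read-only"


class StorageController:
    """Toy directory (hypothetical): line address -> (owner CPU, ownership)."""

    def __init__(self):
        self.directory = {}

    def request_exclusive(self, cpu, line):
        owner, state = self.directory.get(line, (None, None))
        if owner is not None and owner != cpu and state is Ownership.EXCLUSIVE:
            # Another CPU still holds the line exclusively, so the requester
            # would have to wait for a cross-invalidate acknowledgement.
            return "wait-for-xi-ack"
        self.directory[line] = (cpu, Ownership.EXCLUSIVE)
        return "granted"

    def release_exclusive(self, cpu, line):
        # Software signals it is done modifying the line: downgrade the
        # owner from exclusive to read-only in the directory.
        owner, state = self.directory.get(line, (None, None))
        if owner == cpu and state is Ownership.EXCLUSIVE:
            self.directory[line] = (cpu, Ownership.READ_ONLY)
```

In this model, a second CPU requesting a line held exclusively by the first receives `"wait-for-xi-ack"`, whereas the same request made after the first CPU calls `release_exclusive` is granted immediately.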