A computer's processing unit executes a stream of instructions drawn from a program text. Each instruction specifies its successor: either the next instruction in sequence or, in the case of a branch or call, some other instruction. A processor therefore executes one instruction at a time (so-called pipelined and “out-of-order” processors violate this in their implementation, but preserve these semantics). A program generally compiles to a program text with a distinguished start instruction. In a C program, for example, the first instruction of the “main” function is the distinguished start instruction. The “processor context” that determines the sequence of instructions executed after this is often called a “thread of control,” or just a “thread.” Programs execute in an operating system process, which provides a virtual address space that allows each process to behave as if it had sole access to all the memory of a “virtual” machine. In addition to the virtual address space, the operating system process provides various per-process operating resources, such as file descriptors, and one or more threads. Traditional programs are single-threaded: they execute in a process with only a single thread of control.
A shared-memory multiprocessor has several processors sharing access to the same memory; a write by one processor may be observed by a subsequent read by another processor. Such a machine can be used by running several different programs, each in its own process, on the different processors. In this mode, the shared memory is not really used, since the processes each have separate address spaces. In another mode, however, a program may create several threads of control in the process in which it executes, and these threads may execute simultaneously on the multiple processors and communicate through the shared memory. (Such a multi-threaded, or concurrent, program may also execute on a uniprocessor, and in general a program may create more threads than there are available processors. One of the jobs of the operating system is to schedule execution of the runnable threads on the available processors. Thus a running thread may be interrupted at an arbitrary instruction to allow another thread to resume.)
This simultaneous, interleaved execution of instructions by the threads makes concurrent programming very difficult. As an analogy, imagine a deck of cards that has been separated so that all the red cards are in one pile and all the black cards are in a second pile. Each card represents an instruction and each pile represents a thread. Combine the piles using a bridge (riffle) shuffle. The order of the red cards relative to each other has not changed, nor has the order of the black cards, but the two colors have become interleaved. This is exactly what happens when threads execute concurrently. It should also be clear that there are a very large number of possible interleavings, each representing a possible execution. The program must work correctly for all such possible executions.
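The card analogy can be made concrete with a small sketch. The Python below is illustrative only; the names interleavings, run, red and black are hypothetical and do not come from any particular system. It enumerates every order-preserving merge of two two-instruction threads that each increment an unsynchronized shared counter, and records the final counter value for each schedule. For piles of m and n cards, the number of such merges is the binomial coefficient C(m+n, m).

```python
from math import comb

def interleavings(a, b):
    """Yield every merge of a and b that preserves the internal order of each."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

def run(schedule):
    """Execute one interleaving of two unsynchronized 'counter += 1' threads."""
    shared = {"counter": 0}
    registers = {"red": 0, "black": 0}  # each thread's private register
    for thread, op in schedule:
        if op == "load":
            registers[thread] = shared["counter"]
        else:  # "store": write back the privately incremented value
            shared["counter"] = registers[thread] + 1
    return shared["counter"]

# Each thread is a two-instruction program: load the counter, then store counter + 1.
red = [("red", "load"), ("red", "store")]
black = [("black", "load"), ("black", "store")]

schedules = list(interleavings(red, black))
results = [run(s) for s in schedules]

print(len(schedules), "schedules; distinct final values:", sorted(set(results)))
print("order-preserving merges of two 26-card piles:", comb(52, 26))
```

Under these assumptions, only the two schedules in which one thread runs to completion before the other starts leave the counter at 2; every other schedule loses an update, which is why correctness has to be argued for all interleavings rather than for a typical one.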
When threads execute in a concurrent computing environment, mechanisms are required to manage how each thread interacts with system resources such as shared memory. Software transactional memory (STM) is a concurrency control mechanism, analogous to database transactions, for controlling access to shared memory in concurrent computing. A transaction in the context of transactional memory is a piece of code that executes a series of reads and writes to shared memory, and does so atomically, with the entire transaction executing as if it were the only thread of control executing in the system. If transaction Tx1 observes any write by transaction Tx2, then it observes all writes by Tx2. A data location in the context of transactional memory is the particular segment of shared memory being accessed, such as a single object, a cache line (such as in C++), a page, a single word, etc. One type of concurrency control lock mode in transactional memory systems is optimistic concurrency control, or optimistic locking.
With optimistic concurrency control, the system attempts to make forward progress at the risk that a conflict will be detected later on. The transactional memory system performs automatic resolution of such conflicts, often by rolling back one of the conflicting transactions and re-executing it. Optimistic operations are relatively inexpensive when compared to pessimistic operations, since they just read and do not involve writes to shared locations (as taking a lock does). As the name implies, the hope for optimistic operations is that there are few conflicts. If this turns out to be false, then work will already have been wasted, and the system must discard that work and attempt to resolve the conflict.
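The read, validate and retry cycle described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the mechanism of any particular STM: VersionedCell, try_commit and atomic_update are hypothetical names, a single version number stands in for conflict detection, and a short lock held only during the commit step stands in for whatever atomic operation a real system would use.

```python
import threading

class VersionedCell:
    """Hypothetical shared location whose (version, value) pair is replaced as a whole."""

    def __init__(self, value):
        self._state = (0, value)
        self._commit_lock = threading.Lock()  # held only for the brief commit step

    def read(self):
        # Optimistic read: no lock is taken and nothing shared is written.
        return self._state

    def try_commit(self, seen_version, new_value):
        # Validate that nobody committed since our read, then publish.
        with self._commit_lock:
            version, _ = self._state
            if version != seen_version:
                return False  # conflict detected late; the caller's work is wasted
            self._state = (version + 1, new_value)
            return True

def atomic_update(cell, fn):
    """Re-execute fn until its result commits; return the number of attempts."""
    attempts = 0
    while True:
        attempts += 1
        version, value = cell.read()
        if cell.try_commit(version, fn(value)):
            return attempts

# A deterministic conflict: two updates computed from the same stale read.
cell = VersionedCell(10)
version, value = cell.read()
first = cell.try_commit(version, value + 1)  # commits, version becomes 1
stale = cell.try_commit(version, value + 5)  # rejected, based on version 0

# Concurrent use: conflicting attempts are retried, so no increment is lost.
counter = VersionedCell(0)

def worker():
    for _ in range(1000):
        atomic_update(counter, lambda v: v + 1)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
final_version, final_value = counter.read()
print(first, stale, final_value)
```

The design choice to note is that the cost of a conflict is paid only when one actually occurs: a reader never blocks a writer, but a transaction whose read has gone stale must discard its result and run again.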
One serious issue that optimistic concurrency control does not explicitly address can occur in privatization scenarios. Privatization-related problems may occur when a program has concurrent threads executing transactions that access the same shared memory locations, and one of these transactions privatizes some shared memory location. Privatization occurs when a transaction performs operations that make a shared memory location accessible only to the thread executing that transaction. For example, if the only reference to some object O is stored in some globally accessible queue Q, and transaction Tx1, executed by thread T1, removes the reference to O from Q and stores it into a local variable of T1, then Tx1 has privatized O to T1.
With some implementations of STM, privatization can cause unexpected results to occur. Some STM implementations have attempted to achieve high performance by combining optimistic reading with “in-place” writing, in which transactional writes are performed directly to a memory location. When these techniques are used to implement a program that performs privatization, the following scenario is possible. Some global location G contains a unique pointer to a shared data structure. Two threads execute transactions that attempt to access this data structure concurrently. Thread T1 executes transaction Tx1, which will read G and, if the pointer read is non-null, attempt to increment an integer in the data structure to which the pointer refers. Thread T2 executes transaction Tx2, which will copy G into a thread-local variable and set G to null. Thread T2 then accesses the data structure via the thread-local pointer variable, believing that it has successfully “privatized” the data structure by setting G to null. However, with optimistic reads and in-place writes, one possible execution has Tx1 read G first, observing a non-null value. Now Tx2 executes in its entirety. Tx2 has written a location, G, that Tx1 has read, thus “dooming” Tx1 to abort, but this will not be discovered until Tx1 attempts to commit. So Tx1 continues executing, incrementing a field in the data structure. This increment will be undone when Tx1 fails to commit, but from the point of view of the non-transactional code executing after Tx2 in thread T2, both this write and the write that performs the “undo” operation are “inexplicable”: they occur for no apparent reason, and may make the program run incorrectly.
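The scenario above can be replayed deterministically in a single-threaded sketch that steps through the problematic schedule by hand. The code below is a simulation under stated assumptions, not an STM implementation: Node, Memory, g_version and simulate are hypothetical names, a version number on G stands in for read validation, and an undo log stands in for rollback of in-place writes.

```python
class Node:
    """The shared data structure; count is the integer that Tx1 increments."""
    def __init__(self):
        self.count = 0

class Memory:
    """Global location G plus a version bumped by each committed write to G."""
    def __init__(self, node):
        self.g = node
        self.g_version = 0

def simulate():
    mem = Memory(Node())
    observed = []  # values seen by T2's non-transactional code

    # Tx1 (thread T1) begins: optimistic read of G, no lock taken.
    tx1_ptr, tx1_seen_version = mem.g, mem.g_version

    # Tx2 (thread T2) runs in its entirety: copy G locally, null it, commit.
    private = mem.g
    mem.g = None
    mem.g_version += 1  # Tx1 is now doomed, but has not yet validated

    # T2 continues non-transactionally, believing the node is private.
    observed.append(private.count)

    # Tx1 continues: in-place increment, with the old value saved for undo.
    undo_log = []
    if tx1_ptr is not None:
        undo_log.append((tx1_ptr, "count", tx1_ptr.count))
        tx1_ptr.count += 1

    observed.append(private.count)  # T2 sees a write it cannot explain

    # Tx1 attempts to commit: validation of its read of G fails, so roll back.
    tx1_committed = mem.g_version == tx1_seen_version
    if not tx1_committed:
        for obj, field, old in reversed(undo_log):
            setattr(obj, field, old)

    observed.append(private.count)  # the undo is a second unexplained write
    return observed, tx1_committed

observed, tx1_committed = simulate()
print(observed, tx1_committed)  # prints [0, 1, 0] False
```

From the point of view of thread T2, the count changes from 0 to 1 and back to 0 although T2 itself never writes it and no transaction that touched it ever committed.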
Another class of privatization-related problems involves “serialization anomalies.” As discussed previously, transactions simplify concurrent programming by providing the programmer the illusion that concurrent transactions execute in some serial order. In particular, if a read by transaction Tx2 observes a write by transaction Tx1, then Tx2 must be serialized after Tx1. A serialization anomaly occurs when transactions complete in an order different from their serialization order. When a program employs a privatization idiom, this can cause the non-transactional code executing in a thread after one of the transactions completes to observe “inexplicable” writes.