1. Field of the Invention
This invention relates generally concurrent access to shared objects, and more particularly, to a system and method for executing nested atomic blocks of code using split hardware transactions.
2. Description of the Related Art
In concurrent software designs and implementations, it is often important to ensure that one thread does not observe partial results of an operation that is concurrently being executed by another thread. Such assurances are important for practical and productive software development because, without them, it can be extremely difficult to reason about the interactions of concurrent threads.
Such assurances have often been provided by using locks to prevent other threads from accessing the data affected by an ongoing operation. Unfortunately, the use of locks may give rise to a number of problems, both in terms of software engineering and in terms of performance. First, the right “balance” of locking must be achieved, so that correctness can be maintained, without preventing access to an unnecessary amount of unrelated data (thereby possibly causing other threads to wait when they do not have to). Furthermore, if not used carefully, locks can result in deadlock, causing software to freeze up. Moreover, it is frequently the case that whenever a thread is delayed (e.g. preempted) while holding a lock, other threads must wait before being able to acquire that lock.
Transactional memory is a paradigm that allows the programmer to design code as if multiple locations can be accessed and/or modified in a single atomic step. As typically defined, a transactional memory interface allows a programmer to designate certain sequences of operations as “atomic blocks”, which are guaranteed by the transactional memory implementation to either take effect atomically and in their entirety (in which case they are said to succeed), or have no externally visible effect (in which case they are said to fail). Thus, with transactional memory, it may be possible in many cases to complete multiple operations with no possibility of another thread observing partial results, even without holding any locks. The transactional paradigm can significantly simplify the design of concurrent programs.
Transactional Memory (TM) allows programmers to use transactional or atomic blocks, which may be considered sequential code blocks that should be executed atomically. In other words, executions of atomic blocks by different threads do not appear to be interleaved. To execute an atomic block, the underlying system may begin a transaction, execute the atomic block's memory accesses using that transaction, and then may try to commit the transaction. If the transaction commits successfully, the atomic block's execution seems to take effect atomically at the transaction's commit point. If it fails, the execution does not seem to take effect at all and the atomic block might be retried using a new transaction. It is the responsibility of the TM implementation to guarantee the atomicity of operations executed by transactions.
Transactional memory is widely recognized as a promising paradigm for allowing a programmer to make updates to multiple locations in a manner that is apparently atomic, while addressing many of the problems associated with the use of locks. In general, transactional memory can be implemented in hardware, with the hardware directly ensuring that a transaction is atomic, or in software that provides the “illusion” that the transaction is atomic, even though in fact it is executed in smaller atomic steps by the underlying hardware.
TM can be implemented in hardware (HTM) or in software (STM). While HTM solutions are generally faster than STM ones, many of the traditional HTM implementations do not support certain operations or events (like context switches, interrupts, or even the entry code of a function) while executing a transaction. Generally, if any of these events happens while executing a hardware transaction, the transaction is aborted. Operations that cause hardware transactions to fail or abort may be referred to as Non-Hardware-Transactionable (NHT) operations.
Traditionally, systems implement or support only a single type of transactional memory implementation. Moreover, a programmer generally must know about, and write code to support, the particular interfaces for implementing transactional memory. Furthermore, even if a system supports a particular transactional memory implementation, that implementation may not guarantee to support all transactions. For example, a system may support a “best-effort” hardware transactional memory (HTM) implementation, but since the implementation is “best-effort” not all transactions may be guaranteed to be supported. Thus, a programmer may wish to include functionality to fall back to a more flexible, if slower, transactional memory implementation that may guarantee support for all transactions. In order to do so, the programmer may have to specifically write code to support both the faster “best-effort” implementation and the slower fallback implementation at every location in the application for which the programmer wishes to execute instructions atomically.
Additional complications may be introduced when blocks of code are nested sequentially or in parallel. Traditionally, systems that support transactional memory implementations are not configured to efficiently support atomic execution of such nested blocks. For example, in typical systems supporting transactional memory operations, if a nested child transaction fails, its parent transaction must also fail.