1. Field of the Invention
The present invention relates generally to coordination amongst execution sequences in a multiprocessor computer, and more particularly, to techniques for facilitating implementations of concurrent data structures and/or programs.
2. Description of the Related Art
Interest in atomic multi-location synchronization operations dates back at least to the Motorola MC68030 chip, which supported a double-compare-and-swap operation (DCAS). See generally, Motorola, MC68030 User's Manual, Prentice-Hall (1989). A DCAS operation generalizes a compare-and-swap (CAS) to allow atomic access to two locations. DCAS has also been the subject of recent research. See e.g., O. Agesen, D. Detlefs, C. Flood, A. Garthwaite, P. Martin, M. Moir, N. Shavit, and G. Steele, DCAS-based Concurrent Deques, Theory of Computing Systems, 35:349-386 (2002); D. Detlefs, P. Martin, M. Moir, and G. Steele, Lock-free Reference Counting, Distributed Computing, 15(4):255-271 (2002); and M. Greenwald, Non-Blocking Synchronization and System Design, Ph.D. Thesis, Stanford University Technical Report STAN-CS-TR-99-1624 (1999).
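By way of illustration only, the intended semantics of CAS and DCAS can be sketched as follows. This is a minimal sketch, not the MC68030 instruction or any implementation from the cited works. The `Cell` class, the function names `cas` and `dcas`, and the single global lock are assumptions introduced here. The lock merely stands in for hardware atomicity, so the sketch is blocking and only specifies behavior.

```python
import threading


class Cell:
    """A single shared memory location (hypothetical helper for this sketch)."""

    def __init__(self, value):
        self.value = value


# Assumption: one global lock models the atomicity that hardware would provide.
_atomic = threading.Lock()


def cas(loc, expected, new):
    """Atomically set loc to new if it currently holds expected."""
    with _atomic:
        if loc.value != expected:
            return False
        loc.value = new
        return True


def dcas(loc1, loc2, expected1, expected2, new1, new2):
    """Atomically update two independent locations if both hold their expected values."""
    with _atomic:
        if loc1.value != expected1 or loc2.value != expected2:
            return False  # neither location is modified on failure
        loc1.value = new1
        loc2.value = new2
        return True
```

The point of the sketch is that a DCAS either updates both locations or neither, even when the two locations are not adjacent in memory.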
In general, the implementation of concurrent data structures is much easier if one can apply atomic operations to multiple non-adjacent memory locations. However, despite the early MC68030 support for DCAS and despite some research interest in multi-location synchronization, current processor architectures, by and large, support atomic operations only on small, contiguous regions of memory (such as a single or double word).
As a result, the current literature offers two extremes of nonblocking software synchronization support for concurrent data structure design: intricate designs of specific structures based on single-location operations such as compare-and-swap (CAS), and general-purpose multi-location transactional memory implementations. While the former are sometimes efficient, they are invariably hard to extend and generalize. The latter are flexible and general, but typically costly.
In an early paper, Herlihy and Moss described transactional memory, a more general transactional approach where synchronization operations are executed as optimistic atomic transactions in hardware. See M. Herlihy and J. E. B. Moss, Transactional Memory: Architectural Support for Lock-free Data Structures, In Proc. 20th Annual International Symposium on Computer Architecture (1993).
Barnes proposed a software implementation of a K-location read-modify-write. See e.g., G. Barnes, A Method for Implementing Lock-free Shared Data Structures, In Proc. 5th ACM Symposium on Parallel Algorithms and Architectures, pp. 261-270 (1993). That implementation, as well as those of others (see e.g., J. Turek, D. Shasha, and S. Prakash, Locking without Blocking: Making Lock-based Concurrent Data Structure Algorithms Nonblocking, In Proc. 11th ACM Symposium on Principles of Database Systems, pp. 212-222 (1992); A. Israeli and L. Rappoport, Disjoint-Access-Parallel Implementations of Strong Shared Memory Primitives, In Proc. 13th Annual ACM Symposium on Principles of Distributed Computing, pp. 151-160 (1994)), was based on a cooperative method where threads recursively help all other threads until an operation completes. Unfortunately, this method introduces significant overhead: because chains of dependencies can exist among operations, redundant “helping” threads end up doing the work of other threads on unrelated locations.
Shavit and Touitou coined the term software transactional memory (STM) and presented the first lock-free implementation of an atomic multi-location transaction that avoided redundant “helping” in the common case, and thus significantly outperformed other lock-free algorithms. See N. Shavit and D. Touitou, Software Transactional Memory, Distributed Computing, 10(2):99-116 (1997). However, the described formulation of STM was restricted to “static” transactions, in which the set of memory locations to be accessed was known in advance.
Moir, Luchangco and Herlihy have described an obstruction-free implementation of a general STM that supports “dynamic” multi-location transactions. See commonly-owned, co-pending U.S. patent application Ser. No. 10/621,072, entitled “SOFTWARE TRANSACTIONAL MEMORY FOR DYNAMICALLY SIZABLE SHARED DATA STRUCTURES” filed 16 Jul. 2003 naming Mark S. Moir, Victor Luchangco and Maurice Herlihy as inventors. Moir, Luchangco and Herlihy have also described an obstruction-free implementation of a multi-location compare-and-swap (KCAS) operation, i.e., a k-location compare-and-swap on non-adjacent locations. See commonly-owned, co-pending U.S. patent application Ser. No. 10/620,747, entitled “OBSTRUCTION-FREE MECHANISM FOR ATOMIC UPDATE OF MULTIPLE NON-CONTIGUOUS LOCATIONS IN SHARED MEMORY” filed 16 Jul. 2003 naming Mark S. Moir, Victor Luchangco and Maurice Herlihy as inventors.
While such obstruction-free implementations can avoid helping altogether, thereby reducing algorithmic complexity and eliminating associated overheads, further reductions are desired. Indeed, the strong semantics of the aforementioned techniques, e.g., full multi-location transaction support, generally come at a cost. Further, full multi-location transaction support may be overkill for some important software applications such as linked-list manipulations. What is needed are reasonably efficient, though potentially weaker, multi-location operations that are general enough to reduce the design complexities of algorithms based on CAS alone.