1. Field of the Invention
The present invention relates generally to coordination amongst execution sequences in a multiprocessor computer, and more particularly, to structures and techniques for facilitating non-blocking implementations of shared data structures.
2. Description of the Related Art
A dictionary is an abstract data structure that associates with each of some number of keys a respective value. Depending on the exploitation, keys and values can be data items or data structures. Typical dictionary operations include: insert(k, v), which alters the dictionary so that it associates the value v with the key k; delete(k), which alters the dictionary so that it does not associate any value with the key k; and search(k), which returns the value that the dictionary associates with key k or an appropriate null value if the dictionary does not currently associate any value with key k.
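By way of illustration only, the three operations described above may be sketched as follows. The class name Dictionary, the use of a built-in Python dict as the backing store, and the use of None as the null value are assumptions made for this sketch and are not drawn from any particular implementation.

```python
class Dictionary:
    """Minimal sketch of the insert/delete/search operations."""

    def __init__(self):
        # Assumption: a built-in dict serves as the backing store.
        self._items = {}

    def insert(self, k, v):
        # Associate value v with key k, replacing any earlier value.
        self._items[k] = v

    def delete(self, k):
        # After this call, no value is associated with key k.
        self._items.pop(k, None)

    def search(self, k):
        # Return the associated value, or None as the null value.
        return self._items.get(k)
```

In this sketch, deleting a key that is not present is a no-op, which is one of several reasonable choices.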
There are, of course, a variety of alternate formulations of dictionaries that are useful for various purposes. Such formulations may provide different operational semantics, or differ in the details of their operations. Exemplary dictionary formulations, including an exemplary formulation (deleteGE(n)) in which a “greater than or equal to” key match criterion is employed for deletions, are described elsewhere herein. Numerous variations on the general theme are possible and, in general, dictionary data structures and suitable concrete implementations thereof are well known in the art. For example, a linked list of nodes that encode key-value pairs forms the basis of a number of suitable implementations. In some implementations, the list of key-value pairs may be sorted in increasing key order. Other implementations include those based on various forms of hash table, e.g., where a hash-table entry may contain both a key and a value.
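A sequential (non-concurrent) sketch of the sorted linked-list formulation mentioned above is given below. All names are hypothetical. In particular, the delete_ge method assumes that the "greater than or equal to" criterion means removing and returning the pair with the smallest key not less than the argument; the precise semantics of deleteGE are those described elsewhere herein, not those of this sketch.

```python
class ListNode:
    def __init__(self, key, value, next=None):
        self.key, self.value, self.next = key, value, next


class SortedListDictionary:
    """Key-value pairs kept in a singly linked list, in increasing key order."""

    def __init__(self):
        self.head = ListNode(None, None)  # sentinel; holds no pair

    def _find_pred(self, k):
        # Last node whose successor (if any) has a key less than k.
        pred = self.head
        while pred.next is not None and pred.next.key < k:
            pred = pred.next
        return pred

    def insert(self, k, v):
        pred = self._find_pred(k)
        if pred.next is not None and pred.next.key == k:
            pred.next.value = v          # key present: replace value
        else:
            pred.next = ListNode(k, v, pred.next)

    def delete(self, k):
        pred = self._find_pred(k)
        if pred.next is not None and pred.next.key == k:
            pred.next = pred.next.next   # unlink the matching node

    def search(self, k):
        node = self._find_pred(k).next
        return node.value if node is not None and node.key == k else None

    def delete_ge(self, k):
        # Assumed semantics: remove the first pair whose key is >= k.
        pred = self._find_pred(k)
        node = pred.next
        if node is None:
            return None
        pred.next = node.next
        return (node.key, node.value)
```

Because the list is sorted, the same predecessor search serves both exact-match and greater-than-or-equal matching.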
Still other implementations employ a tree structure, such as a binary tree, where key-value pairs are at the leaves of the tree and every node also contains a key, such that all leaves that are descendants of a node's left-hand child have keys that are less than the node's key and all leaves that are descendants of the node's right-hand child have keys that are not less than the node's key. This arrangement allows an insert, delete, or search operation to be performed in time proportional to the height of the tree. If the tree is balanced, then the height of the tree is proportional to the logarithm of the number of nodes in the tree, so that insert, delete, and search operations may be carried out relatively quickly. However, certain sequences of insert and/or delete operations may leave the tree badly unbalanced. There are many techniques in the computer science literature, known to those of skill in the art, for rebalancing trees as needed.
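The tree arrangement described above, with key-value pairs at the leaves and routing keys at the internal nodes, may be illustrated by the following search sketch. The class names Leaf and Internal and the function name tree_search are assumptions for illustration; no rebalancing is shown.

```python
class Leaf:
    def __init__(self, key, value):
        self.key, self.value = key, value


class Internal:
    def __init__(self, key, left, right):
        # Leaves under `left` have keys less than `key`;
        # leaves under `right` have keys not less than `key`.
        self.key, self.left, self.right = key, left, right


def tree_search(node, k):
    # Descend one level per iteration, so the cost is
    # proportional to the height of the tree.
    while isinstance(node, Internal):
        node = node.left if k < node.key else node.right
    if node is not None and node.key == k:
        return node.value
    return None


# A small balanced tree holding keys 1, 3, 5, and 7 at its leaves.
example_tree = Internal(
    5,
    Internal(3, Leaf(1, "a"), Leaf(3, "b")),
    Internal(7, Leaf(5, "c"), Leaf(7, "d")),
)
```

For example, tree_search(example_tree, 5) descends right at the root and left at the node with key 7, reaching the leaf that holds key 5.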
Pugh has proposed a data structure called a skip list, which he describes as a probabilistic alternative to balanced trees. See W. Pugh, Skip Lists: A Probabilistic Alternative to Balanced Trees, Communications of the ACM, 33(6):668-676, June 1990 and W. Pugh, Concurrent Maintenance of Skip Lists, CS-TR-2222.1, Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland, College Park, June 1990. We believe that a skip list, particularly as described elsewhere herein, is suitable for use as a concrete implementation of a dictionary.
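For orientation, a sequential skip list in the general style of Pugh's proposal may be sketched as follows. This sketch is not the shared, lock-free structure to which the present disclosure is directed, and it is not safe for concurrent use. The names SkipNode and SkipList, the maximum level of 16, and the promotion probability of 0.5 are assumptions chosen for illustration.

```python
import random


class SkipNode:
    def __init__(self, key, value, level):
        self.key, self.value = key, value
        self.forward = [None] * level    # one successor pointer per level


class SkipList:
    MAX_LEVEL = 16   # assumed cap on node height
    P = 0.5          # assumed probability of promoting a node one level

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self.head = SkipNode(None, None, self.MAX_LEVEL)  # sentinel
        self.level = 1                   # highest level currently in use

    def _random_level(self):
        level = 1
        while level < self.MAX_LEVEL and self._rng.random() < self.P:
            level += 1
        return level

    def _find_preds(self, k):
        # preds[i] is the last node at level i whose key is less than k.
        preds = [self.head] * self.MAX_LEVEL
        node = self.head
        for i in reversed(range(self.level)):
            while node.forward[i] is not None and node.forward[i].key < k:
                node = node.forward[i]
            preds[i] = node
        return preds

    def search(self, k):
        node = self._find_preds(k)[0].forward[0]
        return node.value if node is not None and node.key == k else None

    def insert(self, k, v):
        preds = self._find_preds(k)
        node = preds[0].forward[0]
        if node is not None and node.key == k:
            node.value = v               # key present: replace value
            return
        level = self._random_level()
        self.level = max(self.level, level)
        new = SkipNode(k, v, level)
        for i in range(level):           # splice in at each of its levels
            new.forward[i] = preds[i].forward[i]
            preds[i].forward[i] = new

    def delete(self, k):
        preds = self._find_preds(k)
        node = preds[0].forward[0]
        if node is None or node.key != k:
            return
        for i in range(len(node.forward)):   # unlink at each of its levels
            preds[i].forward[i] = node.forward[i]

    def keys(self):
        out, node = [], self.head.forward[0]
        while node is not None:
            out.append(node.key)
            node = node.forward[0]
        return out
```

The bottom level is an ordinary sorted linked list; the higher levels act as randomly chosen shortcuts, which is what gives the structure its expected logarithmic search cost without explicit rebalancing.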
In some computational environments, including those that could be employed to execute computations that use dictionaries implemented using a skip list, data structures are shared amongst multiple concurrent threads or processes. In such computations, it is desirable for the implementation to behave in a linearizable fashion; that is, to behave as if each operation on that data structure is performed atomically at some point between its invocation and its response. See M. Herlihy and J. Wing, Linearizability: A Correctness Condition for Concurrent Objects, ACM Transactions on Programming Languages and Systems, 12(3):463-492, July 1990, for a discussion of linearizability as a correctness criterion.
One way to achieve this property is with a mutual exclusion lock (sometimes called a semaphore). Indeed, Pugh as well as Lotan & Shavit (each summarized below) disclose skip list implementations that employ locks. In general, locking implementations can be understood as follows. When a process issues a request to perform an operation on a shared data structure, its action is to acquire the lock, which has the property that only one process may own it at a time. Once the lock is acquired, the operation is performed on the data structure; only after the operation has been completed is the lock released. This sequence clearly enforces the property of linearizability. However, it is also desirable for operations to interfere with each other as little as possible. For example, it is desirable that two search operations be almost entirely concurrent, rather than one search having to wait to begin until another is entirely finished. Both Pugh's technique and Lotan & Shavit's technique employ locks, though in ways that allow at least some concurrent operations to execute without interference.
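The general single-lock scheme described above (acquire the lock, perform the operation, release the lock) may be sketched as follows. This sketch illustrates only the coarse-grained case and does not represent the finer-grained techniques of Pugh or of Lotan & Shavit. The class name LockedDictionary and the dict backing store are assumptions for illustration.

```python
import threading


class LockedDictionary:
    """Every operation runs entirely while holding one shared lock."""

    def __init__(self):
        self._lock = threading.Lock()   # only one thread may hold it at a time
        self._items = {}                # assumption: dict as backing store

    def insert(self, k, v):
        with self._lock:                # acquire; released on block exit
            self._items[k] = v

    def delete(self, k):
        with self._lock:
            self._items.pop(k, None)

    def search(self, k):
        # Even a read-only search must wait for the lock, so two
        # searches cannot overlap under this scheme.
        with self._lock:
            return self._items.get(k)

    def size(self):
        with self._lock:
            return len(self._items)
```

Each operation takes effect at some point while the lock is held, which lies between its invocation and its response. The cost is that all operations are serialized, and a thread suspended while holding the lock prevents every other thread from proceeding.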
In particular, Pugh discloses a technique for implementing a skip list that may be shared by concurrent threads (concurrent processes). The technique includes algorithms that, rather than using a single lock for the entire data structure, associate respective locks with many different parts of the data structure and carefully choose which locks to lock at any given time, so as to ensure that one thread has exclusive access to the associated part of the data structure until that same thread performs a matching unlock operation, while not requiring exclusive access to other parts of the data structure. See W. Pugh, Concurrent Maintenance of Skip Lists (referenced above).
Lotan and Shavit disclose another technique for implementing a skip list that may be shared by concurrent threads and use it to implement a priority queue. Their technique involves allowing a thread to lock a node before deleting it so that the locking thread, and no other, may delete that node from the data structure. See I. Lotan & N. Shavit, Skiplist-Based Concurrent Priority Queues, in Proc. First International Parallel and Distributed Processing Symposium, Cancun, Mexico, May 2000.
Unfortunately, the use of locks has certain drawbacks. In particular, algorithms and shared data structure implementations that employ locks (even on a fine-grained basis) are vulnerable to the possibility that individual threads or processes may proceed at very different rates of execution and, in particular, that some thread or process might be suspended indefinitely. Accordingly, it can be highly desirable for an implementation of a shared data structure to be lock-free; that is, if a set of processes is using a shared data structure and an arbitrary subset of those processes is suspended indefinitely, it should still be possible for the remaining processes to make progress in performing operations on the shared data structure. What is needed is a lock-free linearizable implementation of a shared skip list.