1. Field of the Invention
The present invention generally relates to efficient support of synchronization in parallel processing and, more particularly, to methods for building instances of two novel data structures that are optimal in that they permit simultaneous access to multiple readers and one writer without using any synchronization constructs such as locks and/or any special instructions.
2. Background Description
A concurrent data structure is highly concurrent if it permits more than one request to enter and access it simultaneously. Concurrent data structures that are not highly concurrent commonly require a lock at the top of the data structure to serialize access to it. Locking out simultaneous access means that, of multiple simultaneous access requests, all but one must wait for their turn, either actively or passively. In active waiting, a request wastes computation cycles by polling or spinning on some condition until the condition becomes true. In passive waiting, a request tries to reduce this waste by, say, switching to useful computation before returning to attempt data-structure access, or by yielding or context switching the processor to some other computational thread or process. The trade-off between active waiting and passive waiting is between the switching or context-switch overhead of the latter and the wasted computation of the former. Independent of any cost incurred by a request in waiting for its turn is the cost the request always incurs in acquiring and releasing the lock associated with the data structure. The overhead of lock usage is significant, especially when multiple simultaneous attempts at data-structure access make the lock a hot spot of contention. The costs of active waiting, passive waiting, and lock manipulation are all part of the synchronization cost incurred by a request in accessing a concurrent data structure. Providing high concurrency in data structures, as opposed to serialized access, is motivated by the goal of reducing synchronization cost in parallel computation.
Success in achieving high concurrency has to be measured in terms of any gains made in reducing overall synchronization cost, which includes waiting cost, loss of parallel-computation opportunity in data-structure access, and serialization cost, which in turn includes lock and/or other synchronization-primitive overhead.
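The distinction between active and passive waiting drawn above can be illustrated with a minimal Python sketch. It is offered only as an illustration under stated assumptions and is not part of the invention: the function names `acquire_active`, `acquire_passive`, and `demo` are hypothetical, and a standard `threading.Lock` stands in for the lock at the top of a data structure. The active waiter polls the lock and counts the polls that accomplish nothing; the passive waiter blocks so that the scheduler may run another thread.

```python
import threading
import time


def acquire_active(lock):
    """Active waiting: poll until the lock is free; return the wasted polls."""
    wasted_polls = 0
    while not lock.acquire(blocking=False):  # non-blocking attempt
        wasted_polls += 1                    # a cycle spent on no useful work
    return wasted_polls


def acquire_passive(lock):
    """Passive waiting: block so the scheduler can run another thread."""
    lock.acquire()


def demo(hold_seconds=0.05):
    """Hold the lock briefly while one thread spins and another blocks."""
    lock = threading.Lock()
    results = {}

    def spinner():
        results["wasted_polls"] = acquire_active(lock)
        lock.release()

    def blocker():
        acquire_passive(lock)
        results["passive_acquired"] = True
        lock.release()

    lock.acquire()  # the current holder occupies the data structure
    threads = [threading.Thread(target=spinner),
               threading.Thread(target=blocker)]
    for t in threads:
        t.start()
    time.sleep(hold_seconds)  # both waiters are locked out during this time
    lock.release()
    for t in threads:
        t.join()
    return results
```

Calling `demo()` returns a dictionary whose `wasted_polls` entry counts the spinner's unsuccessful polls; the exact count depends on the scheduler and the hold time. In both cases the waiter still pays for one lock acquire and one release, which is the cost the text describes as independent of waiting.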
It is straightforward to provide highly-concurrent versions of purely read-only data structures. Since such a data structure does not change, it is not necessary to serialize changes to it. Thus, it does not matter how many requests access the data structure at the same time. Requests accessing the data structure do not have to acquire or release any locks or use other synchronization primitives in order to access it. The introduction of mutability to a data structure changes the situation completely. Unintended and possibly ill-defined mutations and readings of the data structure by interacting concurrent requests, each of which may behave correctly if viewed in isolation, have to be ruled out. Taking the cue from the high concurrency of read-only data structures, a step towards making a highly-concurrent version of a mutable data structure is providing a multiple-reader-single-writer version of the data structure. At its most permissive, a multiple-reader-single-writer version of a data structure allows multiple readers and one writer to access the data structure at the same time. A multiple-reader-single-writer data structure can also restrict simultaneous access to either multiple readers at one time or one writer alone at one time. A reader can be a pure reader, which means that it does not write any shared data at all, or it can be an impure reader, which means that it performs some auxiliary shared-data writes for, say, bookkeeping purposes. A multiple-reader-single-writer structure basically supports a broadcast paradigm: one writer writes or broadcasts at one time, and multiple readers receive the broadcast concurrently. Although the common multiple-reader-single-writer design often succeeds in increasing parallel access to a data structure, it does not necessarily succeed in reducing the synchronization overhead associated with that access.
In fact, a multiple-reader-single-writer scheme can end up being abandoned because it increases overall synchronization cost through added overheads, such as the additional lock acquires and releases it requires. See N. Carriero, "Implementation of Tuple Space Machines", Research Report 567, Yale University Department of Computer Science, December 1987 (also a 1987 Yale University Ph.D. Thesis).
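The added lock traffic mentioned above can be made concrete with a minimal Python sketch of one conventional lock-based design. This is an assumption chosen for illustration, not the scheme of the cited report and not the invention. It implements the restricted variant, in which either multiple readers or one writer alone may hold the structure, and its readers are impure in the sense used above because they update a shared reader count. The names `CountingLock`, `ReadersWriterLock`, `read_value`, and `write_value` are hypothetical.

```python
import threading


class CountingLock:
    """A mutex that counts its acquires (hypothetical helper for the sketch)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.acquires = 0

    def acquire(self):
        self._lock.acquire()
        self.acquires += 1  # safe: updated only while the lock is held

    def release(self):
        self._lock.release()


class ReadersWriterLock:
    """Either many readers or one writer may hold the resource at a time."""

    def __init__(self):
        self.mutex = CountingLock()     # guards reader_count
        self.resource = CountingLock()  # held by the writer or the reader group
        self.reader_count = 0

    def begin_read(self):
        self.mutex.acquire()
        self.reader_count += 1
        if self.reader_count == 1:      # first reader locks out writers
            self.resource.acquire()
        self.mutex.release()

    def end_read(self):
        self.mutex.acquire()
        self.reader_count -= 1
        if self.reader_count == 0:      # last reader readmits writers
            self.resource.release()
        self.mutex.release()

    def begin_write(self):
        self.resource.acquire()

    def end_write(self):
        self.resource.release()


def read_value(rw, data):
    """Read data["value"] under the reader protocol."""
    rw.begin_read()
    try:
        return data["value"]
    finally:
        rw.end_read()


def write_value(rw, data, value):
    """Write data["value"] under the writer protocol."""
    rw.begin_write()
    try:
        data["value"] = value
    finally:
        rw.end_write()
```

In this sketch every read acquires `mutex` twice, once on entry and once on exit, and an isolated read additionally acquires `resource`. Three sequential reads therefore perform nine lock acquires, where a single plain lock around each read would perform three. The design admits concurrent readers, yet each access carries more lock manipulation than the serialized alternative.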