1. Field
The present disclosure relates to computer systems and methods in which data resources are shared among data consumers while preserving data integrity and consistency relative to each consumer. More particularly, the disclosure concerns implementations of mutual exclusion mechanisms such as reader-writer locking.
2. Description of the Prior Art
By way of background, reader-writer synchronization is a mutual exclusion technique that is suitable for use in shared memory multiprocessor computing environments to protect a set of shared data. One type of reader-writer synchronization, known as reader-writer locking, allows read operations (readers) to share lock access in order to facilitate parallel data reads, but requires write operations (writers) to obtain exclusive lock access for writing the data. The technique is well suited to shared memory multiprocessor computing environments in which the number of readers accessing a shared data set is large in comparison to the number of writers, and wherein the overhead cost of requiring serialized lock acquisition for readers would be high. For example, a network routing table that is updated at most once every few minutes but searched many thousands of times per second is a case where serialized read-side locking would be quite burdensome.
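The sharing rule described above (many concurrent readers, or exactly one writer) can be illustrated with a minimal sketch. This is not the implementation of any particular system. The class `ReaderWriterLock`, the `routing_table` dictionary, and the `lookup` and `update` helpers are hypothetical names chosen for illustration, and Python is used only for brevity.

```python
import threading


class ReaderWriterLock:
    """Illustrative reader-writer lock: many readers or one writer.

    Assumption: reader preference, so a steady stream of readers can
    delay a waiting writer. This is a sketch, not a production lock.
    """

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0      # readers currently holding the lock
        self._writer = False   # True while a writer holds the lock

    def acquire_read(self):
        with self._cond:
            # Readers wait only while a writer holds the lock.
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()  # wake any waiting writer

    def acquire_write(self):
        with self._cond:
            # A writer needs exclusive access: no writer and no readers.
            while self._writer or self._readers > 0:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()  # wake waiting readers and writers


# Hypothetical read-mostly data set: a tiny routing table.
routing_table = {"10.0.0.0/8": "eth0"}
table_lock = ReaderWriterLock()


def lookup(prefix):
    """Frequent operation: shared (read) access."""
    table_lock.acquire_read()
    try:
        return routing_table.get(prefix)
    finally:
        table_lock.release_read()


def update(prefix, interface):
    """Rare operation: exclusive (write) access."""
    table_lock.acquire_write()
    try:
        routing_table[prefix] = interface
    finally:
        table_lock.release_write()
```

Note that even in this sketch every reader briefly takes the internal mutex of the condition variable, so read-side acquisition is not free even when no writer is present.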
Reader-writer locks are conventionally implemented using a single global lock that is shared among processors. This approach requires readers and writers to contend for one global lock on an equal footing, and it produces memory contention delays due to cache line bouncing of the lock between the processors' caches. Insofar as reader-writer locks are premised on the existence of a read-intensive processing environment, readers may be unduly penalized, especially if their critical sections are short and their lock acquisition frequency is high.
A distributed reader-writer lock approach is presented in Hsieh and Weihl, “Scalable Reader/Writer Locks for Parallel Systems”, 1991. It requires readers to acquire only a local per-processor reader/writer lock that will usually reside in the memory cache of the processor that hosts the acquiring reader. However, writers must acquire all of the local reader/writer locks. This degrades writer performance due to memory contention and, in some cases, because new readers are allowed to starve a writer while the writer is waiting for one of the local reader/writer locks.
A further disadvantage associated with both non-distributed and distributed reader-writer locking is that lock acquisition imposes a burden on readers, even in the absence of a writer. Reader-writer locks are typically implemented using semaphores, mutex locks, or spinlocks. Acquiring any of these lock types often imposes the cost of atomic instructions and/or memory barriers. In a read-mostly computing environment, the overhead associated with these operations falls mostly on readers.
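The asymmetry of the distributed approach described above (a reader touches one local lock, a writer must take every local lock) can be sketched as follows. This is a simplified illustration under stated assumptions and is not the algorithm as published by Hsieh and Weihl. The class `DistributedRWLock` and its `slot` parameter are hypothetical, each slot stands in for one processor, and a plain mutex per slot is used on the assumption that at most one reader uses a given slot at a time.

```python
import threading


class DistributedRWLock:
    """Illustrative distributed reader-writer lock.

    Assumptions: one slot per processor (simulated here by an index),
    and one plain mutex per slot instead of a full local
    reader/writer lock.
    """

    def __init__(self, num_slots):
        self._locks = [threading.Lock() for _ in range(num_slots)]

    def acquire_read(self, slot):
        # A reader touches only its own slot's lock, which would
        # normally stay resident in that processor's cache.
        self._locks[slot].acquire()

    def release_read(self, slot):
        self._locks[slot].release()

    def acquire_write(self):
        # A writer must take every local lock. Acquiring them in a
        # fixed order prevents two writers from deadlocking.
        for lock in self._locks:
            lock.acquire()

    def release_write(self):
        for lock in reversed(self._locks):
            lock.release()


# Usage: readers on slots 0 and 3 do not interfere with each other.
lock = DistributedRWLock(num_slots=4)
lock.acquire_read(0)
lock.acquire_read(3)
lock.release_read(3)
lock.release_read(0)

# A writer pays a cost proportional to the number of slots.
lock.acquire_write()
lock.release_write()
```

The write path makes the scaling problem visible: its cost grows with the number of slots, and the writer blocks on any slot that a reader currently holds.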
Improved read-side performance is provided by the locking technique disclosed in commonly-owned U.S. Pat. No. 7,934,062, which requires no read-side lock acquisition except when a writer announces its intention to acquire the reader-writer lock. However, the write-side performance of this method can be degraded in systems with many processors. This is because writers must wait for a grace period to elapse before acquiring the reader-writer lock. During that grace period, every processor must pass through a quiescent state, which guarantees that each reader will have an opportunity to note the writer's locking attempt and thereby synchronize on the reader-writer lock.
The present disclosure introduces techniques for reducing writer latency in large multiprocessor systems that employ data synchronization mechanisms, such as the grace period-based reader-writer locking approach disclosed in U.S. Pat. No. 7,934,062 or the distributed locking scheme proposed by Hsieh and Weihl. A technique for reducing writer latency in a multithreaded user-mode embodiment of the Hsieh and Weihl distributed locking method is also disclosed. The techniques disclosed herein are also useful for other synchronization operations, such as expedited grace period detection in multiprocessor systems implementing read-copy update (RCU) synchronization.