Many computer programs require synchronized access to shared data structures in order to keep those structures consistent. In many cases, such shared structures are modified relatively seldom but read quite often. To ensure data consistency, such structures can be protected using read/write locks, which are exclusive for operations that modify the underlying data and shared for read operations. However, read/write locks are not particularly cheap synchronization primitives; even read access can cause L2-cache misses in the CPU, which in turn can seriously limit the performance of multi-core computing systems.
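For reference, the conventional read/write-lock pattern described above can be sketched as follows. This is a minimal illustration only: the class name Registry, its get and set methods, and the use of -1 as a "not found" value are hypothetical and do not come from the arrangement described here. The sketch assumes C++17 and its standard std::shared_mutex.

```cpp
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical shared structure: read often, modified seldom.
class Registry {
    mutable std::shared_mutex mutex_;      // one read/write lock for the whole map
    std::map<std::string, int> data_;

public:
    // Readers take the lock in shared mode and may run concurrently.
    int get(const std::string& key) const {
        std::shared_lock<std::shared_mutex> guard(mutex_);
        auto it = data_.find(key);
        return it == data_.end() ? -1 : it->second;  // -1 is an assumed "not found" value
    }

    // Writers take the lock in exclusive mode; all readers wait meanwhile.
    void set(const std::string& key, int value) {
        std::unique_lock<std::shared_mutex> guard(mutex_);
        data_[key] = value;
    }
};
```

Even in the read path, every caller updates the state of the same lock object, so the cache line holding that state is passed between cores on each acquisition. This is one plausible source of the cache misses mentioned above.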
Such problems can be alleviated, but at the cost of (potentially much) higher memory usage for a single read/write lock. In particular, one memory cache line can be reserved for each CPU core, so that each core counts the shared locks it holds in its own cache line as long as no exclusive lock request is present.
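The per-core arrangement can be sketched roughly as below. This is an illustrative sketch under stated assumptions, not a definitive implementation: the name StripedRWLock, the 64-byte cache-line size, the fixed count of 16 slots, and the choice of a slot by hashing the thread ID (rather than by querying the actual CPU core) are all assumptions made for the example.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>

constexpr std::size_t kCacheLine = 64;  // assumed cache-line size in bytes
constexpr std::size_t kSlots = 16;      // assumed number of per-core slots

class StripedRWLock {
    // Each reader counter occupies its own cache line.
    struct alignas(kCacheLine) Slot {
        std::atomic<int> readers{0};
    };
    std::array<Slot, kSlots> slots_;
    alignas(kCacheLine) std::atomic<bool> writer_{false};

    // Assumption: the thread ID hash stands in for the current core index.
    static std::size_t my_slot() {
        return std::hash<std::thread::id>{}(std::this_thread::get_id()) % kSlots;
    }

public:
    // Shared lock: count the reader in its own slot, then check for a writer.
    std::size_t lock_shared() {
        const std::size_t slot = my_slot();
        for (;;) {
            slots_[slot].readers.fetch_add(1);
            if (!writer_.load()) return slot;   // no exclusive request present
            slots_[slot].readers.fetch_sub(1);  // back off and wait for the writer
            while (writer_.load()) std::this_thread::yield();
        }
    }

    void unlock_shared(std::size_t slot) {
        assert(slot < kSlots);
        slots_[slot].readers.fetch_sub(1);
    }

    // Exclusive lock: announce the writer, then wait for every slot to drain.
    void lock() {
        bool expected = false;
        while (!writer_.compare_exchange_weak(expected, true)) {
            expected = false;
            std::this_thread::yield();
        }
        for (auto& slot : slots_) {
            while (slot.readers.load() != 0) std::this_thread::yield();
        }
    }

    void unlock() { writer_.store(false); }
};
```

With these assumed constants, one lock occupies at least 16 × 64 = 1024 bytes, which illustrates the memory cost noted above. Readers touch only their own counter line plus the writer flag, which stays shared and read-only while no exclusive request is pending; a writer, by contrast, must visit every slot.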
However, with such an arrangement, at least two problems still remain. First, exclusive access blocks reading of the shared structure until the modifying operation has completed. This restriction can lead to performance bottlenecks, especially as modern many-core architectures now regularly exceed 100 CPU cores. In the context of in-memory databases, the problem is even more prominent, because there is no I/O time that would otherwise dominate query execution time. Second, even with optimized read/write locks using one cache line per CPU core, a heavy modification load will cause a high ratio of L2-cache misses while waiting for the exclusive lock. Ideally, shared readers should never be blocked by the modification of internal structures.