In general, false sharing (FS) is a harmful by-product of multithreaded applications running on multiprocessor (including multi-core) architectures. FS is a performance-degrading usage pattern that arises in systems with distributed, coherent caches, and it occurs at the granularity of the smallest resource block managed by the caching mechanism.
Cache coherence (also referred to as cache coherency) refers to the consistency of data stored in local caches of a shared resource. Cache coherence is a special case of memory coherence. When clients in a system maintain caches of a common memory resource, problems may arise with inconsistent data. This is particularly true of CPUs in a multiprocessing system. Referring to FIG. 1, for example, if client 1 has a copy of a memory block in memory resource 100 from a previous read and client 2 changes that memory block, client 1 could be left with a stale cached copy of that block without any notification of the change. Cache coherence is intended to manage such conflicts and maintain consistency between cache and memory. Coherence is typically maintained at the cache block or cache line level.
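By way of a non-limiting illustration, the stale-copy problem described above may be sketched as follows. The sketch assumes a simplified write-through, write-invalidate scheme. The names `Memory`, `ClientCache`, and `stale_read_demo` are hypothetical and are introduced here for illustration only; they do not correspond to any element of FIG. 1 beyond the general roles of the memory resource and its clients.

```python
class Memory:
    """Shared memory resource plus a registry of the caches attached to it."""

    def __init__(self):
        self.blocks = {}   # block id -> value held in memory
        self.caches = []   # every client cache attached to this memory


class ClientCache:
    """Local cache of one client (hypothetical, simplified model)."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}    # block id -> locally cached value
        memory.caches.append(self)

    def read(self, block):
        # On a miss, fetch the block from the shared memory resource.
        if block not in self.lines:
            self.lines[block] = self.memory.blocks.get(block, 0)
        return self.lines[block]

    def write(self, block, value, coherent=True):
        # Write-through for simplicity: update the local copy and memory.
        self.lines[block] = value
        self.memory.blocks[block] = value
        if coherent:
            # Write-invalidate: discard copies held by all other clients.
            for other in self.memory.caches:
                if other is not self:
                    other.lines.pop(block, None)


def stale_read_demo(coherent):
    """Client 1 reads a block, client 2 changes it, client 1 reads again."""
    memory = Memory()
    memory.blocks[7] = 1
    client1, client2 = ClientCache(memory), ClientCache(memory)
    client1.read(7)                          # client 1 caches the old value
    client2.write(7, 2, coherent=coherent)   # client 2 changes the block
    return client1.read(7)                   # value client 1 observes
```

Under these assumptions, `stale_read_demo(False)` returns the old value 1 because client 1 is never notified of the change, whereas `stale_read_demo(True)` returns 2 because the invalidation forces client 1 to fetch the block again.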
When a first client or other system participant periodically accesses data that is never altered by a second client or party, but that data shares a cache block with data that is altered, the caching protocol may force the first client to reload the whole block even though no logical necessity exists; hence the term False Sharing. The caching system is unaware of the precise activity within the block and forces the first client to bear the caching system overhead that truly shared access to a resource would require.
As such, FS is typically a concern in multiprocessor CPU caches, where memory is cached in lines sized at some small power of two (e.g., 64-byte lines aligned on 64-byte boundaries). If two processors operate on independent data located in the same memory address region storable in a single line, the cache coherency mechanisms in the system may force the whole line across the bus or interconnect with every data write, causing memory stalls in addition to wasting system bandwidth.
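By way of example, whether two independent data items fall within a single line may be determined from their byte addresses alone. The following is a minimal sketch assuming 64-byte lines aligned on 64-byte boundaries; the helper names `line_index` and `share_line` and the sample addresses are hypothetical.

```python
LINE_SIZE = 64  # bytes; assumed line size, aligned on 64-byte boundaries


def line_index(address, line_size=LINE_SIZE):
    """Return the index of the cache line containing the given byte address."""
    return address // line_size


def share_line(addr_a, addr_b, line_size=LINE_SIZE):
    """Return True when both byte addresses lie within the same cache line."""
    return line_index(addr_a, line_size) == line_index(addr_b, line_size)
```

With these assumptions, addresses 0x1000 and 0x1008 map to the same line and are therefore candidates for false sharing, while addresses 0x1000 and 0x1040 map to adjacent but distinct lines.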
FS is an inherent artifact of automatically synchronized cache protocols and can also exist in environments such as distributed file systems or databases. FS is most prevalent, however, in multiprocessor memory hierarchy subsystems, where memory data is replicated and resides in the caches of several CPUs. Memory data is placed in a cache at the granularity of a cache line.
By way of example, false sharing happens when CPU X writes to object A in a cache line that also contains object B. This action invalidates the cache lines in other CPUs (e.g., CPU Y) that contain copies of the corresponding memory object B. Thus, when CPU Y accesses object B, a cache miss occurs, with the penalty of having to retrieve the data from a lower and slower memory level or from another cache with coherent data. It is noteworthy that CPU Y does not write to object B. The sharing is an artifact of the memory implementation. The above anomaly is referred to as False Sharing because the cache line is shared but the objects within the cache line are not.
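The scenario above may be illustrated with a small simulation that counts cache misses. This is a sketch under stated assumptions rather than a model of any particular processor: each line is tracked only as valid or invalid per CPU, lines are assumed to be 64 bytes, and a write by one CPU invalidates the line in every other CPU. Actual coherence protocols track additional states. The names `Cpu`, `access`, and `run` are hypothetical.

```python
LINE_SIZE = 64  # bytes; assumed cache line size


class Cpu:
    """Tracks which cache lines are currently valid in one CPU's cache."""

    def __init__(self, name):
        self.name = name
        self.valid_lines = set()
        self.misses = 0


def access(cpu, others, address, is_write):
    """Perform one access; a write invalidates the line in all other CPUs."""
    line = address // LINE_SIZE
    if line not in cpu.valid_lines:
        cpu.misses += 1              # line must be fetched from elsewhere
        cpu.valid_lines.add(line)
    if is_write:
        for other in others:
            other.valid_lines.discard(line)


def run(addr_a, addr_b, rounds=1000):
    """CPU X repeatedly writes object A while CPU Y repeatedly reads object B."""
    x, y = Cpu("X"), Cpu("Y")
    for _ in range(rounds):
        access(x, [y], addr_a, is_write=True)    # CPU X writes object A
        access(y, [x], addr_b, is_write=False)   # CPU Y only reads object B
    return x.misses, y.misses
```

In this simplified model, `run(0, 8)` places objects A and B in the same line and returns `(1, 1000)`: CPU Y misses on every read even though object B is never modified. By contrast, `run(0, 64)` places the objects in different lines and returns `(1, 1)`, leaving only the initial miss for each CPU.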