Multi-core processor systems integrate increasing numbers of processor cores onto a single die. Generally, off-chip bandwidth may not increase as quickly as the number of processor cores on the die. Off-chip bandwidth may be limited by the number of pins interfacing between a chip and its socket or printed circuit board. As a result, the limited off-chip bandwidth may be shared among an increasing number of processor cores.
A multi-core processor system may involve multiple distributed caches. For example, each processor core may have a local cache. In an example, two caches may contain local copies of the same memory location in the main system memory. If the data stored in one of the caches is modified, the copies stored in the other cache and in the main system memory may become stale. To restore consistency, the copy in the other cache may be invalidated or updated to match. Eventually, the modification may also be reflected in the main system memory. Maintaining integrity and consistency between the local caches and the main system memory may be a significant consumer of off-chip bandwidth. Naive approaches that frequently synchronize all instances of a cache entry with the associated main system memory may make non-optimal use of limited off-chip bandwidth resources.
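As a non-limiting illustration, the two-cache scenario above may be sketched as a toy model that counts off-chip transfers. The sketch assumes a write-invalidate scheme, single-value cache lines, and invalidation messages that stay on-chip at no cost. The names `Memory`, `Cache`, and `run`, and the address `0x40`, are hypothetical and chosen only for this example. With `write_through=True` every write is synchronized to main memory immediately (the naive approach); with `write_through=False` the modification is deferred until another cache needs the data.

```python
class Memory:
    """Main system memory; every access counts as one off-chip transfer."""

    def __init__(self):
        self.data = {}
        self.transfers = 0

    def read(self, addr):
        self.transfers += 1
        return self.data.get(addr, 0)

    def write(self, addr, value):
        self.transfers += 1
        self.data[addr] = value


class Cache:
    """Local cache of one core (hypothetical toy model, one value per line)."""

    def __init__(self, memory, write_through):
        self.memory = memory
        self.write_through = write_through
        self.lines = {}     # addr -> cached value
        self.dirty = set()  # addrs modified locally but not yet in memory
        self.peers = []     # other caches that may hold the same addr

    def read(self, addr):
        if addr not in self.lines:
            # A peer may hold a newer copy; have it write back first.
            for peer in self.peers:
                peer.flush(addr)
            self.lines[addr] = self.memory.read(addr)
        return self.lines[addr]

    def write(self, addr, value):
        # Copies in other caches become stale, so invalidate them.
        for peer in self.peers:
            peer.invalidate(addr)
        self.lines[addr] = value
        if self.write_through:
            self.memory.write(addr, value)  # synchronize immediately
        else:
            self.dirty.add(addr)            # defer the memory update

    def invalidate(self, addr):
        # The writer overwrites the whole line, so the old copy is dropped.
        self.lines.pop(addr, None)
        self.dirty.discard(addr)

    def flush(self, addr):
        if addr in self.dirty:
            self.memory.write(addr, self.lines[addr])
            self.dirty.discard(addr)


def run(write_through, writes=100):
    """Core A writes one location repeatedly, then core B reads it."""
    memory = Memory()
    a = Cache(memory, write_through)
    b = Cache(memory, write_through)
    a.peers, b.peers = [b], [a]
    a.read(0x40)
    b.read(0x40)              # both caches now hold a local copy
    for i in range(writes):
        a.write(0x40, i)      # B's copy is invalidated on the first write
    return b.read(0x40), memory.transfers


if __name__ == "__main__":
    print("write-through:", run(True))
    print("write-back:   ", run(False))
```

Under these assumptions both runs return the same final value to the second core, but the immediate-synchronization run performs one off-chip transfer per write, while the deferred run performs a single write-back when the other cache misses. A real coherence protocol would also track line states and sizes, which this sketch omits.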