The present application relates generally to an improved data processing apparatus and method and more specifically to mechanisms for a private memory table for reduced memory coherence traffic.
Cache coherence, also referred to as memory coherence, is an issue that affects the design of computer systems in which two or more processors or cores share a common area of memory. In a single processor system, there is only one processing element doing all the work and, therefore, only one processing element that can read from or write to a given memory location. As a result, when a value is changed, all subsequent read operations of the corresponding memory location will see the updated value, even if it is cached.
Conversely, in multiprocessor (or multicore) systems, there are two or more processing elements working at the same time, and so it is possible that they simultaneously access the same memory location. Provided none of the processors changes the data in this location, the processors can share the data indefinitely and cache the data as they please. But as soon as one processor updates the location, the other processors might work on an out-of-date copy that may reside in their local caches. Consequently, some scheme is required to notify all the processing elements of changes to shared values; such a scheme is known as a “cache coherence protocol,” and if such a protocol is employed the system is said to have “cache coherence.”
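By way of illustration only, the stale-copy problem described above may be sketched as follows. The sketch assumes a simplified write-through cache with no coherence protocol; the `Core` class, its methods, and the address `0x10` are hypothetical names introduced for this example and do not appear elsewhere in this description.

```python
class Core:
    """A processing element with a private cache and no coherence protocol."""

    def __init__(self, memory):
        self.memory = memory  # shared main memory: address -> value
        self.cache = {}       # private cache: address -> cached value

    def read(self, addr):
        # On a miss, fill the cache from memory; afterwards serve from the cache.
        if addr not in self.cache:
            self.cache[addr] = self.memory[addr]
        return self.cache[addr]

    def write(self, addr, value):
        # Write-through: update the local cache and memory, but notify no one.
        self.cache[addr] = value
        self.memory[addr] = value


memory = {0x10: 1}
core_a, core_b = Core(memory), Core(memory)

core_a.read(0x10)           # both cores cache the value 1
core_b.read(0x10)
core_a.write(0x10, 2)       # core A updates the shared location

stale = core_b.read(0x10)   # core B still sees its out-of-date cached copy
fresh = memory[0x10]        # main memory holds the new value
```

In this sketch, core B continues to read the old value because nothing informs it that its cached copy is out of date, which is the gap a cache coherence protocol is intended to close.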
The exact nature and meaning of cache coherence is determined by the consistency model that the coherence protocol implements. In order to write correct concurrent programs, programmers must be aware of the exact consistency models that are employed by their systems. When implemented in hardware, the coherence protocol can be directory-based or employ snooping. Examples of specific protocols are the MSI protocol and its derivatives MESI, MOSI and MOESI.
Protocols incorporated in hardware have been developed to maintain cache coherence. Many multiprocessor systems maintain cache coherence with a snoopy protocol. This protocol relies on every processor or memory controller monitoring (or “snooping”) all requests to memory. Each cache or memory independently determines if accesses made by another processor require an update. Snoopy protocols are usually built around a central bus (a snoopy bus). Snoopy bus protocols are very common, and many small-scale systems utilizing snoopy protocols are commercially available.
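As a non-limiting sketch, a snoopy protocol of the kind described above may be modeled with the three MSI states (Modified, Shared, Invalid). The sketch assumes a single shared bus, whole-line writes, and write-back of modified data on a snooped request; the names `Bus`, `SnoopyCache`, and `State`, and the request kinds `"read"` and `"write"`, are hypothetical and chosen only for this example.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    SHARED = "S"
    INVALID = "I"


class Bus:
    """A central bus on which every attached cache snoops every request."""

    def __init__(self, memory):
        self.memory = memory  # main memory: address -> value
        self.caches = []

    def attach(self, cache):
        self.caches.append(cache)

    def broadcast(self, sender, kind, addr):
        # Every cache other than the requester observes the request.
        for cache in self.caches:
            if cache is not sender:
                cache.snoop(kind, addr)


class SnoopyCache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}  # address -> (State, value); absent means INVALID
        bus.attach(self)

    def state(self, addr):
        return self.lines.get(addr, (State.INVALID, None))[0]

    def read(self, addr):
        if self.state(addr) is State.INVALID:
            # Read miss: a modified copy elsewhere is written back first.
            self.bus.broadcast(self, "read", addr)
            self.lines[addr] = (State.SHARED, self.bus.memory[addr])
        return self.lines[addr][1]

    def write(self, addr, value):
        if self.state(addr) is not State.MODIFIED:
            # Gain exclusive ownership: all other copies are invalidated.
            self.bus.broadcast(self, "write", addr)
        self.lines[addr] = (State.MODIFIED, value)

    def snoop(self, kind, addr):
        state, value = self.lines.get(addr, (State.INVALID, None))
        if state is State.MODIFIED:
            self.bus.memory[addr] = value  # write back dirty data
        if kind == "write":
            self.lines.pop(addr, None)     # M or S -> I
        elif state is State.MODIFIED:
            self.lines[addr] = (State.SHARED, value)  # M -> S on a remote read


bus = Bus({0x10: 1})
c0, c1 = SnoopyCache(bus), SnoopyCache(bus)
c0.read(0x10)
c1.read(0x10)             # both caches hold the line in the Shared state
c0.write(0x10, 2)         # c1 snoops the write and invalidates its copy
after_write = c1.state(0x10)
seen_by_c1 = c1.read(0x10)  # c0 snoops the read, writes back, and downgrades
```

Each cache decides independently, from the requests it observes on the bus, whether its own copy must be written back, downgraded, or invalidated.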
Alternatively, to maintain cache coherence across the system, a directory-based protocol uses a directory that contains memory-coherence control information. The directory, usually part of the memory subsystem, has an entry for each main memory location with state information indicating whether the memory data may also exist elsewhere in the system. The directory-based coherence protocol specifies all transitions and transactions to be taken in response to a memory request. Any action taken on a memory region, such as a cache line or page, is reflected in the state stored in the directory.
In addition, the system's memory is much larger than the total data present in the caches. The directory-based coherence protocol tracks only memory regions (cache lines) that are present in one or more caches, and does not have any information on data that is present only in the memory. The absence of an entry in the directory therefore implies that the data is not present in any cache.
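For illustration, a directory that tracks only cached regions may be sketched as follows. The sketch assumes that requests reach the directory only on cache misses and evictions, and that each entry records a modified flag and a set of sharers; the `Directory` class, its method names, and the message tuples are hypothetical and are not taken from any particular implementation.

```python
class Directory:
    """Tracks only cache lines currently held by at least one cache."""

    def __init__(self):
        # line address -> {"modified": bool, "sharers": set of cache ids}
        self.entries = {}

    def read_miss(self, line, requester):
        entry = self.entries.get(line)
        if entry is None:
            # No entry: the data is only in memory, so no messages are needed.
            self.entries[line] = {"modified": False, "sharers": {requester}}
            return []
        messages = []
        if entry["modified"]:
            # The current owner must write its modified data back first.
            owners = sorted(entry["sharers"] - {requester})
            messages = [("writeback", c) for c in owners]
            entry["modified"] = False
        entry["sharers"].add(requester)
        return messages

    def write_miss(self, line, requester):
        entry = self.entries.get(line)
        others = sorted(entry["sharers"] - {requester}) if entry else []
        # The requester becomes the sole holder of the line.
        self.entries[line] = {"modified": True, "sharers": {requester}}
        return [("invalidate", c) for c in others]

    def evict(self, line, cache_id):
        entry = self.entries.get(line)
        if entry is None:
            return
        entry["sharers"].discard(cache_id)
        if not entry["sharers"]:
            # No cache holds the line any longer, so the entry is dropped.
            del self.entries[line]


directory = Directory()
first = directory.read_miss(0x40, requester=0)    # line not tracked yet
second = directory.read_miss(0x40, requester=1)   # now shared by caches 0 and 1
upgrade = directory.write_miss(0x40, requester=1) # cache 0 must invalidate
reread = directory.read_miss(0x40, requester=0)   # cache 1 must write back
directory.evict(0x40, 0)
directory.evict(0x40, 1)
tracked = 0x40 in directory.entries
```

Every transition is reflected in the stored entry, and once no cache holds the line the entry disappears, so the absence of an entry indicates that the line resides only in memory.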