1. Field of the Invention
The present invention relates to the design of computer systems. More specifically, the present invention relates to a technique for maintaining cache coherence using load-mark metadata.
2. Related Art
In multiprocessor systems, processors often share memory structures that are used to store data. For example, each processor in a multiprocessor system may include a separate local L1 cache, while the processors collectively share an L2 cache and a main memory.
In such systems, a data hazard can arise when multiple processors perform interfering accesses to a shared cache line. For example, an interfering access occurs if one thread loads a value from a cache line while another thread (on the same processor or on another processor) stores a value to the cache line.
To prevent such interfering accesses, processor designers have developed systems that can place “load-marks” on cache lines that are accessed by threads. When a cache line is load-marked by a thread, the system prevents other threads from storing values to the cache line, thereby permitting the load-marking thread to read from the cache line without interfering accesses from other threads. In addition, because threads that perform loads do not interfere with each other, some systems allow multiple threads to load-mark (and hence load from) a cache line simultaneously.
Some systems that support load-marking include a load-mark count value (a “reader count”) in the metadata for each cache line. In these systems, the system increments a cache line's reader count when load-marking the cache line and decrements the reader count when removing a load-mark from the cache line. When another thread requests exclusive access to the cache line (e.g., so that the thread can store to the cache line), the system sums the reader counts from all copies of the cache line in the various caches in the system. If the sum of these reader counts is greater than zero, the cache line is load-marked and the requesting thread is denied exclusive access. The use of a reader count to load-mark cache lines is described in more detail in U.S. patent application Ser. No. 11/635,270, entitled “Efficient Marking of Shared Cache Lines,” by inventors Robert E. Cypher and Shailender Chaudhry.
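The reader-count check described above can be illustrated with a minimal sketch. The names `CacheLineCopy`, `place_load_mark`, `remove_load_mark`, and `may_grant_exclusive` are hypothetical and chosen only for illustration; they are not taken from the referenced application, and an actual system would implement this logic in cache-coherence hardware rather than software.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class CacheLineCopy:
    """One cache's copy of a cache line (hypothetical model)."""
    reader_count: int = 0  # load-mark metadata for this copy

    def place_load_mark(self) -> None:
        # Load-marking increments this copy's reader count.
        self.reader_count += 1

    def remove_load_mark(self) -> None:
        # Removing a load-mark decrements this copy's reader count.
        if self.reader_count == 0:
            raise ValueError("no load-mark to remove")
        self.reader_count -= 1


def may_grant_exclusive(copies: Iterable[CacheLineCopy]) -> bool:
    """Sum the reader counts over all copies; grant only if the sum is zero."""
    return sum(copy.reader_count for copy in copies) == 0
```

In this sketch, a request for exclusive access is denied as long as any copy of the cache line in any cache carries a nonzero reader count.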
Other systems include a timestamp in the load-marked cache line's metadata along with the load-mark. In these systems, the system writes the current value of the timestamp into the metadata for a cache line when load-marking the cache line. When another thread requests exclusive access to a cache line, the system determines whether there is a load-mark on the cache line and whether the timestamp in the cache line's metadata is current. If so, the requesting thread is denied exclusive access. When a predefined condition occurs, the system updates the timestamp by incrementing it. When the timestamp is updated, all load-marks using the previous timestamp are considered “stale” and can be removed from the affected cache lines. Using timestamps for load-marking cache lines is described in more detail in U.S. patent application Ser. No. 11/773,158, entitled “Cache Line Marking with Shared Timestamps,” by inventors Robert E. Cypher and Shailender Chaudhry.
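The timestamp-based check can be sketched in a similar way. The names `MarkedLine`, `TimestampMarker`, `denies_exclusive`, and `advance_timestamp` are hypothetical, and the sketch assumes that a timestamp is "current" when it equals the system's present timestamp value; the predefined condition that triggers the update is left to the caller because the text above does not specify it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MarkedLine:
    """Load-mark metadata for one cache line (hypothetical model)."""
    load_marked: bool = False
    mark_timestamp: Optional[int] = None


class TimestampMarker:
    def __init__(self) -> None:
        self.current_timestamp = 0

    def place_load_mark(self, line: MarkedLine) -> None:
        # Record the current timestamp in the line's metadata with the mark.
        line.load_marked = True
        line.mark_timestamp = self.current_timestamp

    def denies_exclusive(self, line: MarkedLine) -> bool:
        # Deny only when the line is load-marked AND its timestamp is current.
        return line.load_marked and line.mark_timestamp == self.current_timestamp

    def advance_timestamp(self) -> None:
        # Called when the (unspecified) predefined condition occurs; every
        # mark written with the previous timestamp becomes stale at once.
        self.current_timestamp += 1
```

One consequence visible in the sketch is that advancing the timestamp makes all earlier load-marks stale in a single step, without visiting the marked cache lines individually.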
In some systems, when a thread requests exclusive access to a cache line, the system invalidates all copies of the cache line in other caches and forwards a copy of the cache line (including any load-marks) to the requesting thread's processor. If the cache line is load-marked, the requesting thread's processor recognizes that the cache line is load-marked and prevents the requesting thread from storing to the cache line. Each load-marking thread must then request a copy of the cache line with read permissions (and place its load-mark on the cache line again, if desired) before it can again read from the cache line. Thus, the invalidation of load-marked copies of the cache line can hamper the performance of threads that have placed or will place load-marks on the cache line.
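The cost of this invalidate-and-forward behavior can be seen in a toy model. The sketch below is illustrative only: the names `Line` and `request_exclusive` are hypothetical, each cache is modeled as a dictionary entry that is `None` when invalid, and the forwarded load-marks are assumed to be represented as a summed reader count, which the text above does not specify.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Line:
    data: int
    reader_count: int = 0  # load-mark metadata, assumed to travel with the line


def request_exclusive(caches: Dict[str, Optional[Line]], requester: str) -> bool:
    """Invalidate all other copies, forward the line (with its load-marks)
    to the requester, and report whether the requester may store to it."""
    copies = [line for line in caches.values() if line is not None]
    if not copies:
        raise LookupError("line is not cached anywhere in this toy model")
    forwarded = Line(data=copies[0].data,
                     reader_count=sum(c.reader_count for c in copies))
    for processor in caches:
        caches[processor] = None      # invalidate every existing copy
    caches[requester] = forwarded     # install the forwarded copy
    return forwarded.reader_count == 0  # a load-marked line blocks the store
```

For example, if processor P0 holds a load-marked copy and P2 requests exclusive access, the call returns `False` (P2 cannot store), yet P0's copy has already been invalidated, so the load-marking thread on P0 must fetch the cache line again before its next read.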
Hence, what is needed is a system that supports load-marked cache lines without the above-described problems.