1. Field of the Invention
Embodiments of the present invention relate to computer systems. More specifically, embodiments of the present invention relate to a technique for cache line marking with shared timestamps within a computer system.
2. Related Art
Transactional memory is a useful programming abstraction that helps programmers write parallel programs that function correctly and helps compilers automatically parallelize sequential threads. Unfortunately, existing transactional memory systems suffer from certain limitations.
For example, the UTM protocol proposed by Ananian et al. buffers all the old values in a data structure stored in memory (see C. S. Ananian, K. Asanović, B. Kuszmaul, C. Leiserson, and S. Lie, Unbounded Transactional Memory, Proceedings of the 11th International Symposium on High-Performance Computer Architecture (HPCA'05), 2005). Similarly, the LogTM protocol proposed by Moore et al. buffers new values in a private cache and, when this cache overflows, buffers old values of the overflowed cache lines in a data structure stored in memory (see K. Moore, J. Bobba, M. Moravan, M. Hill, and D. Wood, LogTM: Log-based Transactional Memory, 12th Annual International Symposium on High Performance Computer Architecture (HPCA-12), 2006).
The transactional memory protocol described in a pending U.S. patent application entitled “Facilitating Efficient Transactional Memory and Atomic Operations via Cache Line Marking,” by the same inventors as the instant application, having Ser. No. 11/655,569, and filing date Jan. 18, 2007, uses cache line marking to improve the performance of systems that support transactional memory (interchangeably called “transactional execution”).
In systems that support cache line marking, threads can place load-marks and store-marks on cache lines to prevent other threads from performing interfering accesses on the marked cache lines. For example, when a thread reads from a cache line, the thread can place a load-mark on the cache line. When a load-mark has been placed on the cache line, other threads are not allowed to write to the cache line (although other threads may be allowed to read from a cache line load-marked by another thread). In this way, the marking thread can guarantee the consistency of the transaction without unnecessarily limiting other threads' access to the cache line.
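The load-mark behavior described above can be illustrated with a minimal software sketch. This is a toy model for exposition only and not the hardware mechanism of the referenced application; the `CacheLine` class, its method names, and the thread identifiers are hypothetical.

```python
class CacheLine:
    """Toy model of one cache line that records which threads hold load-marks."""

    def __init__(self, value=0):
        self.value = value
        # Hypothetical bookkeeping: IDs of threads that have load-marked this line.
        self.load_marked_by = set()

    def load(self, thread_id):
        # The read and the placement of the load-mark happen in one operation.
        self.load_marked_by.add(thread_id)
        return self.value

    def can_store(self, thread_id):
        # A write is blocked while any *other* thread holds a load-mark.
        return not (self.load_marked_by - {thread_id})

    def store(self, thread_id, value):
        # Returns False (and leaves the line unchanged) if the write is blocked.
        if not self.can_store(thread_id):
            return False
        self.value = value
        return True

    def remove_load_mark(self, thread_id):
        self.load_marked_by.discard(thread_id)


line = CacheLine(value=7)
first_read = line.load("T1")    # T1 reads and load-marks the line
second_read = line.load("T2")   # another thread may still read
blocked = line.store("T2", 9)   # blocked: T1 still holds a load-mark
line.remove_load_mark("T1")     # T1 returns to the line to remove its mark
allowed = line.store("T2", 9)   # now permitted
```

In this sketch, a second reader is never blocked by an existing load-mark, whereas a writer must wait until every other thread's load-mark has been removed.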
Unfortunately, in systems that support cache line marking, the thread that placed a mark on the cache line is obligated to return to the cache line to remove the mark at the end of the transaction. Returning to the marked cache line can add overhead to transactional execution (consuming additional bandwidth and causing delays). For store-marks, the overhead is minimal because the cache line is already accessed twice (first to ensure that the cache line is writable and to place the store-mark, and then again to write the value to the cache line after the transaction has completed), and the second access occurs at an advantageous time to remove the store-mark. On the other hand, load-marked cache lines need only be accessed once (because the value can be read from the cache line in the same operation that places the load-mark on the cache line), but the thread must still return to each load-marked cache line to remove the load-mark after the transaction has completed. Furthermore, systems that require removal of load-marks also require keeping track of the lines that were load-marked, and as a result, the number of lines that a thread can load-mark can be limited by the resources devoted to keeping track of the load-marks.
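The two costs described above, namely the extra return access per load-marked line and the limit imposed by the tracking resources, can be made concrete with a short sketch. The `TransactionalThread` class, the `tracker_capacity` parameter, and the dictionary of marks are hypothetical names chosen for illustration, and the fixed-size tracking list is an assumption rather than a description of any particular implementation.

```python
class TransactionalThread:
    """Toy model of a thread that must remember every line it load-marks."""

    def __init__(self, thread_id, tracker_capacity):
        self.thread_id = thread_id
        self.tracker_capacity = tracker_capacity  # assumed fixed tracking resource
        self.tracked_lines = []                   # addresses of load-marked lines
        self.return_accesses = 0                  # extra accesses made only to unmark

    def load_mark(self, marks, address):
        # A line that is already tracked needs no new tracking entry.
        if address in self.tracked_lines:
            return True
        # The tracking resource bounds how many lines can be load-marked.
        if len(self.tracked_lines) >= self.tracker_capacity:
            return False
        marks.setdefault(address, set()).add(self.thread_id)
        self.tracked_lines.append(address)
        return True

    def end_transaction(self, marks):
        # The thread returns to each load-marked line solely to remove its mark.
        for address in self.tracked_lines:
            marks[address].discard(self.thread_id)
            self.return_accesses += 1
        self.tracked_lines.clear()


marks = {}  # address -> set of thread IDs holding a load-mark
thread = TransactionalThread("T1", tracker_capacity=2)
results = [thread.load_mark(marks, addr) for addr in (0x40, 0x80, 0xC0)]
thread.end_transaction(marks)
```

Here the third load-mark fails because the tracker is full, and ending the transaction costs one additional access for each line that was successfully load-marked.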
Hence, what is needed is a processor that supports transactional execution without the above-described limitations.