1. Field of the Invention
Embodiments of the present invention relate to computer systems. More specifically, embodiments of the present invention relate to techniques for implementing virtual transactional memory using cache line marking.
2. Related Art
Transactional memory is a useful programming abstraction that helps programmers write parallel programs that function correctly and helps compilers automatically parallelize sequential threads. Unfortunately, existing transactional memory systems limit the size of the transactions that they can support. This limitation arises because transactional memory systems use structures that are bounded in size to keep track of information that grows in proportion to the transaction size. For example, in a typical transactional memory system, the processor buffers transactional store operations in a store queue. However, if the transaction generates more stores than the store queue can hold, the store queue overflows and the processor must abort the transaction.
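By way of illustration only, the store-queue limitation described above can be modeled in a few lines of Python. This is a simplified sketch and not a description of any particular processor: the names `BoundedStoreQueue`, `TransactionAborted`, and `run_transaction`, as well as the use of a dictionary to stand in for memory, are assumptions introduced for this example.

```python
class TransactionAborted(Exception):
    """Raised when the store queue can no longer track the transaction."""


class BoundedStoreQueue:
    """Toy model (hypothetical) of a fixed-capacity transactional store queue."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = []  # buffered (address, value) pairs, oldest first

    def buffer_store(self, address, value):
        # A full queue cannot track another store, so the transaction aborts
        # and every speculative store buffered so far is discarded.
        if len(self.entries) >= self.capacity:
            self.entries.clear()
            raise TransactionAborted("store queue overflow")
        self.entries.append((address, value))

    def commit(self, memory):
        # Drain the buffered stores to memory in program order.
        for address, value in self.entries:
            memory[address] = value
        self.entries.clear()


def run_transaction(stores, capacity, memory):
    """Return True if every store commits, False if the transaction aborts."""
    queue = BoundedStoreQueue(capacity)
    try:
        for address, value in stores:
            queue.buffer_store(address, value)
    except TransactionAborted:
        return False
    queue.commit(memory)
    return True
```

In this sketch, a transaction with two stores commits against a four-entry queue, whereas a transaction with five stores aborts and leaves memory unchanged, which mirrors the bounded-structure problem described above.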
To alleviate this problem, processor designers have suggested different techniques to provide both hardware-based and hybrid hardware-software-based support for “unbounded” transactions. For example, the UTM transactional memory protocol proposed by Ananian et al. (see C. S. Ananian, K. Asanović, B. Kuszmaul, C. Leiserson, and S. Lie, Unbounded Transactional Memory, Proceedings of the 11th International Symposium on High-Performance Computer Architecture (HPCA'05), 2005), and the TCC protocol proposed by Hammond et al. (see L. Hammond, V. Wong, M. Chen, B. Carlstrom, J. Davis, B. Hertzberg, M. Prabhu, H. Wijaya, C. Kozyrakis, and K. Olukotun, Transactional Memory Coherence and Consistency, ISCA p. 102, 31st Annual International Symposium on Computer Architecture (ISCA'04), 2004), are both hardware-based techniques that support starvation-avoiding, unbounded transactions. Unfortunately, UTM requires complex hardware that buffers in memory all data overwritten by transactions and automatically searches through linked lists in memory to determine the value to return for loads. Moreover, TCC requires very high bandwidth, because all data stored during each transaction must be broadcast to all other processors. Furthermore, TCC requires that all other processors stop accessing memory whenever a large, starvation-avoiding transaction is being processed.
The Hybrid protocol proposed by Moir et al. (see M. Moir, P. Damron, A. Fedorova, Y. Lev, V. Luchangco, and D. Nussbaum, Hybrid Transactional Memory, Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems, (San Jose 2006)), and the LogTM protocol proposed by Moore et al. (see K. Moore, J. Bobba, M. Moravan, M. Hill & D. Wood, LogTM: Log-based Transactional Memory, 12th Annual International Symposium on High Performance Computer Architecture (HPCA-12), 2006), are hybrid hardware-software-based techniques that use hardware for certain transactions but fall back on software for other transactions. More specifically, the Hybrid protocol uses software to run transactions that cannot be completed in hardware (due, for example, to resource constraints), and thus implements a software transactional memory protocol which involves buffering store data in separate data structures until the transaction commits. In contrast, the LogTM protocol requires hardware support to copy old values of certain memory locations that are written within a transaction, and it requires software support to traverse data structures and restore old values of cache lines that were written by transactions that abort. The use of software to implement all or part of the transactional memory system can seriously degrade the performance of the transactional memory system. Furthermore, the hardware support required for LogTM is complex and difficult to implement.
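The two software-visible strategies contrasted above, namely buffering store data until commit and logging old values for restoration on abort, can be sketched as follows. This is a minimal illustration under stated assumptions and does not reproduce the Hybrid or LogTM implementations: the class names `WriteBufferTransaction` and `UndoLogTransaction` are hypothetical, memory is modeled as a dictionary, and values are logged per store rather than per cache line.

```python
_ABSENT = object()  # marks an address that held no value before the store


class WriteBufferTransaction:
    """Deferred update: stores stay in a private buffer until commit."""

    def __init__(self, memory):
        self.memory = memory
        self.buffer = {}

    def store(self, address, value):
        self.buffer[address] = value  # memory is not touched yet

    def load(self, address):
        # Every load must consult the buffer before falling back to memory.
        if address in self.buffer:
            return self.buffer[address]
        return self.memory.get(address, 0)

    def commit(self):
        self.memory.update(self.buffer)
        self.buffer.clear()

    def abort(self):
        self.buffer.clear()  # nothing to undo in memory


class UndoLogTransaction:
    """Eager update: stores go to memory and old values go to an undo log."""

    def __init__(self, memory):
        self.memory = memory
        self.undo_log = []  # (address, old_value) pairs in store order

    def store(self, address, value):
        self.undo_log.append((address, self.memory.get(address, _ABSENT)))
        self.memory[address] = value

    def load(self, address):
        return self.memory.get(address, 0)

    def commit(self):
        self.undo_log.clear()  # new values are already in place

    def abort(self):
        # Walk the log newest-first so the oldest logged value is restored last.
        for address, old_value in reversed(self.undo_log):
            if old_value is _ABSENT:
                self.memory.pop(address, None)
            else:
                self.memory[address] = old_value
        self.undo_log.clear()
```

The sketch suggests where the cost falls in each case: the deferred-update approach adds work to every load and to commit, whereas the undo-log approach makes commit cheap but requires a software traversal of the log whenever a transaction aborts.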
The VTM protocol proposed by Rajwar et al. (see Rajwar, R., Herlihy, M., Lai, K., Virtualizing Transactional Memory, Proceedings, 32nd International Symposium on Computer Architecture 2005 (ISCA '05), 2005), is another hybrid hardware-software-based technique. It uses hardware to implement transactions that fit in private caches, but maintains, in software, a shared data structure that holds data that has overflowed the private caches. The VTM protocol requires that the cache-coherence protocol be modified in order to maintain coherence on virtual addresses.
Hence, what is needed is a processor that can execute unbounded transactions without the problems of the above-described transactional memory systems.