1. Field of the Invention
This application relates to microprocessor design and, specifically, to cache memory systems in microprocessors.
2. Background Art
The performance of applications such as database and web servers (hereafter "commercial workloads") is an increasingly important aspect in high-performance servers. Data-dependent computations, lack of instruction-level parallelism and large memory stalls contribute to the poor performance of commercial workloads in traditional high-end microprocessors.
Two promising approaches for improving the performance of commercial workloads are lower-latency memory systems and the exploitation of thread-level parallelism. Increased density and transistor counts enable microprocessor architectures with integrated caches and memory controllers, which reduce overall memory latency. The relatively independent transactions or queries initiated by individual clients give rise to thread-level parallelism that can be exploited at the chip level. Chip multiprocessing (CMP) and simultaneous multithreading (SMT) are the two most promising approaches to exploit such thread-level parallelism. SMT enhances a traditional wide-issue out-of-order processor core with the ability to issue instructions from different threads in the same cycle. CMP consists of integrating multiple CPU cores (and corresponding level-one caches) into a single chip.
The main advantage of the CMP approach is that it enables the use of simpler CPU cores, thereby reducing overall design complexity. A CMP approach naturally lends itself to a modular design, and can benefit from the on-chip two-level caching hierarchy. In the on-chip two-level caching hierarchy, each first-level cache is associated with and is private to a particular CPU and the second-level cache is shared by the CPUs. However, conventional CMP designs with on-chip two-level caching require the contents of first-level caches to also be present in the second-level cache, an approach known as the inclusion or subset property. With an inclusive two-level caching implementation, an increase in the number of CPUs per die increases the ratio between the aggregate first-level cache capacity and the second-level cache capacity. When this ratio approaches 1.0, nearly half of the on-chip cache capacity can be wasted with duplicate copies of data. Hence, a design that does not enforce inclusion (e.g., an exclusive design) is advantageous and often preferred over an inclusive two-level caching design.
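The capacity argument above can be illustrated with a short calculation. The following is a minimal sketch, assuming a strictly inclusive hierarchy in which every first-level line is also held in the second-level cache and all first-level lines are valid; the function name and the example sizes are hypothetical and are not taken from any particular design.

```python
def duplicated_fraction(l1_total_kb: float, l2_kb: float) -> float:
    """Fraction of total on-chip cache capacity that holds duplicate data
    when every first-level line is also present in the second-level cache."""
    if l1_total_kb > l2_kb:
        # Strict inclusion cannot hold if the L2 is smaller than the L1s combined.
        raise ValueError("inclusion requires L2 capacity >= aggregate L1 capacity")
    return l1_total_kb / (l1_total_kb + l2_kb)


# Hypothetical example: 8 CPUs with 128 KB of first-level cache each,
# sharing a 1024 KB second-level cache (aggregate L1 / L2 ratio of 1.0).
wasted = duplicated_fraction(8 * 128, 1024)
print(f"duplicated share of on-chip capacity: {wasted:.0%}")
```

With a ratio of 1.0 the duplicated share is one half of all on-chip cache capacity, which is the case the paragraph above describes.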
Exclusive two-level caching has been previously proposed in the context of single processor chips. An example of exclusive two-level caching implemented in a single processor is provided in U.S. Pat. No. 5,386,547, issued to Norman P. Jouppi on Jan. 31, 1995, which is incorporated herein by reference. This invention is the first to address exclusive two-level caching for CMP systems. This invention also describes new mechanisms to effectively manage a two-level exclusive cache hierarchy for a CMP system.
However, even with exclusive two-level caching, there are performance issues to be addressed in CMP design. In particular, there is a need to improve mechanisms for effective management of exclusive two-level caching in CMP systems. The present invention addresses these and related issues.
Hence, in accordance with the purpose of the invention, as embodied and broadly described herein, the invention relates to chip multiprocessor (CMP) design. In particular, the present invention provides a system and method that maximizes the use of on-chip cache memory capacity in CMP systems. The system and method are realized with a combination of features. One such feature is a relaxed subset property (inclusion) requirement. Relaxing this requirement forms an exclusive cache hierarchy that minimizes data replication and on-chip data traffic without incurring an increased second-level hit latency or occupancy. Another aspect of the combination involves maintaining in the second-level cache a duplicate tag-state structure of all (per-CPU) first-level caches in order to allow a substantially simultaneous lookup for data in the first-level and second-level tag-state arrays.
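The duplicate tag-state structure can be pictured with a small software model. The following is a minimal sketch only: the class name, the use of sets keyed by line address, and the return format are assumptions made for illustration, and the sequential Python loop merely stands in for a lookup that hardware would perform on all arrays at substantially the same time.

```python
class SharedL2TagStore:
    """Toy model of second-level tags plus a duplicate copy of each CPU's
    first-level tags, kept together at the second-level cache."""

    def __init__(self, num_cpus):
        self.l2_tags = set()                                  # lines held in the L2
        self.dup_l1_tags = [set() for _ in range(num_cpus)]   # per-CPU L1 tag copies

    def lookup(self, line_addr):
        """Probe the L2 tags and every duplicate L1 tag array for one address.

        Returns (in_l2, l1_holders), where l1_holders lists the CPUs whose
        first-level cache currently holds the line.
        """
        in_l2 = line_addr in self.l2_tags
        l1_holders = [cpu for cpu, tags in enumerate(self.dup_l1_tags)
                      if line_addr in tags]
        return in_l2, l1_holders


# Hypothetical usage: CPU 2 holds line 0x40 in its L1; line 0x80 is only in the L2.
store = SharedL2TagStore(num_cpus=4)
store.dup_l1_tags[2].add(0x40)
store.l2_tags.add(0x80)
print(store.lookup(0x40), store.lookup(0x80))
```

Because a single probe reports both the second-level hit status and the set of first-level holders, a request that misses in the second-level cache can be directed to a first-level copy without a separate search.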
An additional aspect involves extending the state information to include an ownership indication in addition to the data validity/existence indication and the data shared/exclusive indication. The ownership indication is used in the exclusive two-level cache hierarchy to help orchestrate write-backs to the second-level cache (i.e., L2 fills). Another aspect involves associating a single owner with each cache line in order to eliminate redundant write-backs of evicted data to the second-level cache. Namely, at any given time in the lifetime of a cache line in the CMP chip, only one of its copies can be the owner copy.
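One way to picture the extended state information is as a small set of flags per cached copy. The following is a minimal sketch under stated assumptions: the flag names, their bit values, and the helper functions are hypothetical, since the text above specifies only that an ownership indication accompanies the validity and shared/exclusive indications.

```python
from enum import Flag


class LineState(Flag):
    """Hypothetical per-copy state bits."""
    INVALID = 0
    VALID = 1       # data validity/existence indication
    EXCLUSIVE = 2   # set: exclusive copy; clear: shared copy
    OWNER = 4       # this copy is responsible for the eventual write-back


def owners(copies):
    """Return the holders whose copy carries the ownership indication."""
    return [holder for holder, state in copies.items()
            if LineState.OWNER in state]


def has_single_owner(copies):
    """Check that exactly one valid copy of a line is the owner copy."""
    valid = [h for h, s in copies.items() if LineState.VALID in s]
    return not valid or len(owners(copies)) == 1


# Hypothetical example: two first-level caches share a line; cpu0 owns it.
copies = {"cpu0": LineState.VALID | LineState.OWNER, "cpu1": LineState.VALID}
print(owners(copies), has_single_owner(copies))
```

Keeping the ownership indication separate from the shared/exclusive indication allows a shared line to have several valid copies while still designating exactly one of them to perform the write-back.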
Finally, the present invention provides policy guidelines for administering the ownership and write-back aspects, as the following guidelines exemplify: 1) a first-level cache whose miss finds no other copy of the requested cache line becomes the owner of the cache line; 2) a first-level cache whose miss does not find a copy of the cache line in the second-level cache but finds it in one or more of the other first-level caches receives that cache line from the previous owner and becomes the new owner; 3) a first-level cache that replaces a cache line is informed by the second-level cache whether it is the owner, in which case it issues a second-level cache fill; 4) whenever the second-level cache has a copy of the cache line, it is the owner, and a first-level cache miss that hits in the second-level cache without invalidating it (i.e., not a write miss) does not steal ownership from the second-level cache; and 5) whenever the second-level cache needs to evict a cache line that is additionally present in one or more first-level caches, the second-level cache arbitrarily selects one of these first-level caches as the new owner.
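The five guidelines above can be sketched as a small state model for a single cache line. This is an illustrative sketch only, not the claimed implementation: the class and function names are hypothetical, the invalidation of other copies on a write miss is an assumption borrowed from ordinary coherence practice, and the "arbitrary" choice in guideline 5 is made deterministic (lowest CPU number) purely so the example is repeatable.

```python
L2 = "L2"  # marker used when the second-level cache is the owner


class Line:
    """On-chip status of one cache line (hypothetical model)."""
    def __init__(self):
        self.in_l2 = False        # a copy is present in the second-level cache
        self.l1_holders = set()   # CPUs whose first-level cache holds a copy
        self.owner = None         # a CPU number, L2, or None if not on chip


def l1_miss(line, cpu, write=False):
    """First-level miss by `cpu` (guidelines 1, 2 and 4)."""
    if write:
        # Assumption: a write miss invalidates all other on-chip copies.
        line.in_l2 = False
        line.l1_holders.clear()
    if line.in_l2:
        line.owner = L2           # guideline 4: a read hit in L2 leaves L2 as owner
    else:
        line.owner = cpu          # guidelines 1 and 2: requester becomes the owner
    line.l1_holders.add(cpu)


def l1_evict(line, cpu):
    """First-level replacement (guideline 3). Returns True if an L2 fill is issued."""
    line.l1_holders.discard(cpu)
    if line.owner == cpu:
        line.in_l2 = True         # only the owner writes the line back
        line.owner = L2
        return True
    return False                  # non-owner copies are dropped silently


def l2_evict(line):
    """Second-level eviction (guideline 5)."""
    line.in_l2 = False
    # Assumption: pick the lowest-numbered holder; the text allows any choice.
    line.owner = min(line.l1_holders) if line.l1_holders else None


# Hypothetical trace for one line.
line = Line()
l1_miss(line, 0)                  # guideline 1: CPU 0 becomes the owner
l1_miss(line, 1)                  # guideline 2: ownership passes to CPU 1
print(l1_evict(line, 0))          # non-owner eviction: no L2 fill
print(l1_evict(line, 1))          # owner eviction: L2 fill, L2 becomes the owner
```

In the trace, only one of the two first-level evictions produces a write-back, which is the redundancy the single-owner rule is meant to remove.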
Advantages of the invention will be understood by those skilled in the art, in part, from the description that follows. Advantages of the invention will be realized and attained from practice of the invention disclosed herein.