1. Field of the Invention
The present invention relates to computer memory systems, and in particular to optimizing the performance of a hardware cache.
2. Background of the Related Art
A memory cache is a computer system component that stores small amounts of instructions and/or data for faster read/write access than is provided by larger memory components such as system RAM (random access memory) or a hard disk drive. For example, Level 1 (L1) and Level 2 (L2) cache store data and instructions on behalf of system RAM for fast access by the processor. L1 cache has less storage capacity than L2 cache and is typically built directly into the processor. L1 cache can run at the same speed as the processor, providing the fastest possible access time. L2 cache is typically separate from the processor but provided within the same chip package as the processor. Although L2 cache is slower than L1 cache, it generally has more storage capacity and is still much faster than main memory.
L1 cache typically includes an instruction cache and a data cache. An L1 instruction cache contains a copy of a portion of the instructions in main memory. An L1 data cache contains a copy of a portion of the data in main memory, but some designs allow the data cache to contain a version of the data that is newer than the data in main memory. Such a cache is referred to as a store-in or write-back cache because the newest copy of the data is stored in the data cache and must be written back out to main memory when that cache location is needed to hold a different piece of data or is otherwise flushed.
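The write-back behavior described above can be illustrated with a minimal sketch. It is not taken from any particular processor design. The class name `WriteBackCache`, the dirty flag, the one-value-per-line layout, and the modulo placement rule are all assumptions made for illustration.

```python
class WriteBackCache:
    """Toy write-back (store-in) data cache over a dict-backed main memory.

    Assumptions for illustration only: each cache line holds a single
    value, and an address maps to line (address % num_lines).
    """

    def __init__(self, memory, num_lines):
        self.memory = memory        # backing store: address -> value
        self.num_lines = num_lines
        self.lines = {}             # line index -> [address, value, dirty]

    def _evict(self, index):
        # Write the line back to main memory only if it was modified.
        line = self.lines.pop(index, None)
        if line is not None and line[2]:
            self.memory[line[0]] = line[1]

    def _lookup(self, address):
        index = address % self.num_lines
        line = self.lines.get(index)
        if line is None or line[0] != address:
            # Miss: the location is needed for a different piece of data.
            self._evict(index)
            line = [address, self.memory.get(address, 0), False]
            self.lines[index] = line
        return line

    def read(self, address):
        return self._lookup(address)[1]

    def write(self, address, value):
        line = self._lookup(address)
        line[1] = value
        line[2] = True              # cache copy is now newer than memory

    def flush(self):
        for index in list(self.lines):
            self._evict(index)


memory = {0: 10, 4: 40}
cache = WriteBackCache(memory, num_lines=4)
cache.write(0, 99)                  # main memory still holds the old value
stale_value = memory[0]
cache.read(4)                       # address 4 reuses line 0, forcing a write-back
updated_value = memory[0]
```

In this sketch, main memory is updated only when a modified line is evicted or flushed, which is the property that distinguishes a store-in cache from one that writes every store through to memory.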
Some systems having multiple processors (or processor cores) include a separate L1 cache for each processor, but share a common L2 cache. This is referred to as a shared L2 cache. Because a shared L2 cache may have to handle several read and/or write operations simultaneously from multiple processors and even from multiple threads within the same physical processor, a shared L2 cache is usually more complex than an L2 cache dedicated to a single processor.
Cache memory may be mapped to the main memory in a variety of ways. Examples of cache mapping known in the art include direct-mapped cache, fully associative cache, and N-way set-associative cache. Direct mapping involves logically dividing main memory according to the number of cache lines provided, so that the memory locations in each logical division of main memory share one particular cache line. At the other end of the spectrum, fully associative cache allows any cache line to store the contents of any memory location in main memory. N-way set-associative cache involves a compromise between direct mapping and fully associative mapping, wherein the cache is divided up into multiple “sets” that each contain some number of cache lines (alternatively referred to as “ways”). Typically, set-associative cache structures contain 2, 4 or 8 ways per set. A particular memory address is placed into one and only one set, but can be held in any one of the ways within that set.
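The mapping schemes described above can be sketched as follows. This is a simplified model under stated assumptions, not a description of any specific hardware. The names `set_index`, `tag`, and `SetAssociativeCache` are hypothetical, the set is chosen by a modulo of the block number, and least-recently-used (LRU) replacement is assumed because the text does not name a replacement policy.

```python
def set_index(address, line_size, num_sets):
    # Each address belongs to one and only one set.
    return (address // line_size) % num_sets


def tag(address, line_size, num_sets):
    # The tag distinguishes the memory blocks that share a set.
    return address // (line_size * num_sets)


class SetAssociativeCache:
    """Toy N-way set-associative cache that tracks tags only.

    ways=1 models a direct-mapped cache; num_sets=1 models a fully
    associative cache. LRU replacement is an assumption.
    """

    def __init__(self, num_sets, ways, line_size):
        self.num_sets = num_sets
        self.ways = ways
        self.line_size = line_size
        # Each set lists its resident tags, least recently used first.
        self.sets = [[] for _ in range(num_sets)]

    def access(self, address):
        """Return True on a hit, False on a miss (the line is then filled)."""
        resident = self.sets[set_index(address, self.line_size, self.num_sets)]
        t = tag(address, self.line_size, self.num_sets)
        if t in resident:
            resident.remove(t)
            resident.append(t)      # mark as most recently used
            return True
        if len(resident) == self.ways:
            resident.pop(0)         # evict the least recently used way
        resident.append(t)
        return False


# Addresses 0, 256 and 512 all fall in set 0 when there are 4 sets of 64-byte lines.
two_way = SetAssociativeCache(num_sets=4, ways=2, line_size=64)
two_way_hits = [two_way.access(a) for a in (0, 256, 0, 512, 0, 256)]

direct_mapped = SetAssociativeCache(num_sets=4, ways=1, line_size=64)
direct_hits = [direct_mapped.access(a) for a in (0, 256, 0)]
```

With two ways, addresses 0 and 256 can reside in set 0 at the same time, so the second access to address 0 hits. With one way (direct mapping), the same two addresses keep evicting each other.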