Referring to FIG. 1, a computer system 10 typically includes at least a processor 12 having a central processing unit (“CPU”) 14 for processing data (e.g., executing instructions, performing arithmetic operations). Generally, data needed for CPU 14 operations are directly or indirectly supplied from a main memory 16 operatively connected to the processor 12 by, for example, a bus (i.e., a collection of conductive pathways over which data propagates).
Because the speed at which data is obtained from main memory 16 (i.e., “memory latency”) is typically significantly slower than the speed at which the CPU 14 is capable of processing data, there is a potential for much of the CPU's 14 processing time to be wasted. In other words, the CPU 14 may spend a considerable amount of its processing time “waiting” on the main memory 16.
At least partly in order to counteract the effects associated with large, slow main memories, smaller, faster memories known and referred to as “cache” memories are often used. A cache memory generally contains data (and addresses thereof) of memory locations that are frequently used or have been recently used by a requesting entity (e.g., a processor). Cache memories are searched for needed data prior to searching for that data in main memory.
Still referring to FIG. 1, the CPU 14 is operatively connected to an “on-chip” cache memory 18 (cache memory 18 said to be “on-chip” due to it being disposed on processor 12). When the CPU 14 needs data for a particular processing operation, the CPU 14 sends a request for the data first to the “on-chip” cache memory 18. This request may be in the form of an address of the needed data. If an address of the requested data is found in the “on-chip” cache memory 18, a “cache hit” is said to have occurred, and the data associated with the searched-for address is returned to the CPU 14. If the address of the requested data is not found in the “on-chip” cache memory 18, a “cache miss” is said to have occurred, in which case, the request from the CPU 14 is effectively forwarded to an “off-chip” cache memory 20 operatively connected to the processor 12 (cache memory 20 said to be “off-chip” due to it being external to processor 12). If the address of the requested data is found in the “off-chip” memory 20 (i.e., a “cache hit” occurs), the data associated with the searched-for address is returned to the CPU 14. Otherwise, if the address of the requested data is not found in either the “on-chip” cache memory 18 or the “off-chip” cache memory 20, the requested data is retrieved from the relatively slow main memory 16.
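By way of illustration only, the search order described above may be sketched as follows. The sketch assumes that each memory is modeled as a simple mapping from address to data; the function name `lookup` and the example contents are hypothetical and are not taken from FIG. 1.

```python
def lookup(address, on_chip, off_chip, main_memory):
    """Return (data, source) for an address, searching the fastest level first."""
    if address in on_chip:       # "cache hit" in the on-chip cache memory
        return on_chip[address], "on-chip"
    if address in off_chip:      # on-chip "cache miss", off-chip "cache hit"
        return off_chip[address], "off-chip"
    # Miss in both cache memories: use the relatively slow main memory.
    return main_memory[address], "main memory"


# Hypothetical contents, keyed by address.
main_memory = {0x100: "a", 0x200: "b", 0x300: "c"}
off_chip = {0x200: "b"}
on_chip = {0x100: "a"}

results = [lookup(a, on_chip, off_chip, main_memory) for a in (0x100, 0x200, 0x300)]
```

In this sketch, the three requests are served by the “on-chip” cache memory, the “off-chip” cache memory, and main memory, respectively.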
Those skilled in the art will note that there are various types of cache memories. FIG. 2 shows a type of commonly used cache memory 30. A description of the cache memory 30 is set forth below with reference to an address of data requested by a requesting entity (e.g., a processor), the address having 32 bits.
The cache memory 30 is formed of a tag store 32 and a data store 34. The tag store 32 is formed of x sets each having n cache blocks or “ways” (such a cache memory said to be “n-way set-associative”). A set is selected based on the “index” field (i.e., addr[12:6]) of the address of the requested data. Once a set is selected using the “index,” tags in the ways of the selected set are compared against a “tag” field (i.e., addr[31:13]) of the address of the requested data (this process known and referred to as a “tag match”). If there is a match between one of the returned tags and the “tag” field of the address of the requested data, data from a corresponding way in the data store 34 is returned, where this corresponding way is part of a set selected from among y sets using the “index” and “offset” fields (i.e., addr[8:3] for 8 bytes of data) of the address of the requested data.
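The set selection and “tag match” described above may be sketched as follows. This is a simplified model under stated assumptions: the number of ways (`NUM_WAYS = 4`) is an assumed value of n, each way of the data store holds a single data item rather than being further selected by the “offset” field, and all class and function names are hypothetical.

```python
TAG_SHIFT = 13      # "tag" field   = addr[31:13] (19 bits)
INDEX_SHIFT = 6     # "index" field = addr[12:6]  (7 bits, so 128 sets)
INDEX_BITS = 7
TAG_BITS = 19
NUM_WAYS = 4        # assumed value of n; not specified in the text


def split_address(addr):
    """Split a 32-bit address into its (tag, index) fields."""
    tag = (addr >> TAG_SHIFT) & ((1 << TAG_BITS) - 1)
    index = (addr >> INDEX_SHIFT) & ((1 << INDEX_BITS) - 1)
    return tag, index


class SetAssociativeCache:
    def __init__(self):
        num_sets = 1 << INDEX_BITS
        # tag_store[set][way] holds a tag or None; data_store has the same layout.
        self.tag_store = [[None] * NUM_WAYS for _ in range(num_sets)]
        self.data_store = [[None] * NUM_WAYS for _ in range(num_sets)]

    def fill(self, addr, way, data):
        """Install data for addr in the given way of the set chosen by the index."""
        tag, index = split_address(addr)
        self.tag_store[index][way] = tag
        self.data_store[index][way] = data

    def read(self, addr):
        """Return (hit, data); data is None on a cache miss."""
        tag, index = split_address(addr)
        for way, stored_tag in enumerate(self.tag_store[index]):
            if stored_tag == tag:                      # "tag match"
                return True, self.data_store[index][way]
        return False, None


cache = SetAssociativeCache()
cache.fill(0x12340, way=2, data="payload")
hit = cache.read(0x12340)     # same tag and index: hit
miss = cache.read(0x14340)    # same index, different tag: miss
```

Here the address 0x12340 decodes to tag 9 and index 13, whereas 0x14340 decodes to tag 10 and the same index 13, so the second read selects the same set but fails the tag match.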
In order to make room for a new entry on a “cache miss,” the cache memory generally has to “evict” one of the existing entries. The heuristic that the cache memory uses to choose the entry to evict is known and referred to as the “replacement policy.” The fundamental problem with any replacement policy is that it must predict which existing cache entry is least likely to be used in the future. There are a variety of replacement policies to choose from and no particular one is perfect. One popular replacement policy replaces the least recently used (“LRU”) entry.
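As one possible illustration of an LRU replacement policy, the following sketch models a single set with a fixed number of ways. The class name `LRUSet`, the `load` callback, and the use of an ordered mapping to track recency are assumptions made for the example and do not describe any particular hardware implementation.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set with a fixed number of ways and LRU replacement."""

    def __init__(self, num_ways):
        self.num_ways = num_ways
        self.entries = OrderedDict()    # tag -> data, least recently used first

    def access(self, tag, load):
        """Return (data, evicted_tag); `load` supplies the data on a miss."""
        if tag in self.entries:
            self.entries.move_to_end(tag)    # mark as most recently used
            return self.entries[tag], None
        evicted = None
        if len(self.entries) >= self.num_ways:
            # Set is full: evict the least recently used entry.
            evicted, _ = self.entries.popitem(last=False)
        self.entries[tag] = load(tag)
        return self.entries[tag], evicted


two_way = LRUSet(num_ways=2)
two_way.access("A", str.lower)
two_way.access("B", str.lower)
two_way.access("A", str.lower)                      # "A" becomes most recently used
data, evicted = two_way.access("C", str.lower)      # set is full, so "B" is evicted
```

In this usage, entry “B” is evicted rather than “A” because “A” was used more recently.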
When data is written to the cache memory, the data must at some point be written to main memory as well. The timing of this write is controlled by what is known and referred to as the “write policy.” In a “write-through” cache memory, every write to the cache memory causes a write to main memory. Alternatively, in a “write-back” cache memory, writes are not immediately mirrored to main memory. Instead, the cache memory tracks which locations have been written over (these locations are marked “dirty”). The data in these locations is written back to main memory when that data is evicted from the cache memory. For this reason, a “cache miss” in a “write-back” cache memory will often require two memory accesses to service: one to write the evicted “dirty” data back to main memory and one to retrieve the needed data from main memory.
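The difference between the two write policies may be sketched as follows. The sketch assumes main memory is modeled as a mapping from address to value; the class names and the `memory_writes` counter are hypothetical and are included only to make the timing of writes to main memory visible.

```python
class WriteThroughCache:
    def __init__(self, memory):
        self.memory = memory
        self.lines = {}
        self.memory_writes = 0

    def write(self, addr, value):
        self.lines[addr] = value
        self.memory[addr] = value      # every cache write also goes to main memory
        self.memory_writes += 1


class WriteBackCache:
    def __init__(self, memory):
        self.memory = memory
        self.lines = {}
        self.dirty = set()             # addresses written since they were loaded
        self.memory_writes = 0

    def write(self, addr, value):
        self.lines[addr] = value
        self.dirty.add(addr)           # mark "dirty"; main memory is not updated yet

    def evict(self, addr):
        value = self.lines.pop(addr)
        if addr in self.dirty:         # write back only if the line was modified
            self.memory[addr] = value
            self.memory_writes += 1
            self.dirty.discard(addr)


wt_memory, wb_memory = {}, {}
wt, wb = WriteThroughCache(wt_memory), WriteBackCache(wb_memory)
for value in (1, 2, 3):                # three writes to the same address
    wt.write(0x40, value)
    wb.write(0x40, value)
writes_before_evict = wb.memory_writes
wb.evict(0x40)
```

In this usage, the “write-through” model performs three writes to main memory, whereas the “write-back” model performs none until the eviction and then a single write of the final value.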
Those skilled in the art will note that the type of cache memory 30 shown in FIG. 2 may be a type of cache memory used to implement, for example, the “on-chip” cache memory 18 shown in FIG. 1. On the other hand, the “off-chip” cache memory 20 shown in FIG. 1 is often implemented as a “victim” cache (i.e., a cache memory that holds cache lines evicted from a higher-level cache memory).
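A “victim” cache of the kind mentioned above may be sketched as follows. This is a minimal model under several assumptions not stated in the text: both cache memories use LRU ordering, a line found in the victim cache is moved back into the higher-level cache, and all names and sizes are hypothetical.

```python
from collections import OrderedDict


class CacheWithVictim:
    def __init__(self, primary_size, victim_size, memory):
        self.primary = OrderedDict()    # addr -> data, least recently used first
        self.victim = OrderedDict()     # lines evicted from the primary cache
        self.primary_size = primary_size
        self.victim_size = victim_size
        self.memory = memory

    def _install(self, addr, data):
        """Place a line in the primary cache, moving its LRU line to the victim cache."""
        if len(self.primary) >= self.primary_size:
            old_addr, old_data = self.primary.popitem(last=False)
            if len(self.victim) >= self.victim_size:
                self.victim.popitem(last=False)     # drop the oldest victim line
            self.victim[old_addr] = old_data
        self.primary[addr] = data

    def read(self, addr):
        if addr in self.primary:
            self.primary.move_to_end(addr)
            return self.primary[addr], "primary hit"
        if addr in self.victim:
            data = self.victim.pop(addr)            # move the line back up
            self._install(addr, data)
            return data, "victim hit"
        data = self.memory[addr]                    # miss in both cache memories
        self._install(addr, data)
        return data, "miss"


cache = CacheWithVictim(primary_size=1, victim_size=1, memory={1: "x", 2: "y", 3: "z"})
trace = [cache.read(a)[1] for a in (1, 2, 1, 3, 2)]
```

In this usage, the third read finds address 1 in the victim cache because that line was evicted from the primary cache by the second read.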