1. Field of the Invention
This invention relates to microprocessors and, more particularly, to cache subsystems within a microprocessor.
2. Description of the Related Art
Typical computer systems may contain one or more microprocessors which may be connected to one or more system memories. The processors may execute code and operate on data that is stored within the system memories. It is noted that as used herein, the term “processor” is synonymous with the term “microprocessor.” To facilitate the fetching and storing of instructions and data, a processor typically employs some type of memory system. In addition, to expedite accesses to the system memory, one or more cache memories may be included in the memory system. For example, some microprocessors may be implemented with one or more levels of cache memory. As used herein, the level of a cache refers to the cache's proximity to the microprocessor core relative to another cache's proximity to the microprocessor core. For example, a level one (L1) cache, which is closer to the core, is considered to be at a higher level than a level two (L2) cache. In a typical microprocessor, an L1 cache and an L2 cache may be used, while some newer processors may also use a level three (L3) cache. In many legacy processors, the L1 cache may reside on-chip and the L2 cache may reside off-chip. However, to further improve memory access times, many newer processors may use an on-chip L2 cache.
The L2 cache is often implemented as a unified cache, while the L1 cache may be implemented as a separate instruction cache and a data cache. The L1 data cache is used to hold the data most recently read or written by the software running on the microprocessor. The L1 instruction cache is similar to the L1 data cache except that it holds the instructions executed most frequently. It is noted that for convenience the L1 instruction cache and the L1 data cache may be referred to simply as the L1 cache, as appropriate. The L2 cache may be used to hold instructions and data that do not fit in the L1 cache. The L2 cache may be exclusive (i.e., it stores information that is not in the L1 cache) or it may be inclusive (i.e., it stores a copy of the information that is in the L1 cache).
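By way of illustration only, the distinction between an inclusive and an exclusive L2 cache may be expressed as a pair of set relations over the cache lines held at each level. The function names and the representation of each cache as a collection of line addresses are assumptions made for this sketch and do not describe any particular hardware.

```python
def is_inclusive(l1_lines, l2_lines):
    """True if every line held in L1 also has a copy in L2."""
    return set(l1_lines) <= set(l2_lines)


def is_exclusive(l1_lines, l2_lines):
    """True if no line is held in both L1 and L2 at the same time."""
    return set(l1_lines).isdisjoint(l2_lines)
```

For instance, an L1 holding lines {1, 2} together with an L2 holding lines {1, 2, 3} satisfies the inclusive relation, whereas the same L1 together with an L2 holding lines {3, 4} satisfies the exclusive relation.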
During a read or write to cacheable memory, the L1 cache is first checked to see if the requested information (e.g., instruction or data) is available. If the information is available, a hit occurs. If the information is not available, a miss occurs. If a miss occurs, then the L2 cache may be checked. Thus, when a miss occurs in the L1 cache but a hit occurs in the L2 cache, the information may be transferred from the L2 cache to the L1 cache in a cache line fill. As described below, the amount of information transferred between the L2 and the L1 caches is typically a cache line. In addition, depending on the space available in the L1 cache, a cache line may be evicted from the L1 cache to make room for the new cache line and may be subsequently stored in the L2 cache. If the cache line that is being evicted is in a modified state, the microprocessor may perform a cache line write-back to system memory when it performs the cache line fill. These write-backs help maintain coherency between the caches and system memory.
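By way of illustration only, the access sequence described above may be sketched in software as follows. The sketch assumes small, fully associative caches with least-recently-used replacement and an L2 cache that retains its copy of a line after an L1 fill; the class name `Cache`, the function name `access`, and the event strings are hypothetical and are not drawn from any particular processor.

```python
from collections import OrderedDict


class Cache:
    """Tiny fully associative cache with LRU replacement (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        # Maps line address -> modified flag; least recently used entry first.
        self.lines = OrderedDict()

    def lookup(self, line):
        """Return True on a hit and mark the line as most recently used."""
        if line in self.lines:
            self.lines.move_to_end(line)
            return True
        return False

    def fill(self, line, modified=False):
        """Install a line; return the evicted (line, modified) pair or None."""
        victim = None
        if line not in self.lines and len(self.lines) >= self.capacity:
            victim = self.lines.popitem(last=False)
        self.lines[line] = modified
        self.lines.move_to_end(line)
        return victim


def access(l1, l2, line, write=False):
    """Perform one read or write and return the list of resulting events."""
    events = []
    if l1.lookup(line):
        events.append("L1 hit")
    else:
        events.append("L1 miss")
        if l2.lookup(line):
            events.append("L2 hit")
        else:
            events.append("L2 miss, fetch from memory")
            l2.fill(line)  # any L2 victim is dropped; L2 lines are clean here
        victim = l1.fill(line)
        events.append("L1 line fill")
        if victim is not None:
            victim_line, modified = victim
            if modified:
                # A modified victim is written back to system memory.
                events.append(f"write back line {victim_line}")
            l2.fill(victim_line)  # the evicted line is then kept in L2
            events.append(f"evict line {victim_line} to L2")
    if write:
        l1.lines[line] = True  # mark the line as modified in L1
    return events
```

With a one-line L1 cache, writing line 0 and then reading line 1 forces the modified line 0 out of the L1 cache, so the second access reports a write-back followed by an eviction to the L2 cache. A later read of line 0 then misses in the L1 cache and hits in the L2 cache.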
Memory subsystems typically use some type of cache coherence mechanism to ensure that accurate data is supplied to a requester. The cache coherence mechanism typically uses the size of the data transferred in a single request as the unit of coherence. The unit of coherence is commonly referred to as a cache line. In some processors, for example, a given cache line may be 64 bytes, while some processors employ a cache line of 32 bytes. In yet other processors, other numbers of bytes may be included in a single cache line. If a request misses in the L1 and L2 caches, an entire cache line of multiple words is transferred from main memory to the L2 and L1 caches.
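By way of illustration only, the relationship between a byte address and the cache line containing it may be sketched as follows. The function name `split_address` is hypothetical, and the sketch assumes a power-of-two line size such as the 64-byte and 32-byte examples given above.

```python
def split_address(addr, line_size=64):
    """Split a byte address into (line address, byte offset within the line)."""
    if line_size <= 0 or line_size & (line_size - 1):
        raise ValueError("line_size must be a power of two")
    offset_bits = line_size.bit_length() - 1  # 6 bits for a 64-byte line
    line_address = addr >> offset_bits        # identifies the unit of coherence
    offset = addr & (line_size - 1)           # selects a byte within the line
    return line_address, offset
```

Under these assumptions, addresses 0x1200 and 0x1234 fall within the same 64-byte line and would therefore be transferred and kept coherent together, whereas with a 32-byte line they fall within different lines.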
Generally speaking, a lower-level cache such as an L2 cache, for example, may maintain coherency information for a higher-level cache such as an L1 cache. Inclusive cache implementations typically require back-probes of the higher-level cache in response to a variety of lower-level cache accesses. For example, the L2 cache may perform a “back-probe” of the L1 cache in response to receiving a probe to determine if a copy of an L2 cache line exists in the L1 cache. This back-probing of the higher-level cache may reduce the available bandwidth of the cache bus and thus may increase the latency associated with cache accesses.
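By way of illustration only, the cost of back-probing in an inclusive arrangement may be sketched as follows. The class name `InclusiveL2`, its methods, and the `back_probes` counter are hypothetical; the sketch assumes that the L2 cache holds no information about which of its lines are also present in the L1 cache, so that every probe hitting in the L2 cache requires a lookup in the L1 cache.

```python
class InclusiveL2:
    """Inclusive L2 that must back-probe L1 on every probe hit (illustrative)."""

    def __init__(self):
        self.l2_lines = set()
        self.l1_lines = set()  # stands in for a separate L1 cache
        self.back_probes = 0   # number of L1 lookups caused by probes

    def fill(self, line, into_l1=True):
        """Install a line in L2 and, optionally, a copy in L1."""
        self.l2_lines.add(line)
        if into_l1:
            self.l1_lines.add(line)

    def probe(self, line):
        """Handle an incoming probe; return (hit in L2, copy present in L1)."""
        if line not in self.l2_lines:
            # Inclusion guarantees the line is not in L1 either.
            return (False, False)
        # L2 cannot tell whether L1 holds a copy, so it must ask L1.
        self.back_probes += 1
        return (True, line in self.l1_lines)
```

In this sketch each probe that hits in the L2 cache consumes one L1 lookup even when the L1 cache holds no copy of the line, which corresponds to the bandwidth and latency cost noted above.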