1. Technical Field
The present invention relates in general to eviction of data from caches in a data processing system and in particular to eviction of data from a cache in a data processing system having a multilevel cache hierarchy. Still more particularly, the present invention relates to eviction of data from one cache to a logically in line cache within a data processing system having a multilevel cache hierarchy.
2. Description of the Related Art
Most contemporary data processing system architectures include multiple levels of cache memory within the storage hierarchy. Caches are employed in data processing systems to provide faster access to frequently used data than is possible from system memory, thereby improving overall performance. Caches at any level in the storage hierarchy may be private (reserved for a local processor) or shared (accessible to multiple processors), although typically caches at levels closer to the processors are private. Level one (L1) caches, those logically closest to the processor, are typically implemented as an integral part of the processor and may be bifurcated into separate data and instruction caches. Lower level caches are generally implemented as separate devices, although a level two (L2) cache may be formed within the same silicon die as a processor.
When utilized, multiple cache levels are typically employed in progressively larger sizes with a trade off to progressively longer access latencies. Smaller, faster caches are employed at levels within the storage hierarchy closer to the processor or processors, while larger, slower caches are employed at levels closer to system memory. Logically in line caches within a multilevel cache hierarchy are generally utilized to stage data to and from caches in higher levels of the storage hierarchy. As data is staged or transferred from system memory or caches in lower levels of the storage hierarchy to a cache in a higher level of the storage hierarchy, a replacement policy--typically a least-recently-used replacement policy--is employed to determine which cache locations should be utilized to store the new data. This process, often referred to as "updating" the cache, causes any modified data associated with the cache location selected by the replacement policy (also called a "victim") to be written back to lower levels of the storage hierarchy. The process of writing modified data from a victim to system memory or a lower cache level is called a cast out or eviction.
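The replacement and cast out behavior described above may be illustrated with a minimal sketch. The sketch assumes a small, fully associative cache and models the next lower level of the storage hierarchy as a simple dictionary; the class name `LRUCache` and its methods are hypothetical and are chosen only for illustration, not drawn from any particular implementation.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fully associative cache with LRU replacement (illustrative only)."""

    def __init__(self, capacity, lower_level):
        self.capacity = capacity
        # Assumption: a dict stands in for the next lower cache level or memory.
        self.lower_level = lower_level
        # address -> (data, modified); least recently used entry is first.
        self.lines = OrderedDict()

    def _fill(self, address, data, modified):
        # Select a victim only when a new line must displace an existing one.
        if address not in self.lines and len(self.lines) >= self.capacity:
            victim, (victim_data, victim_modified) = self.lines.popitem(last=False)
            if victim_modified:
                # Cast out: modified victim data is written to the lower level.
                self.lower_level[victim] = victim_data
        self.lines[address] = (data, modified)
        self.lines.move_to_end(address)  # mark as most recently used

    def read(self, address):
        if address in self.lines:
            self.lines.move_to_end(address)
            return self.lines[address][0]
        data = self.lower_level.get(address, 0)
        self._fill(address, data, modified=False)
        return data

    def write(self, address, data):
        self._fill(address, data, modified=True)
```

In this sketch an unmodified victim is simply discarded, since the lower level already holds a valid copy, while a modified victim is written back before its location is reused.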
Accessing system memory generally has a significantly longer latency than that associated with accessing any cache in the storage hierarchy. For example, accessing system memory may require up to four times as many processor cycles as are required to access a level three (L3) cache, and up to 10-15 times as many processor cycles as are required to access an L2 cache. Therefore, data evicted from a cache in any cache hierarchy level other than the lowest is conventionally written to the next lower level of the cache hierarchy rather than to system memory. For example, data cast out of an L2 cache is typically written to an L3 cache via a private bus between the L2 and L3 caches rather than writing the data all the way to system memory. Although latency for a particular operation is minimized in this fashion, such evictions have the effect of keeping the modified data within a localized portion of the storage hierarchy not generally accessible to other devices in a multiprocessor system.
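The latency ratios above may be made concrete with a short calculation. The sketch below takes the ratios at their stated upper bounds (four times for the L3 cache, 10 to 15 times for the L2 cache); the function name and the figure of 300 processor cycles for a system memory access are hypothetical values used only for illustration.

```python
def relative_latencies(memory_cycles):
    """Cache latencies implied by the stated ratios, taken at their upper bounds."""
    return {
        "memory": memory_cycles,
        # System memory takes up to 4x as many cycles as an L3 access.
        "L3": memory_cycles / 4,
        # System memory takes up to 10-15x as many cycles as an L2 access.
        "L2_range": (memory_cycles / 15, memory_cycles / 10),
    }


# Hypothetical example: a 300-cycle system memory access.
example = relative_latencies(300)
```

Under these assumed figures, a cast out that stops at the L3 cache would cost roughly 75 cycles rather than 300, which is the motivation for writing evicted data to the next lower cache level.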
In systems where data is evicted from an L2 cache to an L3 cache via a private bus connecting the two caches, error correction code (ECC) checking is required on the L3 directory and cache to ensure that data integrity is preserved. This requirement increases the number of bits required for the bus connecting the two caches. For example, if a 64 bit data bus is employed for transferring data between an L2 and L3 cache, an additional 8 bits may be required for ECC checking, resulting in a 72 bit bus. This larger bus consumes additional area within the silicon and may need to be operated at a lower frequency than the 64 bit bus.
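The 8 additional bits in the example above are consistent with a single-error-correcting, double-error-detecting (SECDED) Hamming code, although the particular ECC scheme is an assumption made here for illustration and is not specified above. A minimal sketch of the check bit count under that assumption follows; the function names are hypothetical.

```python
def secded_check_bits(data_bits):
    """Check bits needed by a SECDED Hamming code (assumed ECC scheme)."""
    r = 0
    # Single-error correction needs 2**r >= data_bits + r + 1.
    while (1 << r) < data_bits + r + 1:
        r += 1
    # One extra overall parity bit adds double-error detection.
    return r + 1


def ecc_bus_width(data_bits):
    """Total bus width when check bits travel alongside the data bits."""
    return data_bits + secded_check_bits(data_bits)
```

For a 64 bit data bus this yields 8 check bits and a 72 bit total width, matching the figures given above.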
It would be desirable, therefore, to be capable of evicting data from one cache level to a lower level cache without requiring either a private bus between the two caches or ECC checking of data transfers between the two caches. It would further be advantageous to provide a mechanism for such data evictions which allows the evictions to be visible to the snoop logic of other devices in a multiprocessor system.