1. Field of the Invention
The present invention relates to the design of cache memories within computer systems. More specifically, the present invention relates to a method and apparatus for decoupling a tag access from a corresponding data access within a cache memory.
2. Related Art
As processor clock speeds continue to increase at an exponential rate, computer system designers are coming under increasing pressure to perform computational operations at faster clock rates. This can be accomplished by “pipelining” computational operations, which involves dividing each computational operation into a number of smaller operations, each of which can be performed within a single clock cycle. Pipelining allows a number of consecutive computational operations to be processed concurrently by feeding them in lockstep through a set of pipeline stages that perform the smaller operations.
One challenge in designing a pipelined computer system is to efficiently handle variable access times to cache memory. In a typical computer system, one or more stages of the pipeline are dedicated to accessing cache memory to perform a load operation or a store operation. Unfortunately, cache access times can vary greatly depending upon whether the cache access generates a cache hit or a cache miss.
Even during a cache hit, a number of circumstances can cause a cache access to be delayed. For example, when a cache line is returned during a cache miss operation, a cache fill operation takes place to load the cache line into a data array portion of the cache. Unfortunately, this cache fill operation can conflict with a current cache access from the pipeline, causing the current cache access to stall. In another example, a read-after-write (RAW) hazard may arise because a load operation from the pipeline is directed to a cache line with a pending store operation. In this case, the load operation must wait until the pending store operation completes to ensure that the load operation returns the most current value from the cache line.
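The read-after-write check described above can be illustrated with a minimal sketch. The 64-byte line size, the list of pending store addresses, and the function names are assumptions made for illustration only; they do not come from any particular cache design.

```python
# Assumed line size for illustration only.
LINE_BYTES = 64

def line_of(addr):
    """Return the cache line number that contains a byte address."""
    return addr // LINE_BYTES

def load_must_wait(load_addr, pending_store_addrs):
    """True if the load targets a cache line that has a pending store."""
    target = line_of(load_addr)
    return any(line_of(a) == target for a in pending_store_addrs)

# Hypothetical stores that have not yet completed.
pending_stores = [0x100, 0x240]

raw_hazard = load_must_wait(0x110, pending_stores)  # same line as the store to 0x100
no_hazard = load_must_wait(0x180, pending_stores)   # line with no pending store
```

In this sketch a load that shares a cache line with any pending store is held back, which is the conservative behavior the paragraph describes.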
In order to alleviate the above-described problems, some caches have been designed so that accesses to the tag array of the cache memory are decoupled from accesses to the data array of the cache memory. Note that a typical cache memory performs a tag lookup into a tag array to compare one or more tags from the tag array with a tag portion of the address. This allows the cache to determine if the desired cache line is located in the cache.
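As a rough illustration of the tag lookup just described, the following sketch splits an address into tag, set index, and byte offset, then compares the tag against each way of the selected set. The cache geometry (64-byte lines, 4 sets, 2 ways) and all names are hypothetical choices for the example.

```python
# Hypothetical geometry, chosen only for illustration.
LINE_BYTES = 64
NUM_SETS = 4
NUM_WAYS = 2

def split_address(addr):
    """Split a byte address into (tag, set_index, byte_offset)."""
    offset = addr % LINE_BYTES
    set_index = (addr // LINE_BYTES) % NUM_SETS
    tag = addr // (LINE_BYTES * NUM_SETS)
    return tag, set_index, offset

def tag_lookup(tag_array, addr):
    """Return the way holding the line for addr, or None on a miss."""
    tag, set_index, _ = split_address(addr)
    for way in range(NUM_WAYS):
        if tag_array[set_index][way] == tag:
            return way
    return None

# tag_array[set][way] holds a tag, or None if the entry is invalid.
tag_array = [[None] * NUM_WAYS for _ in range(NUM_SETS)]
tag_array[1][0] = 5  # the line with tag 5 is resident in set 1, way 0

hit_addr = (5 * NUM_SETS + 1) * LINE_BYTES + 8  # tag 5, set 1, offset 8
miss_addr = (6 * NUM_SETS + 1) * LINE_BYTES     # tag 6, set 1, offset 0

hit_way = tag_lookup(tag_array, hit_addr)
miss_way = tag_lookup(tag_array, miss_addr)
```

A hardware cache would compare all ways of the set in parallel; the loop here only models the result of that comparison.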
If the tag array access is decoupled from the data array access, it is possible to first perform the tag lookup and comparison to determine if the desired cache line is located in the cache. If so, the tag lookup returns the set and way location of the desired cache line within the data array of the cache memory. If the corresponding data array access is delayed due to contention, the corresponding data array access can take place at a later time when the data array becomes free. This data array access uses the set and way location previously determined during the tag lookup. In this way, the tag array access does not have to be repeated for the subsequent data array access. Furthermore, the tag array access takes a fixed amount of time, which can greatly simplify pipeline design, and can thereby improve pipeline performance.
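The decoupling described above might be sketched as follows. This is a simplified model under stated assumptions, not an actual cache implementation: the class name, the `data_busy` flag standing in for data array contention, and the queue of saved locations are all hypothetical, and the sketch ignores the possibility that a line is evicted between its tag lookup and its deferred data access.

```python
from collections import deque

class DecoupledCache:
    """Toy model: the tag lookup runs once; the data access may run later."""

    def __init__(self, num_sets=4, num_ways=2):
        self.num_sets = num_sets
        self.num_ways = num_ways
        self.tags = [[None] * num_ways for _ in range(num_sets)]
        self.data = [[None] * num_ways for _ in range(num_sets)]
        self.data_busy = False   # models contention on the data array
        self.pending = deque()   # saved (set, way) of deferred data accesses
        self.tag_lookups = 0     # counts tag array accesses

    def install(self, line_addr, way, value):
        """Place a line in the cache (line_addr is a line number)."""
        s = line_addr % self.num_sets
        self.tags[s][way] = line_addr // self.num_sets
        self.data[s][way] = value

    def tag_access(self, line_addr):
        """Return (set, way) on a hit, or None on a miss."""
        self.tag_lookups += 1
        s = line_addr % self.num_sets
        tag = line_addr // self.num_sets
        for way in range(self.num_ways):
            if self.tags[s][way] == tag:
                return s, way
        return None

    def load(self, line_addr):
        """Do the tag access now; defer the data access if the array is busy."""
        loc = self.tag_access(line_addr)
        if loc is None:
            return ("miss", None)
        if self.data_busy:
            self.pending.append(loc)
            return ("deferred", None)
        s, way = loc
        return ("hit", self.data[s][way])

    def drain(self):
        """Complete deferred data accesses using saved locations only."""
        results = []
        while self.pending and not self.data_busy:
            s, way = self.pending.popleft()
            results.append(self.data[s][way])
        return results

cache = DecoupledCache()
cache.install(line_addr=9, way=1, value="line 9")
cache.data_busy = True
status, _ = cache.load(9)   # tag hit, but the data array is busy
cache.data_busy = False
values = cache.drain()      # data access completes without a second tag lookup
```

Note that `drain` never calls `tag_access`: the deferred data access relies entirely on the set and way location recorded during the original lookup.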
Unfortunately, existing caches that decouple tag and data accesses do not support out-of-order data returns from cache misses during load operations. It is a complicated matter to support out-of-order returns because a cache line that returns during a cache miss must somehow be matched with the cache access that caused the miss and with all other subsequent accesses to the same cache line, and this matching must take place in an efficient manner.
What is needed is a method and an apparatus for decoupling a tag access from a corresponding data access within a cache memory in a manner that efficiently supports out-of-order returns of cache lines during cache miss operations.