FIG. 1 illustrates, in block diagram form, a typical prior art multi-processor system 30. System 30 includes a number of processors, 32a, 32b, 32c, coupled via a shared bus 35 to memory 36. Processors 32 execute program instructions out-of-order (OOO). Each processor 32 has its own non-blocking cache 34.
Each cache 34 is N-way set associative. In other words, each cache index defines a set of N cache entries, also referred to as N ways. Each way of a cache index includes data and a tag that identifies the memory address with which the data is associated. Additionally, MOSI bits are associated with each item of data in a cache to maintain cache coherency by indicating the MOSI state of the data entry. According to the MOSI protocol, each cache data entry can be in one of four states: M, O, S, or I. The modified state, M, indicates valid data that has been modified since it was read into the cache and that no other processor has a copy of the data. The owned state, O, indicates that the data associated with a cache index is valid, has been modified from the version in memory, and is owned by a particular cache, and that another cache may have a shared copy of the data. The processor with a requested line in the O state responds with data upon request from other processors. The shared state, S, indicates that the data associated with a cache index is valid and that one or more other processors share a copy of the data. The invalid state, I, indicates invalid data.
MOSI states help determine whether a cache access request is a miss or a hit. A cache hit occurs when one of the ways of a cache index includes a tag matching that of the requested address and the MOSI state for that way is not I. A cache miss occurs when none of the tags of an index set matches that of the requested address or when the way with a matching tag contains invalid data. Within system 30, at the time a miss is detected, a determination is made whether a write back is required. This determination is based upon MOSI state. A write back is necessary when a request misses in the cache and the way assigned for the cache fill contains modified data; that is, the way is in the M or O state.
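The hit, miss, and write back rules just described can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the logic of system 30: the names `Mosi`, `Way`, `lookup`, and `needs_write_back` are hypothetical, and a cache index is represented simply as a list of ways.

```python
from dataclasses import dataclass
from enum import Enum


class Mosi(Enum):
    """The four MOSI states described in the text."""
    M = "modified"
    O = "owned"
    S = "shared"
    I = "invalid"


@dataclass
class Way:
    """One way of a cache index (hypothetical model): a tag plus MOSI bits."""
    tag: int
    state: Mosi


def lookup(ways, tag):
    """Return the number of the hitting way, or None on a miss.

    A hit requires a matching tag AND a MOSI state other than I.
    """
    for number, way in enumerate(ways):
        if way.tag == tag and way.state is not Mosi.I:
            return number
    return None


def needs_write_back(victim):
    """A way chosen for a fill must be written back if it is in M or O."""
    return victim.state in (Mosi.M, Mosi.O)


# One index of a 4-way cache, with made-up tags.
index_a = [Way(0x10, Mosi.M), Way(0x20, Mosi.I),
           Way(0x30, Mosi.S), Way(0x40, Mosi.O)]
```

In this sketch, a request for tag 0x20 misses even though the tag matches, because the matching way is in the I state; choosing the M or O way as the victim for a fill would require a write back, while choosing the S or I way would not.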
FIG. 2 illustrates how MOSI states transition in response to various types of misses. In system 30, at the time the need for a write back is recognized, the line is invalidated by changing the way's MOSI bits to the I state. The MOSI state of the way will again be changed from I to another state upon completion of the fill that precipitated the write back.
This policy of changing the MOSI bits to I at the time of the write back determination can lead to data loss when more than N outstanding store misses are permitted in a non-blocking, N-way set associative cache, as is the case in system 30. Table I of FIG. 3 illustrates how these two factors can lead to data loss by overwriting modified data without first writing it back. Table I shows how the MOSI bits and data of one index change in response to a series of store misses for that index. At time τ₀, all four ways of index A store modified data. At time τ₁, when the first store, St1 A, misses, Way0 is assigned for the required fill. The MOSI state of Way0 indicates that a write back is necessary. In anticipation of the yet-to-be-completed write back, the MOSI state of Way0 is changed from M to I. Similar events occur at τ₂ for Way1, at τ₃ for Way2, and at τ₄ for Way3. When the (N+1)th store miss occurs at τ₅, Way0 is again assigned for the necessary fill. Because the fill associated with St1 A has not yet completed, the MOSI state of Way0 is still I, indicating that a write back of the data in Way0 is not necessary. Subsequently, at τ₆, the fill associated with St1 A is completed, writing data item D1 into Way0 and changing its MOSI state to M. At τ₇, the fill associated with St5 A is completed, writing data item D5 over D1. Data item D1 has been overwritten without a write back, even though its MOSI state was M. This data loss occurred because the determination of whether a write back was necessary was made while a previous fill for the same way was still pending.
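The sequence of Table I can be replayed with a small simulation. The sketch below is a simplified model written for illustration, not the circuitry of system 30. It assumes that ways are assigned round-robin (matching the Way0, Way1, Way2, Way3, Way0 order in the text), that a write back captures the old data at the moment the miss is detected, and that the initial data items are named `old0` through `old3`; the function name `run_table_i` is likewise hypothetical.

```python
class Way:
    """One way of index A: MOSI state plus the data item it holds."""
    def __init__(self, data):
        self.state = "M"      # at time tau_0 every way holds modified data
        self.data = data


def run_table_i(n_ways=4):
    ways = [Way(f"old{i}") for i in range(n_ways)]
    written_back = []         # data items that were written back
    lost = []                 # modified data overwritten with no write back
    pending = {}              # store name -> way assigned for its fill

    def store_miss(name, way_number):
        way = ways[way_number]
        # Write back decision is made NOW, at miss detection.
        if way.state in ("M", "O"):
            written_back.append(way.data)
            way.state = "I"   # invalidated in anticipation of the write back
        pending[name] = way_number

    def fill_complete(name, data):
        way = ways[pending.pop(name)]
        # No second write back check is made at fill time.
        if way.state in ("M", "O"):
            lost.append(way.data)
        way.data = data
        way.state = "M"

    # tau_1 .. tau_5: N+1 store misses to the same index.
    for k in range(n_ways + 1):
        store_miss(f"St{k + 1}", k % n_ways)

    fill_complete("St1", "D1")                          # tau_6
    fill_complete(f"St{n_ways + 1}", f"D{n_ways + 1}")  # tau_7
    return ways, written_back, lost


ways, written_back, lost = run_table_i()
print("written back:", written_back)
print("lost:", lost)
```

With four ways, the four original data items are written back, but the fifth miss finds Way0 already in the I state and schedules no write back, so D1 ends up in the `lost` list when D5 is filled over it.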
One possible way to avoid overwriting modified data without a write back is to stall selection of store requests when there are N outstanding store misses. This eliminates the possibility that two fills can be pending at the same time for a single way. To illustrate the effect of this stall policy, consider the situation immediately after start-up, when the MOSI bits for each cache entry represent the I state. In this situation the first N store requests will result in N outstanding cache misses. System 30 will respond to the Nth cache access request by stalling, without regard to the cache index associated with each cache access request. If all N store requests are to the same index, then the stall is necessary to prevent data loss. However, if just one of the N pending cache accesses is for a different cache index, then the stall is unnecessary because there is no danger of data loss. As used herein, a stall is unnecessary in a non-blocking, N-way set associative cache when there are fewer than N outstanding misses for any one cache index. While the performance penalty per unnecessary stall is small, unnecessary stalls occur so frequently that their overall cost is undesirable.
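The difference between the index-blind stall policy and the definition of a necessary stall can be made concrete with a short sketch. It is illustrative only: the function names `index_blind_policy_stalls` and `stall_is_necessary` are hypothetical, and outstanding misses are modeled as a plain list of cache index labels.

```python
from collections import Counter


def index_blind_policy_stalls(outstanding_indices, n_ways):
    """Stall policy described in the text: stall once N store misses are
    outstanding, regardless of which cache indices they target."""
    return len(outstanding_indices) >= n_ways


def stall_is_necessary(outstanding_indices, n_ways):
    """Per the definition in the text, a stall is necessary only if some
    single cache index has N outstanding misses."""
    counts = Counter(outstanding_indices)
    return any(count >= n_ways for count in counts.values())


N = 4
same_index = ["A", "A", "A", "A"]    # all misses target index A
mixed_index = ["A", "A", "A", "B"]   # one miss targets a different index

for misses in (same_index, mixed_index):
    stalls = index_blind_policy_stalls(misses, N)
    needed = stall_is_necessary(misses, N)
    print(misses, "stalls:", stalls, "necessary:", needed)
```

For the mixed case the index-blind policy still stalls, although no index has N outstanding misses; this is the unnecessary stall that the passage describes.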
Thus, a need exists for a cache controller for a non-blocking, N-way set associative cache using a write-invalidate cache-coherency protocol that avoids overwriting cache data in the M or O states without first performing a write back.
A need also exists for a cache controller that reduces unnecessary cache stalls while preventing data loss possible when write back decisions are made at the time of miss detection.
A need also exists for a cache controller that accounts for the cache indices associated with outstanding cache misses when determining whether to stall selection of cache access requests, so that only necessary stalls are initiated.
A further need exists for a cache controller that reduces the duration of necessary stalls.