1. Technical Field of the Invention
This invention generally relates to caches for computer systems, such as set associative caches and direct-mapped caches, and more particularly to reducing snoop busy time.
2. Background Art
The use of caches to improve performance in computing systems is well known and widespread. See, for example, U.S. Pat. No. 5,418,922 by L. Liu for "History Table for Set Prediction for Accessing a Set Associative Cache", and U.S. Pat. No. 5,392,410 by L. Liu for "History Table for Prediction of Virtual Address Translation for Cache Access", the teachings of both of which are incorporated herein by reference.
A cache is a high speed buffer which holds recently used memory data. Because programs exhibit locality of reference, most data accesses can be satisfied from the cache, in which case slower accesses to bulk memory are avoided.
In typical high performance processor designs, the cache access path forms a critical path. That is, the cycle time of the processor is affected by how fast cache accessing can be carried out.
A typical shared memory multiprocessor system implements a coherency mechanism for its memory subsystem. This memory subsystem contains one or more levels of cache memory associated with a local processor. These processor/cache subsystems share a bus connection to main memory. A snooping protocol is adopted where certain accesses to memory require that processor caches in the system be searched for the most recent (modified) version of requested data. It is important to optimize this protocol such that interference as seen by local processors is minimized when snooping occurs. It is also important to move data out of the cache as quickly as possible when a memory access is waiting for cache data resulting from a snoop.
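The snooping behavior described above can be illustrated with a minimal sketch. It is a simplified toy model, not the mechanism of any particular system: the names Cache, snoop and read_with_snoop are hypothetical, each cache is modeled as a dictionary, and details such as bus arbitration and excluding the requester's own cache are ignored.

```python
class Cache:
    """Toy per-processor cache: maps a line address to (data, modified flag)."""

    def __init__(self):
        self.lines = {}

    def write(self, addr, data):
        # A local write leaves the line in the modified state.
        self.lines[addr] = (data, True)

    def snoop(self, addr):
        """Return the modified data for addr, or None if absent or clean."""
        entry = self.lines.get(addr)
        if entry is not None and entry[1]:
            return entry[0]
        return None


def read_with_snoop(addr, caches, memory):
    """Search every cache for a modified copy before falling back to memory."""
    for cache in caches:
        data = cache.snoop(addr)
        if data is not None:
            return data  # most recent (modified) version found in a cache
    return memory[addr]
```

In this sketch, a read of an address that another processor has written returns the cached, modified value rather than the stale copy held in main memory.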
In accordance with an exemplary system, a two level cache subsystem is implemented in which the level 2 (L2) cache line size is some power of 2 larger than the level 1 (L1) cache line size. Both caches implement writeback policies, and L1 is set-associative. L1 cache lines are subdivided into sublines, which track which portions of the cache line contain modified data. The cache subsystem implements multi-level inclusion, wherein all blocks resident in L1 must also be resident in L2. Snoop requests from the bus are received at L2 and, if appropriate, the request is also forwarded on to L1. The snoop request forwarded to L1, however, requires accessing the L1 directory for all of the consecutive L1 cache entries which may contain data associated with the L2 cache line. Each directory access is sent to the L1 cache subsystem as an individual request. Each cache read access resulting from a directory access waits for cache directory information which indicates slot hit and subline offset. Slot hit information can be used in parallel with the cache access, but the subline offset is used to generate the address in the cycle before the cache read.
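The relationship between one forwarded L2 snoop and the resulting L1 directory accesses can be sketched as follows. The function name l1_lines_for_l2_line and the line sizes used in the usage note are assumptions chosen for illustration; the source does not state actual line sizes.

```python
def l1_lines_for_l2_line(addr, l1_line_size, l2_line_size):
    """Return the base addresses of the consecutive L1 lines that may hold
    data belonging to the L2 line containing addr.

    Each returned address corresponds to one individual L1 directory access.
    """
    if l2_line_size % l1_line_size != 0:
        raise ValueError("L2 line size must be a multiple of L1 line size")
    ratio = l2_line_size // l1_line_size
    if ratio & (ratio - 1):
        raise ValueError("L2/L1 line size ratio must be a power of 2")
    l2_base = addr - (addr % l2_line_size)  # align down to the L2 line
    return [l2_base + i * l1_line_size for i in range(ratio)]
```

With hypothetical sizes of 64 bytes for L1 and 128 bytes for L2, a snoop to address 0x1234 yields two L1 line addresses, 0x1200 and 0x1240, and hence two L1 directory accesses.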
Referring to FIG. 7, an example is given where a single forwarded L2 snoop request requires two L1 directory accesses. Two data transfers out of L1 are required for each directory access because both L1 lines have modified data in both of their sublines. This example demonstrates two problems with the design of this exemplary system.
(1) The processor associated with the L1 cache being snooped is prevented from accessing the L1 cache subsystem when either the L1 directory or cache is being used by a snoop operation. This is illustrated by the processor pipe being held (cache busy) during cycles 1 through 9. Use of these resources occurs in different cycles, which extends the overall busy time for the snoop operation.
(2) A delay exists between the transfer of the first and second cache blocks, which in turn delays the memory access associated with the snoop.
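The amount of work in the FIG. 7 example can be tallied with a short sketch. It counts only directory accesses and data transfers; it does not model the cycle-by-cycle timing of FIG. 7. The function name snoop_work and the list-of-booleans representation of subline modified bits are assumptions made for illustration.

```python
def snoop_work(modified_sublines_per_line):
    """Count the L1 work caused by one forwarded L2 snoop.

    modified_sublines_per_line holds one inner list per L1 line covered by
    the L2 line; each inner list has one boolean per subline, True when
    that subline contains modified data.
    """
    # One directory access is issued per L1 line, as an individual request.
    directory_accesses = len(modified_sublines_per_line)
    # One data transfer out of L1 is needed per modified subline.
    data_transfers = sum(
        sum(1 for modified in line if modified)
        for line in modified_sublines_per_line
    )
    return directory_accesses, data_transfers
```

For the case described above, two L1 lines each with both sublines modified, snoop_work([[True, True], [True, True]]) returns (2, 4): two directory accesses and four data transfers.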
It is, therefore, an object of the invention to reduce the number of cycles required for an L1 snoop operation.
It is a further object of the invention to avoid delays between the transfer of first and second cache blocks, which delay the memory access associated with snoops.