1. Field of the Invention
The present invention relates to multiprocessor computer systems, and more particularly, to an improved snarfing cache.
2. Description of the Prior Art
A multiprocessor computer system includes a main memory, an input/output (I/O) interface, and several processors (CPUs), each CPU including a cache. A shared interconnect couples the CPUs, main memory, and the I/O interface. Shared memory multiprocessor systems require a mechanism for maintaining cache coherency, i.e., all valid copies of a block of data stored in the various caches of the multiprocessor system must have identical values. Without cache coherency, data consistency problems can arise. For example, if two CPUs locally cache the same data block, and one CPU updates the data block without informing the second CPU, then an inconsistent state exists. If the second CPU performs an operation involving the out-of-date (invalid) data in its local cache, an incorrect result may be obtained.
A snooping cache is a cache coherency mechanism used to avoid inconsistency in the data stored in the various caches of the multiprocessor system. The snooping cache includes a monitor for monitoring the transactions that occur on the shared interconnect. When a first CPU places a modified data block on the shared interconnect, for example during a write back to main memory, the other caches in the system monitor the transaction, and either invalidate or update any local copies of the data block, depending on the type of snooping cache. As a result, cache coherency is maintained. The problems associated with invalidate and update type snooping caches are discussed below.
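The two snooping behaviors described above can be illustrated with a minimal sketch. It is not taken from any actual implementation: the names `Policy`, `SnoopingCache`, `load`, and `snoop` are hypothetical, and each cache line is modeled simply as a value paired with a valid flag.

```python
from enum import Enum


class Policy(Enum):
    INVALIDATE = "invalidate"
    UPDATE = "update"


class SnoopingCache:
    """Hypothetical model of a cache that monitors the shared interconnect."""

    def __init__(self, policy):
        self.policy = policy
        self.lines = {}  # block name -> (value, valid flag)

    def load(self, block, value):
        """Store a valid local copy of a block."""
        self.lines[block] = (value, True)

    def snoop(self, block, value):
        """React to a modified block observed on the shared interconnect."""
        if block not in self.lines:
            return  # no local copy, so there is nothing to keep coherent
        if self.policy is Policy.INVALIDATE:
            old_value, _ = self.lines[block]
            self.lines[block] = (old_value, False)  # mark the copy invalid
        else:
            self.lines[block] = (value, True)       # refresh the copy in place


# Two caches hold block A; another CPU then places a modified A on the interconnect.
inv = SnoopingCache(Policy.INVALIDATE)
upd = SnoopingCache(Policy.UPDATE)
inv.load("A", 1)
upd.load("A", 1)
for cache in (inv, upd):
    cache.snoop("A", 2)
```

In this sketch the invalidate type cache keeps the stale value but marks it invalid, while the update type cache ends up holding the new value as a valid copy.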
With invalidate type snooping caches, CPU latency is a problem. Latency is the time a CPU remains idle while requested data is retrieved. Consider a first processor (CPU 1) and a second processor (CPU 2), both with a valid copy of block A. If CPU 2 modifies block A, a broadcast is sent out on the shared interconnect and CPU 1 invalidates its copy of block A. If, subsequently, CPU 1 needs a valid copy of block A, CPU 1 must first request one by initiating a transaction on the shared interconnect. Either another CPU with a valid copy of block A, or main memory, responds by sending a copy of block A to CPU 1. During the request and response period, CPU 1 remains idle, reducing the processing throughput of the multiprocessor system.
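The CPU 1 and CPU 2 sequence above can be traced with a small simulation that counts request and response transactions. This is a sketch under simplifying assumptions: the `Interconnect` and `InvalidateCache` classes are hypothetical, writes are propagated to main memory immediately, and main memory always supplies the response (the case where another CPU responds is not modeled).

```python
class Interconnect:
    """Hypothetical shared interconnect that counts request/response transactions."""

    def __init__(self, memory):
        self.memory = memory   # block name -> value held in main memory
        self.caches = []
        self.requests = 0

    def write_broadcast(self, sender, block, value):
        """Propagate a write and let every other cache invalidate its copy."""
        self.memory[block] = value  # assumption: memory is updated at once
        for cache in self.caches:
            if cache is not sender:
                cache.invalidate(block)

    def request(self, block):
        """A request/response transaction; the requesting CPU idles meanwhile."""
        self.requests += 1
        return self.memory[block]


class InvalidateCache:
    """Hypothetical invalidate type snooping cache."""

    def __init__(self, bus):
        self.bus = bus
        self.lines = {}  # block name -> (value, valid flag)
        bus.caches.append(self)

    def invalidate(self, block):
        if block in self.lines:
            value, _ = self.lines[block]
            self.lines[block] = (value, False)

    def write(self, block, value):
        self.lines[block] = (value, True)
        self.bus.write_broadcast(self, block, value)

    def read(self, block):
        value, valid = self.lines.get(block, (None, False))
        if not valid:
            value = self.bus.request(block)  # miss: fetch over the interconnect
            self.lines[block] = (value, True)
        return value


bus = Interconnect({"A": 0})
cpu1, cpu2 = InvalidateCache(bus), InvalidateCache(bus)
cpu1.read("A")
cpu2.read("A")            # both CPUs now hold a valid copy of block A
cpu2.write("A", 42)       # CPU 2 modifies A; CPU 1's copy is invalidated
before = bus.requests
result = cpu1.read("A")   # CPU 1 must issue a request before it can proceed
extra_requests = bus.requests - before
```

Here `extra_requests` counts the interconnect transactions CPU 1 incurs only because its copy was invalidated.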
With the update type of snooping cache, CPU latency is partially reduced. Whenever one CPU places a valid copy of block A on the shared interconnect, the other CPUs with an invalid copy of block A may "snarf" the valid copy off the shared interconnect and update their caches. For example, if CPU 1 has an invalid copy of block A, and CPU 2 places a valid copy of block A on the shared interconnect before CPU 1 requests the data, CPU 1 will snarf the data. Subsequently, when CPU 1 needs block A, the request and response transaction described above is not required. As a result, CPU 1 does not sit idle waiting for block A, and traffic on the shared interconnect is reduced.
The prior art update snarfing mechanism, although beneficial in the above example, has its limitations. Snarfing occurs only when a CPU has an invalid copy of a particular data block appearing on the shared interconnect. If CPU 1 updates block A and then places it on the shared interconnect, and if CPU 2 does not have a copy of block A, then block A is not snarfed by CPU 2. When CPU 2 later requests a copy of block A, the above described request and response transaction on the shared interconnect is required. Thus, both CPU latency and traffic on the shared interconnect increase.
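The prior art snarfing condition and its limitation can be sketched as follows. The `SnarfingCache` class and its `observe` and `needs_request` methods are hypothetical names chosen for illustration; the only rule modeled is the one stated above, namely that a block is snarfed only over an existing invalid copy.

```python
class SnarfingCache:
    """Hypothetical model of the prior art rule: snarf only over an invalid copy."""

    def __init__(self):
        self.lines = {}  # block name -> (value, valid flag)

    def observe(self, block, value):
        """Called when a valid copy of a block appears on the shared interconnect.

        Returns True if the block was snarfed into this cache.
        """
        if block in self.lines and not self.lines[block][1]:
            self.lines[block] = (value, True)  # replace the invalid copy
            return True
        return False  # valid copy or no copy at all: the block is ignored

    def needs_request(self, block):
        """True if reading the block would need a request/response transaction."""
        return block not in self.lines or not self.lines[block][1]


# One CPU holds an invalid copy of block A; the other holds no copy at all.
cpu_with_invalid_copy = SnarfingCache()
cpu_with_invalid_copy.lines["A"] = (1, False)
cpu_without_copy = SnarfingCache()

# A valid copy of block A (value 2) appears on the shared interconnect.
snarfed_invalid = cpu_with_invalid_copy.observe("A", 2)
snarfed_absent = cpu_without_copy.observe("A", 2)
```

Under these assumptions the CPU with the invalid copy can later read block A locally, while the CPU with no copy still has to issue a request on the shared interconnect.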
The prior art update type snooping cache is less than ideal in the multithreaded multiprocessing environment. Threading is the dividing of a single process into a plurality of related subprocesses or threads. Each thread is executed on one CPU in the multiprocessor system. The threads are interleaved because they are derived from the same process and, therefore, share instructions, files, data and other information. The situation outlined above is a common occurrence in threaded computer environments. Often, a first CPU executing a first thread will place data on the shared interconnect. If another CPU, executing a related thread, has no copy of that block, then snarfing does not take place.