Most high performance CPs (i.e. CPUs) have a private high speed hardware managed buffer memory (i.e. cache) for receiving fetched lines from MS to improve the average MS access time for each CP. The cache is usually transparent to a program executing on any CP in the system.
In MP configurations using private store-in-caches (SICs), each CP has the problem of obtaining the most recently updated version of a MS line, because each CP can change a line in its private cache without correspondingly changing the version of the same line in MS. CPs with private store-through (ST) caches do not have this problem because each update of a line in a ST cache is always correspondingly done in MS. The disadvantage of a ST cache is that all stores (which usually average between ten and twenty percent of all CP requests) are always sent to MS and therefore require substantial MS bandwidth to avoid significant performance degradation. Consequently, the MP level (i.e. the number of CPs sharing MS) is generally very limited by the ST cache, unless a relatively costly high bandwidth MS is used.
The SIC is used in systems where there is insufficient MS bandwidth to make "storing through" a viable solution. SIC caches are described in U.S. Pat. Nos. 3,735,360 and 3,771,137 and in application Ser. No. 205,500 entitled "Improved Cache Line Shareability Control For A Multiprocessor" by F. O. Flusche et al, which are all assigned to the assignee of the present application.
Thus, a ST cache handles stores differently from fetches, that is, a store miss cannot occur because all store requests go to MS, independently of whether the addressed line (target line) is in the cache. Conversely, a SIC cache treats stores and fetches the same; and the line must be in the cache before performing a store or fetch. If the target line is not in the cache (i.e. cache miss), the line is transferred from MS to the cache before performing the fetch or store. Because all subsequent stores to a line take place in the cache (and therefore do not go to MS), the SIC cache substantially reduces the MS bandwidth needed, compared to a ST cache design.
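The difference in MS traffic between the two designs can be illustrated with a minimal sketch. The function name `ms_accesses`, the request format, and the unbounded cache (so that no replacement cast-outs occur) are assumptions made for illustration only and are not taken from the cited patents.

```python
def ms_accesses(requests, policy):
    """Count MS accesses caused by a sequence of (op, line) requests.

    policy is "ST" (store-through) or "SIC" (store-in-cache).
    Assumption: the cache is unbounded, so lines are never replaced
    and no cast-outs to MS are modeled.
    """
    cached = set()
    count = 0
    for op, line in requests:
        if op == "store" and policy == "ST":
            # ST cache: every store goes to MS whether or not the
            # target line is in the cache, so a store miss cannot occur.
            count += 1
            continue
        if line not in cached:
            # Fetch miss (either design) or store miss (SIC only):
            # the line is transferred from MS to the cache first.
            cached.add(line)
            count += 1
        # Otherwise the request is satisfied in the cache alone.
    return count


requests = [
    ("fetch", 0xA),
    ("store", 0xA),
    ("store", 0xA),
    ("store", 0xB),
    ("store", 0xB),
]
st_count = ms_accesses(requests, "ST")
sic_count = ms_accesses(requests, "SIC")
```

For this short request sequence the ST design touches MS on every request, while the SIC design touches MS only on the first reference to each line.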
A problem with a SIC cache in a multiprocessing system is that the most current data is often in the caches and not in MS. Consequently, to ensure that each CP fetches the most current data whenever a CP generates a fetch or store request to its private cache and the target line is not in the cache (i.e. line miss), all CP cache directories must be cross-interrogated to determine if the missed line is present in any other cache (i.e. a remote cache); and, if so, whether the remote copy of the line has been changed (i.e. stored into). If the line is not in a remote cache, the line is fetched from MS to the requesting CP's cache. If the line is found in a remote cache but is not changed, the line is fetched from MS to the requesting CP's cache, and the line is invalidated in the remote cache by setting its valid flag to zero. If the line is found in a remote cache and also is changed, the updated line must first be cast out of the remote cache to MS before invalidating it in the remote cache. After the line is transferred to MS, the requesting CP fetches the line to its cache and then performs the CP store or fetch request.
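The miss-handling sequence just described can be sketched as follows. The names `Line`, `CP`, and `handle_miss`, and the use of a dictionary to stand in for MS, are hypothetical and chosen only for illustration; the sketch follows the three cases literally and does not model the directory hardware.

```python
from dataclasses import dataclass, field


@dataclass
class Line:
    data: int
    valid: bool = True      # valid flag; zero (False) means invalidated
    changed: bool = False   # set when the line has been stored into


@dataclass
class CP:
    cache: dict = field(default_factory=dict)  # line address -> Line


def handle_miss(requester, all_cps, ms, addr):
    """Resolve a line miss in the requester's cache by cross-interrogation.

    ms is a dict (address -> data) standing in for main storage.
    Returns the list of actions taken, in order.
    """
    events = []
    for cp in all_cps:
        if cp is requester:
            continue
        remote = cp.cache.get(addr)
        if remote is None or not remote.valid:
            continue                    # line not present in this remote cache
        if remote.changed:
            ms[addr] = remote.data      # cast out the updated line to MS first
            events.append("cast-out")
        remote.valid = False            # invalidate the remote copy
        events.append("invalidate")
    # In every case the requester then fetches the line from MS.
    requester.cache[addr] = Line(data=ms[addr])
    events.append("fetch-from-MS")
    return events


ms = {0x10: 1}
cp0, cp1 = CP(), CP()
cp1.cache[0x10] = Line(data=7, changed=True)   # CP1 holds a changed copy
events = handle_miss(cp0, [cp0, cp1], ms, 0x10)
```

In this usage, CP1 holds a changed copy of the line, so the miss in CP0 forces a cast-out to MS before CP0 can fetch the current data.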
Copies of the same line which are not changed may be found in multiple CP caches in the MP and be concurrently fetched from, as long as none is stored into.
This movement of a changed line (that takes place on a cross-interrogate hit) entails substantial overhead because the remote CP must send the line to MS or directly to the requesting CP's cache. Thus, the plural CPs encounter interference and lost time. Even worse, many times the remote CP wants the updated line back shortly after giving it up, and the line ping-pongs between CPs.
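The ping-pong effect can be made concrete with a small counting sketch. The function `count_line_transfers` and the CP labels are hypothetical; the sketch assumes a single line that every listed CP stores into, and it counts only how often the changed copy must move, not the cost of each move.

```python
def count_line_transfers(store_sequence):
    """Count how often a changed line must move between CPs.

    store_sequence lists the CP identifier issuing each store, all to
    the same line. A transfer is needed whenever the storing CP is not
    the CP currently holding the changed copy.
    """
    holder = None
    transfers = 0
    for cp in store_sequence:
        if holder is not None and cp != holder:
            transfers += 1   # cross-interrogate hit on a changed line
        holder = cp
    return transfers


# Two CPs alternating stores to one line versus storing in batches.
alternating = count_line_transfers(["CP0", "CP1"] * 4)
batched = count_line_transfers(["CP0"] * 4 + ["CP1"] * 4)
```

With the same eight stores, the alternating pattern moves the line seven times while the batched pattern moves it once, which is the overhead the ping-pong case incurs.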