This invention relates to computers and more specifically to intercomputer interfaces wherein caching schemes are employed. In particular the invention relates to cache coherence protocols.
Cache coherence protocols are unidirectional transfer protocols used to achieve the performance advantages of fast local cache memories while maintaining a flat shared-memory model for a plurality of computer processors operating simultaneously. In a cache coherence protocol, system processes must be controlled such that multiple valid copies of the same data never become different, that is, never become incoherent, without communicating the change directly or indirectly to other processors which have a need to access the data.
The time required to transfer data on high-speed computer interfaces, such as multiple-bit buses, is limited by distance, propagation delays, flux (bandwidth), noise and signal distortion. Both asynchronous and synchronous interfaces have distinctive limitations. The speeds of asynchronous buses are limited, for example, by the propagation delay associated with the handshaking required for each self-contained unit of data. The speeds of synchronous buses are limited by time differences between clock signals and data signals originating at different sources. A traditional bus becomes a bottleneck in a multiprocessor system when many processors need more cumulative bandwidth than is available on the bus.
One approach addressing the fundamental limitations on data flow is the use of packetized unidirectional signalling schemes and distributed cache protocols. These schemes help to reduce bus traffic while providing the high speed processors rapid access to much of the memory space. There is nevertheless an overhead penalty in terms of complexity associated with the controlling protocols.
In some cache coherency schemes there is support for virtual-physical caches, such as those commonly employed in Hewlett-Packard Precision Architecture (HPPA) machines (HP/9000 Series 800 and HP/3000 Series 900). However, such support frequently requires that all caches be homogeneous, that is, they must be of the same depth and have the same indexing mechanism. Certain limited nonhomogeneous caching schemes can be accommodated by prior planning. The obvious way to handle generalized nonhomogeneous caches is to pass the entire virtual address as a "cache hint" or "virtual hint." A virtual hint is some information about the virtual address which can help in finding the tag. However, passing the entire virtual address is costly in terms of time and bandwidth utilization.
What is needed is a method for passing a cache hint, without having to pass the entire virtual address, in which no master on the bus needs to know how other resources on the bus index their caches.
Cache consistency control is known. A recent patent, U.S. Pat. No. 4,713,755, issued Dec. 15, 1987, to Worley et al., describes one approach for maintaining memory integrity in a system having hierarchical memory by use of explicit software control of caches. The technique therein employs status flags for each block of stored information indicative of valid data and contaminated data. The status flags are employed by the operating system of an individual processor to initiate corrective action to cure the contaminated data.
An IEEE study group project has been proposed by representatives of the computer industry to develop a scalable coherent interface standard for high performance multiprocessor environments. (Gustavson et al., "The Scalable Coherent Interface Project (SuperBus)," Rev. 13, No. 1, Aug. 22, 1988 (IEEE direct circulation, David Gustavson, SLAC Bin 88, P.O. Box 4349, Stanford, Calif. 94309).) The approach being proposed is a directory-based coherence mechanism whereby cache controllers and memory cooperate. One of the features of the proposed scheme is the use of a doubly-linked list scheme to support a directory-based cache coherency mechanism. If virtually-indexed caches are employed in the above scheme, it is necessary to have a mechanism to know where the index is to find the tag. However, with such a tag, a check can be made of the bus to determine if there is a coherency hit, that is, a potential conflict with another requestor of data.
In the environment of interest, the bus shared by the processors carries the physical address of the tag and sometimes "virtual hints". Virtual hints have the limitation that they tend to be processor specific.
As the present invention is based on use of a doubly-linked list directory cache coherency scheme, it is believed helpful to provide here a brief description of the operation of such a scheme.
Directory-based cache coherency techniques involve the use of tagged memory, that is, the use of memory tags which indicate which masters contain copies of the data of interest. A master may be any contiguous memory unit, processor cache or bus converter having a need for data. The tags typically are storable values containing information about the status of data. Such information may include whether the data is owned by a remote master as shared (nonexclusive) or private (exclusive) data. A mechanism is also needed to indicate which masters have a copy of the data of interest. One technique, based on the singly-linked list, is to store the master's address of the first such master to have a copy. (The master's address is an address to which a master will respond.) The singly-linked list is distributed among all masters so that each master has one link per valid cache coherence block in the master's cache. A doubly-linked list is formed by adding another link or master's address which aids in the removal of entries from the list. The list is two directional, and the two links are called the nextmaster link and the previousmaster link. The object of the use of the linked list is to prevent corruption of data stored in duplicated contiguous units of sharable information called coherence blocks. A coherence block is that unit of information specified as having to remain internally coherent. It is not necessarily the unit of transferable information.
In a first operation, called a sharable request, a master, such as a cache, requests a sharable copy of a portion of a coherence block. The main memory first checks its tags to determine if any master has a private copy of the requested coherence block. If yes, then the memory attends to removing the private copy before proceeding. If either no copy exists or only a shared copy exists, the memory saves the master's address of the requesting master as the nextmaster link and returns its old nextmaster link to the requesting master so that it can be stored as the nextmaster link there in its directory. The requesting master then also sets its previousmaster link to the value indicating memory. Either a special null master address or the memory's master address can be used to mark each end of the list. A list is thus built whereby all copies of a coherence block can be traced by "patching" of addresses forward through the nextmaster links and by "patching" of addresses backward through the previousmaster links.
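The list-building step of a sharable request can be sketched as follows. This is a minimal illustration under stated assumptions, not the protocol of any particular system: the names `Master`, `Memory`, `request_shared` and the marker `MEMORY` are hypothetical, the private-copy check is omitted, and the sketch assumes that the requester records memory as its predecessor in its previousmaster link and that the former head of the list is told about its new predecessor.

```python
MEMORY = "MEM"  # hypothetical end-of-list marker (stands in for the memory's master address)


class Master:
    """A cache-like master holding one pair of links for one coherence block."""

    def __init__(self, addr):
        self.addr = addr
        self.next_master = None
        self.previous_master = None


class Memory:
    """Main memory tag for one coherence block: only a nextmaster link."""

    def __init__(self):
        self.next_master = MEMORY  # empty list: memory points back at itself

    def sharable_request(self, requester_addr):
        # Save the requester as the new head and hand back the old head.
        old_head = self.next_master
        self.next_master = requester_addr
        return old_head


def request_shared(memory, masters, requester):
    """Insert `requester` at the head of the sharing list (assumed behavior)."""
    old_head = memory.sharable_request(requester.addr)
    requester.next_master = old_head     # link forward to the former head
    requester.previous_master = MEMORY   # memory is now the predecessor
    if old_head != MEMORY:
        # Assumption: the former head learns its new previousmaster link.
        masters[old_head].previous_master = requester.addr
```

With two masters "A" and "B" requesting in that order, the list reads memory, B, A, memory when followed through the nextmaster links, and the reverse when followed through the previousmaster links.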
In another operation, called a private request, a master requests a private copy of a coherence block. The process is the same as a sharable request, except that the memory must remove any copies then outstanding prior to proceeding.
In another operation, called a shared copy removal, a master removes a shared copy of a coherence block, which causes a transaction to be issued to inform the previous master of a new nextmaster link. Thereafter, the next master must also be informed of its new previousmaster link, and acknowledgment of completion of both operations must be received before the master can actually forget about the coherence block.
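The unlinking performed by a shared copy removal can be sketched as below. This is an illustrative sketch only: the dictionary layout, the function name `remove_shared_copy` and the marker `MEMORY` are assumptions, and the two bus transactions with their acknowledgments are modeled as plain in-memory updates.

```python
MEMORY = "MEM"  # hypothetical end-of-list marker (stands in for the memory's master address)


def remove_shared_copy(links, addr):
    """Unlink master `addr` from the sharing list stored in `links`.

    `links` maps each master address, and MEMORY, to a dict of links.
    Masters carry "next" and "prev"; MEMORY carries only "next".
    """
    prev_addr = links[addr]["prev"]
    next_addr = links[addr]["next"]

    # First transaction: the previous master (or memory, if `addr` was the
    # head of the list) receives its new nextmaster link.
    links[prev_addr]["next"] = next_addr

    # Second transaction: the next master receives its new previousmaster
    # link. Memory keeps no previousmaster link, so the tail case is skipped.
    if next_addr != MEMORY:
        links[next_addr]["prev"] = prev_addr

    # In a real system both updates would have to be acknowledged before
    # this point; only then may the master discard its own links.
    del links[addr]
```

Removing a middle entry joins its two neighbors directly, and removing the last remaining entry leaves memory pointing back at itself.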
In another operation, called a coherence block forget, the memory instructs all caches to forget about a coherence block by issuing a transaction which walks down the nextmaster links of each cache until the end of the list is found, which points back to the memory. The memory is thereby informed that no cache has retained a copy of the coherence block.
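The walk performed by a coherence block forget can be sketched as a simple traversal. The function name `forget_block`, the marker `MEMORY` and the dictionary of nextmaster links are hypothetical, and the single bus transaction that visits each cache in turn is modeled as a loop.

```python
MEMORY = "MEM"  # hypothetical end-of-list marker (stands in for the memory's master address)


def forget_block(next_links):
    """Walk the nextmaster links from memory, dropping every cached copy.

    `next_links` maps MEMORY and each master address to its nextmaster link.
    Returns the master addresses in the order they were visited.
    """
    visited = []
    current = next_links[MEMORY]
    while current != MEMORY:
        # The visited cache forgets the block and passes the walk along
        # to whatever its nextmaster link named.
        following = next_links.pop(current)
        visited.append(current)
        current = following
    # The walk has returned to memory: no cache retains a copy.
    next_links[MEMORY] = MEMORY
    return visited
```

For a list that runs memory, C, B, A, memory, the walk visits C, B and A in that order and leaves memory as the only entry, pointing at itself.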
What remains to be solved is how to inform all next and previous masters in such a manner that the current master can actually forget about a line.