1. Field of the Invention
The present invention generally relates to multi-processor (MP) data processing systems where each processor has to accommodate cross interrogate (XI) requests from other processors and, more particularly, to an efficient technique for resolving priority of requests for reservation of data buses where the storage control elements (SCEs) of requesters are separated from each other by delays of greater than one machine cycle. More specifically, the invention provides an efficient method for priority resolution where two requesters are contending to reserve one of two bidirectional data buses. The basic technique of the invention can be generalized to include a plurality of requesters attempting to reserve a plurality of data buses.
2. Description of the Prior Art
High performance, multi-processor (MP) computer systems are being developed to increase throughput by performing in parallel those operations which can run concurrently on separate processors. Such high performance, MP computer systems are characterized by multiple central processors (CPs) operating independently and in parallel, but occasionally communicating with one another or with a main storage (MS) when data needs to be exchanged. The CPs and the MS have input/output (I/O) ports which must be connected to exchange data.
In the type of MP system known as the tightly coupled multi-processor system, in which each of the CPs has its own cache, there exist coherence problems at various levels of the system. More specifically, inconsistencies can occur between adjacent levels of a memory hierarchy. The multiple caches could, for example, possess different versions of the same data because one of the CPs has modified its copy. It is therefore necessary for each processor's cache to know what has happened to lines that may be in several caches at the same time. In an MP system where there are many CPs sharing the same main storage, each CP is required to obtain the most recently updated version of data according to architecture specifications when an access is issued. This requirement necessitates constant monitoring of data consistency among caches.
A number of solutions have been proposed to the cache coherence problem. Early solutions are described by C. K. Tang in "Cache System Design in the Tightly Coupled Multiprocessor System", Proceedings of the AFIPS (1976), and L. M. Censier and P. Feautrier in "A New Solution to Coherence Problems in Multicache Systems", IEEE Transactions on Computers, December 1978, pp. 1112 to 1118. Censier et al. describe a scheme allowing shared writable data to exist in multiple caches which uses a centralized global access authorization table. However, as the authors acknowledge in their Conclusion section, they were not aware of similar approaches as described by Tang two years earlier. While Tang proposed using copy directories of caches to maintain status, Censier et al. proposed to tag each memory block with similar status bits.
These early approaches revolve around how to do bookkeeping in order to achieve cross-interrogates (XI) when needed. The idea was to record at the global directory (copies or memory tags) information about which processor caches own a copy of a line, and which one of the caches has modified its copy. The basic operation is to have the global table record the status (with a MODIFIED bit) when a processor stores into a line. Since store-in caches are used, the processor cache controller knows, from its cache directory, which lines are modified or private. A store into a non-modified line at a processor will necessitate synchronization with the storage controller and obtaining the MODIFIED status first. Therefore, a storage block cannot be exclusive, or modifiable, for a processor unless the processor has actually issued a store into it, even when the cache has the only copy of the line in the system.
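The bookkeeping described above can be illustrated with a minimal sketch. It is not taken from Tang or from Censier et al.; the class name, method names, and the behavior on a fetch of a modified line are assumptions made only for illustration.

```python
class GlobalDirectory:
    """Hypothetical global table: which caches hold a line, and which one modified it."""

    def __init__(self):
        self.holders = {}   # line address -> set of processor ids holding a copy
        self.modifier = {}  # line address -> processor id with MODIFIED status

    def fetch(self, cpu, line):
        """Record a new copy; return the processor to cross-interrogate, if any."""
        owner = self.modifier.get(line)
        xi_target = owner if owner is not None and owner != cpu else None
        if xi_target is not None:
            # Assumption: the modified copy loses MODIFIED status once it is shared.
            del self.modifier[line]
        self.holders.setdefault(line, set()).add(cpu)
        return xi_target

    def store(self, cpu, line):
        """Grant MODIFIED status; return the caches whose copies must be invalidated."""
        if self.modifier.get(line) == cpu:
            return set()  # already MODIFIED: no synchronization with the controller
        invalidate = self.holders.get(line, set()) - {cpu}
        self.holders[line] = {cpu}
        self.modifier[line] = cpu
        return invalidate
```

In this sketch, a store by a processor that does not yet hold MODIFIED status always goes through the directory, even when that processor holds the only copy, which is the restriction noted at the end of the preceding paragraph.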
The EX status in a more general sense, as described in U.S. Pat. No. 4,394,731 to Flusche et al., can allow a processor to store into the cache without talking to the storage control element (SCE), even when the line was never stored into the cache. This is a subtle difference but is rather important from a conceptual point of view, since it allows, for example, in an IBM/3081 system, acquiring EX status of a line at a processor when a subsequent store is "likely" to come.
There are various types of caches in prior art MP systems. One type of cache is the store through (ST) cache as described in U.S. Pat. No. 4,142,234 to Bean et al. for the IBM System/370 Model 3033 MP. An ST cache design does not interfere with the CP storing data directly to the main storage (or second level cache), so that changes to data are always propagated to main storage. Upon the update of a store through to main storage, appropriate cross-interrogate (XI) actions may take place to invalidate possible remote copies of the stored cache line. The storage control element (SCE) maintains proper store stacks to queue the main storage (MS) store requests, and standard communications between the buffer control element (BCE) and the SCE avoid store stack overflow conditions. When the SCE store stack becomes full, the associated BCE will hold its MS stores until the condition is cleared.
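The store stack behavior described above amounts to a bounded queue with back-pressure. The following is a minimal sketch under that assumption; the class and method names are hypothetical and do not come from the cited patent.

```python
from collections import deque


class StoreStack:
    """Hypothetical SCE store stack: a bounded FIFO of pending main storage stores."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pending = deque()

    def is_full(self):
        return len(self.pending) >= self.capacity

    def enqueue(self, store):
        """Accept a store from the BCE, or return False so the BCE holds it."""
        if self.is_full():
            return False
        self.pending.append(store)
        return True

    def drain_one(self):
        """Retire the oldest pending store to main storage, if there is one."""
        return self.pending.popleft() if self.pending else None
```

A BCE modeled this way would retry a held store only after `drain_one` has made room, which corresponds to the full condition being cleared.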
Another type of cache design is the store-in cache (SIC) as described, for example, in U.S. Pat. Nos. 3,735,360 to Anderson et al. and 4,771,137 to Warner et al. A SIC cache directory is described in detail in the aforementioned U.S. Pat. No. 4,394,731 to Flusche et al., in which each line in a store-in cache has its multi-processor shareability controlled by an exclusive/read only (EX/RO) flag bit. The main difference between ST and SIC caches is that all stores in a SIC are directed to the cache itself (which may cause a cache miss if the stored line is not in the SIC cache). It is also proposed in U.S. Pat. No. 4,503,497 that data transfers upon a miss fetch can take place through a cache-to-cache transfer (CTC) bus if a copy is in the remote cache. An SCE is used that contains copies of the directories in each cache. This permits cross-interrogate (XI) decisions to be resolved at the SCE. Usually, cache line modifications are updated to main storage only when the lines are replaced from the cache.
The connections between SCEs in multi-processor (MP) systems are implemented via cables providing bidirectional data buses. For an MP system having, for example, two SCEs, there may be two bidirectional (BIDI) data buses between the two SCEs. In prior machines, the cables between SCEs have been short enough that requests and data issued by one SCE reached the other SCE in the same machine cycle. In very large MP systems, however, the physical connection between SCEs is long, and physical packaging restrictions prevent SCEs from communicating with one another in the same machine cycle. Another contributing factor is the increased processor clock rates of newer MP systems. These higher clock rates mean that even cables that were short enough to allow requests to be communicated in one machine cycle in prior machines now produce a significant delay measured in machine cycles. Communication via the BIDI buses is accomplished by a requester SCE reserving one or the other of the buses. However, when the connection between the processors results in a delay that is longer than one machine cycle, the delay caused by the cable length causes a problem in communication between the SCEs. Specifically, there is a problem in resolving the priority of requests for the BIDI buses. For such large machines where communications cannot be accomplished in one machine cycle, what is needed is a technique for efficiently resolving priority of requests for reservation of data buses.
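The priority problem can be made concrete with a small timing sketch. It assumes, purely for illustration, that each SCE decides whether a bus is free from the requests visible to it locally, and that a request issued by one SCE becomes visible to the other only after `delay` machine cycles. The function name and parameters are hypothetical.

```python
def requests_collide_unseen(request_cycle_a, request_cycle_b, delay):
    """Return True if neither SCE has seen the other's request when it issues its own.

    request_cycle_a, request_cycle_b: cycles in which SCE A and SCE B request the same bus.
    delay: cycles for a request to travel over the cable between the SCEs.
    """
    a_sees_b = request_cycle_b + delay <= request_cycle_a
    b_sees_a = request_cycle_a + delay <= request_cycle_b
    return not a_sees_b and not b_sees_a
```

With `delay` equal to zero, at least one SCE always sees the other's request in time, so the contention can be resolved within the cycle. With a delay of several cycles there is a window of that width in which both SCEs may treat the same bus as free, and this is the case that requires a priority resolution technique.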