FIG. 1 depicts a "multiple memory" structured computer system. In such a system, memory storage is distributed between several system modules. There is always a "main memory" plus there may be various cache memory modules associated with each of the processing elements in the system. This figure shows all the system modules interconnected over a "system bus." There may be, in such a system, additional system bus structures (the Unisys A-11 system provides dual system buses denoted as System Bus A and System Bus B).
FIG. 1 also indicates that the various cache modules provide a path for the respective processor for memory storage information. The cache itself is the first possible source (first choice) for a given piece of addressed information for its processor. If the cache does hold the specific address required by the processor, then a "cache hit" occurs and it sources the necessary data immediately to the processor. However, if the addressed word(s) is not contained within the cache, then it provides the path for the information to be supplied from main memory over the system bus(es).
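The hit/miss path just described can be sketched in a few lines. This is a minimal illustration only: the class names `MainMemory` and `Cache`, the dictionary-based storage, and the policy of keeping a copy of the word after a miss are all assumptions made for the example and are not taken from FIG. 1.

```python
# Illustrative sketch only; all names and the fill-on-miss policy are assumptions.

class MainMemory:
    """Stands in for the main memory reached over the system bus."""

    def __init__(self):
        self.words = {}       # address -> data
        self.bus_reads = 0    # counts reads that had to use the system bus

    def read(self, address):
        self.bus_reads += 1
        return self.words.get(address, 0)


class Cache:
    """First-choice source of data for its processor."""

    def __init__(self, main_memory):
        self.main_memory = main_memory
        self.lines = {}       # address -> data currently held in the cache

    def read(self, address):
        """Return (data, hit) for the addressed word."""
        if address in self.lines:
            return self.lines[address], True      # cache hit: data sourced locally
        data = self.main_memory.read(address)     # cache miss: fetch over the bus
        self.lines[address] = data                # assumed: keep a copy for next time
        return data, False


memory = MainMemory()
memory.words[0x100] = 42
cache = Cache(memory)
first = cache.read(0x100)     # miss, supplied from main memory
second = cache.read(0x100)    # hit, supplied by the cache
```

In this sketch only the first read uses the system bus; the second is serviced from the cache alone.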
In a computer system having more than one memory storage facility, a special data integrity challenge can occur. Any computer system having both a main-memory structure and one or more cache memories is in such a situation (e.g., see multi-memory System I in FIG. 1, having Main Memory linked by system bus means to a number of Processors 1, 2 . . . N, each with an associated Cache Memory). The potential problem that must be faced is that of maintaining "data coherency," i.e., assuring that all copies of information (data) for a given specific address are identical in all memory facilities. This is a "data integrity" problem.
If computer processors only read data from the memories, this problem would never occur; but, then, little productive work would take place either. Real processors must also modify the values of data that is stored in the system's memories. This is usually referred to as writing memory, or as doing an "overwrite" operation. Some systems provide for several types of write operations. No matter what varieties of write operations are used, the central problem of data coherency arises any time data in one of several memory facilities has been modified.
Most computer systems with cache memories maintain data integrity using an "Invalidation" scheme. This technique requires each cache-memory module to monitor (or spy upon) the memory operations of all other processors in the system. When any memory "data write" operation is detected, each of the cache memories must plan for a potential invalidation operation within its own cache memory. An invalidation operation entails checking the contents of its own cache memory for the specific address of the write operation that was detected. If a local cache-memory module discovers that it does indeed contain this address, it must then mark this address as no longer valid, since some other processor is overwriting its previous value.
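The invalidation scheme above can be illustrated with a short sketch. The names `SnoopingCache`, `SystemBus`, and `snoop_write` are hypothetical, and the choice to let the writing processor's own cache keep the new value is an assumption of the example, not a statement about any particular system.

```python
# Illustrative sketch only; names and the writer-keeps-its-copy policy are assumptions.

class SnoopingCache:
    """A cache that spies upon write operations made by other processors."""

    def __init__(self, name):
        self.name = name
        self.valid = {}   # address -> data for entries currently marked valid

    def snoop_write(self, address):
        """Invalidate the local copy if this cache holds the written address."""
        if address in self.valid:
            del self.valid[address]   # mark the address as no longer valid
            return True
        return False


class SystemBus:
    """Broadcasts every write so that each other cache can check its contents."""

    def __init__(self, caches):
        self.caches = caches
        self.main_memory = {}

    def write(self, writer, address, data):
        """Perform a write and return the names of caches that invalidated."""
        self.main_memory[address] = data
        writer.valid[address] = data   # assumed: the writer's own copy stays current
        return [c.name for c in self.caches
                if c is not writer and c.snoop_write(address)]


cache_a, cache_b = SnoopingCache("A"), SnoopingCache("B")
bus = SystemBus([cache_a, cache_b])
cache_a.valid[0x20] = 1
cache_b.valid[0x20] = 1
invalidated = bus.write(cache_a, 0x20, 7)   # processor A overwrites address 0x20
```

After the write, cache B no longer holds address 0x20, so it cannot supply the stale value to its processor.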
This invalidation process, itself, presents a problem for the cache memory module and its associated processor, since it uses cache memory resources. These resources are, of course, present to assist the processor in its work, but the invalidation process may make them "busy" and so hinder the processor from getting its regular work done.
It would be desirable therefore to remember the information necessary for a specific invalidation and execute the process at the least inconvenient time--such is an object hereof.
A second motivation for some sort of remembering device for invalidation data is that the "spying," or monitoring of bus activity, may involve numerous write conditions, each of which must be captured very quickly lest one delay other processors in the system. Therefore, each spy, or cache, must remember all the possible invalidation information for execution at some later time. This we also propose here.
Such a novel "remembering mechanism" is described here and, for the sake of this discussion, is called an "INVALIDATION QUEUE" (IQ). The IQ structure must take in, and hold, all pertinent information regarding all memory write operations in the system. It must then provide for the convenient unloading of this information for the invalidation process at the discretion of the local cache-memory module.
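The two duties of such a queue, quick capture of write addresses and later unloading at the cache's discretion, can be sketched as follows. This is a software analogy under stated assumptions: the class `InvalidationQueue`, its methods `record_write` and `drain`, and the use of a Python set to stand in for the cache's tag contents are all hypothetical and do not describe the hardware structure itself.

```python
from collections import deque

# Illustrative sketch only; names and data structures are assumptions.

class InvalidationQueue:
    """Remembers snooped write addresses for invalidation at a later time."""

    def __init__(self):
        self.pending = deque()   # write addresses awaiting an invalidation check

    def record_write(self, address):
        """Capture a snooped write address; no tag lookup is done here."""
        self.pending.append(address)

    def drain(self, tags, max_entries=None):
        """Check up to max_entries queued addresses against the tag set.

        Returns the number of cache addresses actually invalidated.
        """
        invalidated = 0
        processed = 0
        while self.pending and (max_entries is None or processed < max_entries):
            address = self.pending.popleft()
            processed += 1
            if address in tags:
                tags.discard(address)   # the cache held this address: invalidate it
                invalidated += 1
        return invalidated


tags = {0x10, 0x20, 0x30}            # addresses the cache presently contains
iq = InvalidationQueue()
for snooped in (0x20, 0x40, 0x10):   # writes observed on the system bus
    iq.record_write(snooped)
count = iq.drain(tags)               # unloaded when convenient for the cache
```

Here the write to 0x40 requires no action because the cache never held that address, while 0x20 and 0x10 are invalidated when the queue is unloaded.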
FIG. 2 gives a simple block diagram showing a processing element 2-1, a cache block 2-3 and an interface block 2-5 showing the interfaces to the system bus(es). Cache block 2-3 shows a few of the significant elements required for our cache design. These are:
(1) TAG RAM structure 2-3T which holds information as to which of all the possible memory addresses the cache presently contains.
(2) DATA RAM structure 2-3D which holds the actual data values corresponding to the addresses held in the TAG RAM. If a "hit" occurs, the proper data values are immediately driven from the DATA RAM back to processor 2-1.
(3) HIT LOGIC 2-3H to determine whether the required processor data is held in the cache and to drive the appropriate control signal(s) back to processor 2-1 indicating this. If a cache "miss" occurs, then this logic will trigger off the appropriate system bus cycles to retrieve the necessary data from main memory.
(4) INVALIDATION QUEUE block 2-3I to monitor all memory-write type operations on the system bus(es). It must compare all write operation addresses against those in the cache TAG RAM. If a write address is contained within the cache, that address must be marked as no longer valid; this is called "invalidating" a cache address.
From FIG. 2, one can infer two major tasks (duties) of the cache module. The first is to service the processor for all "hit data" (find hits). This function must be accomplished as fast as possible. Each additional clock cycle lost here significantly reduces the performance of the computer system.
The second major job of the cache is to maintain the integrity of all the data it holds. This task is accomplished by the "invalidation" process, initiated for all write operations on the system bus(es). This task must assure that for any given address value in all storage modules (within the system) the data is identical. If two storage devices ever have differing data for the same address, "data coherency" is lost and the system is in trouble.
These two major cache tasks must both use the same cache resource--namely the TAG RAM facility 2-3T. Each must access TAG RAM and quickly determine whether a given address is currently held within the cache.
Both tasks must also be performed virtually instantaneously when requested. To delay a quick response to the processor is to drastically reduce its performance. And, a delay in system bus invalidation may cause loss of system bus write information.
This invalidation task may use some sort of "queueing" structure to hold a certain number of possible invalidation addresses. However, even the queue must be emptied quickly or risk the mentioned problem.
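The risk noted above can be made concrete with a small cycle-by-cycle sketch. Everything in it is an assumption made for illustration: the function name `run_cycles`, the rule that the processor always wins the single TAG RAM access in a cycle, and the rule that a bus write arriving at a full queue is simply lost.

```python
from collections import deque

# Illustrative sketch only; the arbitration and overflow rules are assumptions.

def run_cycles(processor_requests, bus_writes, queue_depth):
    """Model one shared TAG RAM access per cycle.

    processor_requests: one boolean per cycle, True if the processor needs a lookup.
    bus_writes: one entry per cycle, a write address or None if the bus is idle.
    Returns (lost_writes, entries_still_queued).
    """
    queue = deque()
    lost_writes = 0
    for wants_lookup, write_address in zip(processor_requests, bus_writes):
        if write_address is not None:
            if len(queue) < queue_depth:
                queue.append(write_address)   # remembered for later invalidation
            else:
                lost_writes += 1              # queue full: write information is lost
        if wants_lookup:
            continue                          # processor uses the TAG RAM this cycle
        if queue:
            queue.popleft()                   # TAG RAM is free: do one invalidation
    return lost_writes, len(queue)


writes = [0x10, 0x20, 0x30, 0x40]
busy = run_cycles([True, True, True, True], writes, queue_depth=2)
mixed = run_cycles([True, False, True, False], writes, queue_depth=2)
idle = run_cycles([False, False, False, False], writes, queue_depth=2)
```

With a processor that never releases the TAG RAM, the two-entry queue fills and later writes are lost; when the processor leaves free cycles, the queue is emptied and fewer or no writes are lost.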
The conflict between these two tasks (find hits, invalidate) can be resolved, we believe, by any of the following three cache module implementations:
(1) Provide two separate TAG RAM facilities, one for the "processor-hit" function and the second for "invalidation". This will work; however, the TAG RAM is the most expensive part of the cache block, so workers prefer not to double its size. Additionally, this approach requires extensive logic to assure that the two TAG RAMs are always completely and exactly identical to one another (i.e., that they always hold exactly the same addresses).
(2) Another way would be to build a special "dual-ported" TAG RAM structure. This is very expensive if one is to provide a structure large enough for realistic cache devices.
(3) A third approach is the subject of this disclosure, and provides a compromise between performance and hardware size/cost. This method uses a feature we call "BIT-SLICE-ABILITY," and allows processor servicing and system bus invalidations to proceed concurrently (simultaneously). This approach is a result of bit-slicing the cache implementation, and is obtained at virtually no additional cost in added system hardware.
An object hereof is to address at least some of the foregoing problems and to provide at least some of the mentioned, and other, advantages.