This invention relates generally to computer systems and, more particularly, to computer systems with a cache memory.
As is known in the art, modern computer systems use various technologies and architectural features to achieve high performance operation. High performance capabilities can be achieved in computer systems which employ several computer central processing units (i.e., CPUs or processors) arranged on modules in a multiprocessor system configuration. In addition to CPU modules, such a multiprocessor system also includes several I/O modules and memory modules, all coupled to one another by a system bus. The CPUs generally perform co-operative or parallel processing as well as multi-tasking operations for execution of several applications running simultaneously, to provide dramatically improved processing performance. The capabilities of the overall system can also be enhanced by providing a cache memory for each one of the CPUs in the computer system.
A cache memory is a relatively small, yet relatively fast memory arranged in close physical proximity to a processor. Cache memory is generally used to store a subset of the information stored in the main memory or disk. The cache memory generally includes a data store to hold the actual data as well as a tag store to hold tag addresses. The tag store also includes status bits of the cache blocks such as valid and shared.
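By way of illustration only, the arrangement of a data store alongside a tag store can be sketched as follows. The direct-mapped organization, the block size, the number of blocks, and all names (`Cache`, `TagEntry`, `lookup`, `fill`) are assumptions chosen for the sketch and are not taken from any particular system.

```python
from dataclasses import dataclass

BLOCK_SIZE = 64    # bytes per cache block (assumed for illustration)
NUM_BLOCKS = 256   # number of cache blocks (assumed for illustration)


@dataclass
class TagEntry:
    """One tag store entry: the tag address plus status bits."""
    tag: int = 0
    valid: bool = False
    shared: bool = False


class Cache:
    """Hypothetical direct-mapped cache with separate data and tag stores."""

    def __init__(self):
        self.data_store = [bytes(BLOCK_SIZE) for _ in range(NUM_BLOCKS)]
        self.tag_store = [TagEntry() for _ in range(NUM_BLOCKS)]

    @staticmethod
    def split(address):
        """Split a byte address into (tag, index)."""
        block_number = address // BLOCK_SIZE
        return block_number // NUM_BLOCKS, block_number % NUM_BLOCKS

    def lookup(self, address):
        """Return the cached block on a hit, or None on a miss."""
        tag, index = self.split(address)
        entry = self.tag_store[index]
        if entry.valid and entry.tag == tag:
            return self.data_store[index]
        return None

    def fill(self, address, block, shared=False):
        """Store a block fetched from main memory and record its tag."""
        tag, index = self.split(address)
        self.data_store[index] = block
        self.tag_store[index] = TagEntry(tag=tag, valid=True, shared=shared)
```

In this sketch a hit requires both a set valid bit and a matching tag, so two addresses that map to the same index are distinguished by the tag store alone.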
Use of a cache memory is based on a principle that when a processor accesses a location in memory, there is a high probability that the processor will continue to access memory locations surrounding the accessed location for at least a certain period of time. With cache memory, a preselected data block from the relatively slow main memory is fetched and stored in the relatively fast cache memory. Accordingly, as long as the processor continues to access data from the cache memory, the overall speed of operation of the processor is maintained at a level significantly higher than would be possible if the processor had to arbitrate for control of the system bus and then perform a memory READ or WRITE operation with the main memory module for each data access.
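The effect of this principle can be illustrated with a minimal sketch that counts how many accesses are satisfied from the cache once a block has been fetched. The function name, the 64-byte block size, and the simplification that the cache never evicts a block are all assumptions made for the example.

```python
def count_hits_and_misses(addresses, block_size=64):
    """Count cache hits and misses for a sequence of byte addresses.

    Assumes an idealized cache of unlimited capacity, so a block
    is fetched from main memory only on its first access.
    """
    cached_blocks = set()
    hits = misses = 0
    for address in addresses:
        block = address // block_size
        if block in cached_blocks:
            hits += 1
        else:
            misses += 1              # block fetched from main memory
            cached_blocks.add(block)
    return hits, misses


# 256 consecutive byte accesses touch only 4 blocks of 64 bytes.
print(count_hits_and_misses(range(256)))  # (252, 4)
```

Under these assumptions only 4 of the 256 accesses require the main memory, which is the behavior the locality principle relies on.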
The capabilities of the multiprocessor computer system can be further enhanced by sharing main memory among the CPUs and by operating the system bus in accordance with a SNOOPING bus protocol.
In shared memory multiprocessor systems, it is necessary that the system store a single, correct copy of data being processed by the various processors of the system. Thus, when a processor writes to a particular data item stored in its cache, that copy of the data item becomes the latest correct copy of the data item. The corresponding data item stored in main memory, as well as copies of the data item stored in other caches in the system, becomes outdated or invalid.
In a write back cache scheme, where a processor writes to its cache, the data item in main memory is not updated until the processor requires the corresponding cache location to store another data item. Accordingly, the cached data item that has been modified by the processor write operation remains the latest copy of the data item until the main memory is updated. In order to maintain coherence, it is, therefore, necessary to implement a scheme to monitor READ and WRITE transactions on the system bus and ensure that modified data is delivered from a processor's cache and that the tag status bits are modified accordingly.
One technique uses the well known SNOOPING bus protocol. The SNOOPING bus protocol provides coherency between the various cache memories and the main memory of the computer system by monitoring the system bus for bus activity involving addresses of data items that are currently stored in the processor's cache.
Status bits, such as valid and shared, are maintained in tag stores associated with each cache to indicate the status of each data item currently stored in the cache.
One possible status bit associated with a particular data item is a VALID bit. The VALID bit identifies if the cache entry has a copy of a valid data item in it, i.e., the stored data item is coherent with the latest version of the data item, as may have been written by one of the processors of the computer system.
Another possible status bit associated with a particular data item is a SHARED bit. The SHARED bit identifies if more than one cache in the system contains a copy of the data item. A cache element will transition into this state if a different processor caches the same data item. That is, if, while SNOOPING on the system bus, a first interface determines that another cache on the bus is allocating a location for a data item that is already stored in the cache associated with the first interface, the first interface notifies the other interface by asserting a SHARED signal on the system bus, signaling the second interface to allocate the location in the shared state. When this occurs, the first interface will also update the state of its copy of the data item to indicate that it is now in the shared state.
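The SHARED transition described above can be sketched, under simplifying assumptions, as follows. The classes `BusInterface` and `SystemBus`, their methods, and the modeling of the SHARED signal as a Boolean return value are hypothetical and serve only to illustrate the sequence of events.

```python
class BusInterface:
    """Hypothetical bus interface holding the status bits for one cache."""

    def __init__(self, name):
        self.name = name
        self.tags = {}  # block address -> {"valid": bool, "shared": bool}

    def snoop_allocate(self, address):
        """Observe another cache allocating `address` on the bus.

        Returns True to model asserting the SHARED signal.
        """
        entry = self.tags.get(address)
        if entry is not None and entry["valid"]:
            entry["shared"] = True   # own copy moves to the shared state
            return True
        return False


class SystemBus:
    """Hypothetical system bus connecting the interfaces."""

    def __init__(self):
        self.interfaces = []

    def attach(self, interface):
        self.interfaces.append(interface)

    def allocate(self, requester, address):
        """Allocate `address` in the requester's cache; return the SHARED signal."""
        # Every other interface snoops the transaction.
        responses = [i.snoop_allocate(address)
                     for i in self.interfaces if i is not requester]
        shared = any(responses)
        requester.tags[address] = {"valid": True, "shared": shared}
        return shared
```

For example, if interface A allocates a block first and interface B later allocates the same block, the second call to `allocate` returns True and both entries end up with the shared bit set.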
Another possible status bit associated with a particular data item stored in a cache memory can be what is generally called a DIRTY bit. A cache entry is dirty if the data item held in that entry has been updated more recently than main memory. Thus, when a processor WRITES to a location in its cache, it sets the DIRTY bit to indicate that the cached copy is now the latest copy of the data item.
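As a minimal sketch of how a DIRTY bit interacts with a write back scheme, the following models a cache holding a single entry in front of a main memory. The class name, the single-entry simplification, the dictionary used as main memory, and the choice to load a block on a write miss are assumptions made for illustration.

```python
class WriteBackCache:
    """Hypothetical single-entry write back cache over a dict-based main memory."""

    def __init__(self, main_memory):
        self.main_memory = main_memory
        self.address = None
        self.value = None
        self.dirty = False

    def _evict(self):
        # Main memory is updated only when a dirty entry is displaced.
        if self.address is not None and self.dirty:
            self.main_memory[self.address] = self.value
        self.address, self.value, self.dirty = None, None, False

    def _load(self, address):
        # On a miss, displace the current entry and fetch the new one.
        if self.address != address:
            self._evict()
            self.address = address
            self.value = self.main_memory.get(address, 0)

    def read(self, address):
        self._load(address)
        return self.value

    def write(self, address, value):
        self._load(address)
        self.value = value
        self.dirty = True  # cached copy is now newer than main memory
```

In this sketch, after a write the main memory still holds the old value, and it is brought up to date only when a later access to a different address displaces the dirty entry.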
Also, in such multiprocessor computer systems, for every command/address that some other processor module sends across the system bus, the present processor module would have to look up that address in its primary cache, determine whether it is present, and determine what action to take in response to the command/address.
To minimize this additional cache lookup activity, one or more duplicate tag (DTAG) stores are provided for each processor module. The tag store mentioned above contains information for use in conjunction with its associated cache memory under control of its processor. The tag information in the DTAG cache, on the other hand, is for use in conjunction with the system bus.
In prior art systems the DTAG store stored the shared and valid bits but not the dirty bit. Therefore, during system bus transactions the present processor module would look up the address in its DTAG to find out if the address is stored in its cache and determine what action to take in response to the command/address coming along the system bus.
Since there is both a cache tag store, which can be associated with a primary or backup cache, and a DTAG store, it is a goal of the system that both contain the same information at any given time. However, because of latencies in the system processes, there may be a delay between an update of a status bit in the DTAG cache and the update of the corresponding status bit in the primary cache.
Therefore, the overall system protocol uses the DTAG cache lookup to determine the actual state of a cache entry. As such, the DTAG status becomes the overall system's "Point of Coherency".
One problem with this approach is that, since the duplicate tag store contained only the valid and shared bits, when another processor needs to determine whether the present cache contains the most recent copy of the data, the dirty bit, which is stored in the tag store associated with the processor or a backup cache, must first be accessed. Accordingly, the bus interface cannot directly provide this information. This causes the processor to be continually interrupted and thus degrades system performance.
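The limitation described above can be illustrated with a minimal sketch of a processor module whose DTAG holds only the valid and shared bits. The class `ProcessorModule`, its method names, and the counter used to represent processor interruptions are hypothetical and are introduced only for this example.

```python
class ProcessorModule:
    """Hypothetical processor module with a DTAG that lacks the dirty bit."""

    def __init__(self):
        # Tag store under control of the processor (holds the dirty bit).
        self.tag_store = {}
        # Duplicate tag store used by the bus interface (valid and shared only).
        self.dtag = {}
        self.processor_interruptions = 0

    def allocate(self, address, shared=False):
        self.tag_store[address] = {"valid": True, "shared": shared, "dirty": False}
        self.dtag[address] = {"valid": True, "shared": shared}

    def processor_write(self, address):
        # The write is recorded only in the processor's own tag store.
        self.tag_store[address]["dirty"] = True

    def snoop_read(self, address):
        """Handle a READ seen on the system bus.

        Returns True if this module holds the latest copy and must supply it.
        """
        entry = self.dtag.get(address)
        if entry is None or not entry["valid"]:
            return False  # filtered by the DTAG; processor is not disturbed
        # DTAG hit: the dirty bit is not available here, so the
        # processor's tag store must be consulted.
        self.processor_interruptions += 1
        return self.tag_store[address]["dirty"]
```

In this sketch, a bus READ to an address absent from the DTAG leaves the processor undisturbed, whereas every READ that hits in the DTAG increments `processor_interruptions`, whether or not the entry turns out to be dirty.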