1. Field of the Invention
This invention is directed to digital computers, and more particularly to a multi-processor system in which the processors share a common memory, but in which a processor may obtain an exclusive right to access a respective block of memory. The invention specifically relates to such a multi-processor system in which each processor has a cache memory and follows a cache coherency protocol.
2. Description of the Background Art
Processors in a multi-processor computer system typically communicate via a shared memory. To improve system performance, each processor has a cache memory for temporarily storing copies of data being accessed. Such a hierarchical memory system may follow either a "write through" or a "writeback" protocol. In a "write through" protocol, a processor immediately writes data to the shared memory so that any other processor may fetch the most recent memory state from the shared memory. In a "writeback" protocol, a processor writes data to its cache, but this new memory state is written back to the shared memory only when the memory space in the cache needs to be used for different addresses in a cache fill operation, or when another processor needs the new memory state. Therefore, the writeback protocol reduces the number of memory access operations to the shared memory when the new memory state is not needed by the other processors. In general, the write through protocol is preferred when the processors frequently access the same memory addresses, and the writeback protocol is preferred when the processors infrequently access the same memory addresses.
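The difference in shared-memory traffic between the two protocols can be illustrated with a minimal sketch. All class and function names below (SharedMemory, WriteThroughCache, WriteBackCache, run) are hypothetical and chosen for illustration only. The sketch models a single processor, and it assumes the writeback cache writes its modified data back once at the end, standing in for an eviction or a request from another processor.

```python
class SharedMemory:
    """Shared memory that counts how many write operations reach it."""

    def __init__(self):
        self.data = {}
        self.writes = 0

    def write(self, addr, value):
        self.data[addr] = value
        self.writes += 1


class WriteThroughCache:
    """Every store updates the cache and is sent to shared memory at once."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}

    def store(self, addr, value):
        self.lines[addr] = value
        self.memory.write(addr, value)

    def flush(self):
        # Nothing to do: shared memory is already up to date.
        pass


class WriteBackCache:
    """Stores stay in the cache until the modified data is written back."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}
        self.dirty = set()  # addresses whose cached value is newer than memory

    def store(self, addr, value):
        self.lines[addr] = value
        self.dirty.add(addr)

    def flush(self):
        # Assumption: stands in for an eviction or another processor's request.
        for addr in sorted(self.dirty):
            self.memory.write(addr, self.lines[addr])
        self.dirty.clear()


def run(cache_cls, stores):
    """Apply the stores through the given cache type; report memory traffic."""
    memory = SharedMemory()
    cache = cache_cls(memory)
    for addr, value in stores:
        cache.store(addr, value)
    cache.flush()
    return memory.writes, memory.data


# Five successive stores to the same address.
stores = [(0x10, n) for n in range(5)]
write_through_result = run(WriteThroughCache, stores)
write_back_result = run(WriteBackCache, stores)
```

In this sketch both protocols leave the same final value in shared memory, but the write through cache issues one shared-memory write per store while the writeback cache issues a single write for the address.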
Whenever processors communicate via a shared memory, it is desirable to require the processors to follow a protocol ensuring that a memory address is not written to simultaneously by more than one processor, or else the result of one processor will be nullified by the result of another processor. Such synchronization of memory access is commonly achieved by requiring a processor to obtain an exclusive privilege to write to an addressed portion of the shared memory before executing a write operation. In a multi-processor system employing writeback caches, such an exclusive privilege gives rise to a cache coherency problem in which data written in the cache of a processor having such an exclusive privilege might be the only valid copy of data for the addressed portion of memory. A cache coherency protocol is required which permits a processor to readily obtain the valid copy of data as well as the privilege to write to it.
One known cache coherency protocol for a multi-processor system employing writeback caches is based on the concept of block ownership: an addressed portion of memory the size of a cache block is owned either by the shared memory or by one of the writeback caches. Only one of the processors, or the shared memory, may own the block of memory at any given time, and this ownership is indicated by an ownership bit for each block in the shared memory and in each of the caches. A processor may write to a block only when the processor owns the block. Therefore, the ownership bits always identify a unique "valid" block in the system. Shared read-only access to a block is permitted only when the shared memory owns the block. To indicate whether a processor may read a block, each of the caches includes, for each block, a "valid" bit. When a processor desires to read a block that is not valid in its cache, it issues a read transaction to the shared memory, requesting the shared memory to fill its cache with valid data. When a processor desires to write to a block which it does not own, it issues an ownership-read transaction to the shared memory, requesting ownership as well as a fill. From the perspective of the other processors, these transactions are cache coherency transactions, which request any other processor having ownership to give up ownership and write back the data of the requested block, and in the case of an ownership-read transaction, further request the other processors to invalidate any copies of the requested block.
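The ownership rules described above can be sketched as a small simulation. This is an illustrative model only, not the circuitry of any particular system: the names System, Cache, read, ownership_read, load, store and owners are hypothetical, a block holds a single value, and bus timing is ignored. The sketch assumes that a previous owner keeps a valid read-only copy after a plain read transaction, since only an ownership-read asks other caches to invalidate.

```python
class System:
    """Shared memory plus the caches that snoop coherency transactions."""

    def __init__(self):
        self.memory = {}       # block -> data held in shared memory
        self.memory_owns = {}  # block -> ownership bit (absent means True)
        self.caches = []

    def _reclaim(self, block, requester):
        # Any other cache that owns the block gives up ownership and
        # writes its data back to shared memory.
        for cache in self.caches:
            if cache is not requester and block in cache.owned:
                self.memory[block] = cache.data[block]
                cache.owned.discard(block)
                self.memory_owns[block] = True

    def read(self, block, requester):
        """Read transaction: fill the requester with valid data."""
        self._reclaim(block, requester)
        return self.memory.get(block, 0)

    def ownership_read(self, block, requester):
        """Ownership-read transaction: fill, invalidate others, pass ownership."""
        self._reclaim(block, requester)
        for cache in self.caches:
            if cache is not requester:
                cache.valid.discard(block)
        self.memory_owns[block] = False
        return self.memory.get(block, 0)

    def owners(self, block):
        """Count the current owners of a block, including shared memory."""
        count = 1 if self.memory_owns.get(block, True) else 0
        return count + sum(block in c.owned for c in self.caches)


class Cache:
    def __init__(self, system):
        self.system = system
        self.data = {}
        self.valid = set()  # blocks whose "valid" bit is set
        self.owned = set()  # blocks whose ownership bit is set
        system.caches.append(self)

    def load(self, block):
        if block not in self.valid:
            self.data[block] = self.system.read(block, self)
            self.valid.add(block)
        return self.data[block]

    def store(self, block, value):
        if block not in self.owned:
            self.data[block] = self.system.ownership_read(block, self)
            self.valid.add(block)
            self.owned.add(block)
        self.data[block] = value  # a write is allowed only to an owned block


system = System()
cpu_a, cpu_b = Cache(system), Cache(system)
cpu_a.store(1, 42)           # A issues an ownership-read, then writes
value_seen_by_b = cpu_b.load(1)  # B's read forces A to write back
```

After the final load in this sketch, shared memory owns block 1 again and both caches hold a valid read-only copy, which matches the rule that shared read-only access is permitted only when the shared memory owns the block.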
The architecture of the instruction set for a digital computer typically includes certain "interlocked" instructions that are guaranteed to perform atomic operations upon memory in a multi-processing environment. Such "interlocked" instructions, for example, include operands having an access type of "modify", wherein an operand is first read from memory as a source operand, modified, and written back to memory as a destination operand. In a multi-processor system, the access type of "modify" raises the possibility that another CPU might access memory between the time that an operand is read from memory and the time that the operand is modified and written back to memory, leading to a result in memory which might not appear consistent under certain program sequences. To prevent this problem, interlocked instructions have been executed in a multi-processor environment by using the execution unit to request fetching of the operand and to request a memory "read lock" when fetching the second operand from memory, and to request a memory "write unlock" when putting the result back to memory.
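The hazard of an unprotected "modify" access, and the effect of pairing a "read lock" with a "write unlock", can be illustrated with a minimal sketch. The names InterlockedMemory, read_lock, write_unlock, lost_update and interlocked_update are hypothetical. The interleaving of the two processors is written out by hand rather than run on real threads, and the sketch assumes that a refused read lock simply causes the requester to retry later.

```python
class InterlockedMemory:
    """Memory whose cells can be locked between a read and a write."""

    def __init__(self):
        self.cells = {}
        self.locked = set()

    def read(self, addr):
        return self.cells.get(addr, 0)

    def write(self, addr, value):
        self.cells[addr] = value

    def read_lock(self, addr):
        """Return (granted, value); the lock is set only when granted."""
        if addr in self.locked:
            return False, None
        self.locked.add(addr)
        return True, self.cells.get(addr, 0)

    def write_unlock(self, addr, value):
        """Write the result and release the lock set by read_lock."""
        if addr not in self.locked:
            raise RuntimeError("write_unlock without a matching read_lock")
        self.cells[addr] = value
        self.locked.discard(addr)


def lost_update():
    """Two processors increment a cell with plain reads and writes."""
    mem = InterlockedMemory()
    a = mem.read(0)      # processor A reads the operand
    b = mem.read(0)      # processor B reads before A has written back
    mem.write(0, a + 1)
    mem.write(0, b + 1)  # B overwrites A's result
    return mem.read(0)


def interlocked_update():
    """The same two increments using read lock and write unlock."""
    mem = InterlockedMemory()
    granted_a, a = mem.read_lock(0)
    granted_b_first, _ = mem.read_lock(0)  # refused while A holds the lock
    mem.write_unlock(0, a + 1)
    granted_b_retry, b = mem.read_lock(0)  # B retries after the unlock
    mem.write_unlock(0, b + 1)
    return granted_a, granted_b_first, granted_b_retry, mem.read(0)


unprotected_result = lost_update()
interlocked_result = interlocked_update()
```

In this sketch the unprotected sequence loses one of the two increments, while the interlocked sequence refuses the second read lock until the first processor has written its result and released the lock.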
Typically, the time for a cache coherency transaction to be transmitted over a system bus is much shorter than the time for fill data to be retrieved from the shared memory. Therefore system performance can be improved by use of a pended bus (i.e., a bus which permits more than one transaction to be pending on the bus at any given time). The use of such a "pended" bus, however, raises the possibility of receiving a number of cache coherency transactions when a cache fill is outstanding. To obtain the improvement in performance of the pended bus, the cache coherency transactions should be executed as soon as possible, but the cache coherency transactions may conflict with the outstanding fill as well as an outstanding lock.
An outstanding lock should be removed as quickly as possible, because it prevents other processors from gaining ownership of the locked memory block. In this regard, a lock is a kind of ownership that is released only upon affirmative action by the processor asserting the lock, in contrast to the typical ownership of a cache coherency protocol, which is released upon the request of another processor without any affirmative action by the original owner. The requirement of affirmative action by a processor to release its lock increases the likelihood of deadlock in the system, and therefore locking is not used where ownership would suffice.
Although the locking of a memory block could be recorded in the cache in a "lock bit" for each cache block in a fashion analogous to the ownership bits, this technique would increase the size of the cache memory and would burden the cache with additional access cycles for setting and clearing the lock bits.