This Application relates to the applications entitled:
METHOD AND APPARATUS FOR PERFORMING SPECULATIVE MEMORY REFERENCES TO THE MEMORY INTERFACE (U.S. application Ser. No. 09/099,399, filed Jun. 18, 1998) and
METHOD AND APPARATUS FOR RESOLVING PROBES IN MULTIPROCESSOR SYSTEMS WHICH DO NOT USE EXTERNAL DUPLICATE TAGS FOR PROBE FILTERING (U.S. application Ser. No. 09/099,400, filed Jun. 18, 1998) and
METHOD AND APPARATUS FOR MINIMIZING PINCOUNT NEEDED BY EXTERNAL MEMORY CONTROL CHIP FOR MULTIPROCESSORS WITH LIMITED MEMORY SIZE REQUIREMENTS (U.S. application Ser. No. 09/099,383, filed Jun. 18, 1998) and
METHOD AND APPARATUS FOR PERFORMING SPECULATIVE MEMORY FILLS INTO A MICROPROCESSOR (U.S. application Ser. No. 09/099,396, filed Jun. 18, 1998) and
METHOD AND APPARATUS FOR DEVELOPING MULTIPROCESSOR CACHE CONTROL PROTOCOLS USING ATOMIC PROBE COMMANDS AND SYSTEM DATA CONTROL RESPONSE COMMANDS (U.S. application Ser. No. 09/099,398, filed Jun. 18, 1998) and
METHOD AND APPARATUS FOR DEVELOPING MULTIPROCESSOR CACHE CONTROL PROTOCOLS BY PRESENTING A CLEAN VICTIM SIGNAL TO AN EXTERNAL SYSTEM (U.S. application Ser. No. 09/099,398, filed Jun. 18, 1998) and
METHOD AND APPARATUS FOR DEVELOPING MULTIPROCESSOR CACHE CONTROL PROTOCOLS USING A MEMORY MANAGEMENT SYSTEM GENERATING ATOMIC PROBE COMMANDS AND SYSTEM DATA CONTROL RESPONSE COMMANDS (U.S. application Ser. No. 09/099,385, filed Jun. 18, 1998) and
METHOD AND APPARATUS FOR DEVELOPING MULTIPROCESSOR CACHE CONTROL PROTOCOLS USING A MEMORY MANAGEMENT SYSTEM GENERATING AN EXTERNAL ACKNOWLEDGMENT SIGNAL TO SET A CACHE TO A DIRTY COHERENCE STATE (U.S. application Ser. No. 09/099,386, filed Jun. 18, 1998) and
METHOD AND APPARATUS FOR DEVELOPING MULTIPROCESSOR CACHE CONTROL PROTOCOLS USING A MEMORY MANAGEMENT SYSTEM TO RECEIVE A CLEAN VICTIM SIGNAL (U.S. application Ser. No. 09/099,387, filed Jun. 18, 1998).
These applications are filed simultaneously herewith in the U.S. Patent and Trademark Office.
The present invention relates generally to computer processor technology. In particular, the present invention relates to cache coherency for a shared memory multiprocessor system.
A state of the art microprocessor architecture may have one or more caches for storing data and instructions local to the microprocessor. A cache may be disposed on the processor chip itself or may reside external to the processor chip and be connected to the microprocessor by a local bus permitting exchange of address, control, and data information. By storing frequently accessed instructions and data in a cache, a microprocessor has faster access to these instructions and data, resulting in faster throughput.
Conventional microprocessor-cache architectures were developed for use in computer systems having a single computer processor. Consequently, conventional microprocessor-cache architectures are inflexible in multiprocessor systems in that they do not contain circuitry or system interfaces which would enable easy integration into a multiprocessor system while ensuring cache coherency.
A popular multiprocessor computer architecture consists of a plurality of processors sharing a common memory, with each processor having its own local cache. In such a multiprocessor system, a cache coherency protocol is required to assure the accuracy of data among the local caches of the respective processors and main memory. For example, if two processors are currently storing the same data block in their respective caches, then writing to that data block by one processor may affect the validity of that data block stored in the cache of the other processor, as well as the block stored in main memory. One possible protocol for solving this problem would be for the system to immediately update all copies of that block in cache, as well as the main memory, upon writing to one block. Another possible protocol would be to detect where all the other cache copies of a block are stored and mark them invalid upon writing to the corresponding data block stored in the cache of a particular processor. Which protocol a designer actually uses has implications relating to the efficiency of the multiprocessor system as well as the complexity of logic needed to implement the multiprocessor system. The first protocol requires significant bus bandwidth to update the data of all the caches, but the memory would always be current. The second protocol would require less bus bandwidth since only a single bit is required to invalidate the appropriate data blocks. A cache coherency protocol can range from simple (e.g., a write-through protocol) to complex (e.g., a directory cache protocol). In choosing a cache coherence protocol for a multiprocessor computer system, the system designer must perform the difficult exercise of trading off many factors which affect efficiency, simplicity and speed.
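The two example protocols described above can be contrasted in a minimal Python sketch. The `Cache` class, the function names, and the dictionary model of main memory are hypothetical and introduced only for illustration; they do not describe any particular hardware.

```python
class Cache:
    """Toy cache: maps block address -> value; a missing key means 'invalid'."""

    def __init__(self):
        self.blocks = {}


def write_update(caches, memory, writer, addr, value):
    """First protocol: push the new value to every cached copy and to memory."""
    for cache in caches:
        if cache is writer or addr in cache.blocks:
            cache.blocks[addr] = value
    memory[addr] = value  # main memory stays current


def write_invalidate(caches, memory, writer, addr, value):
    """Second protocol: mark all other copies invalid; only the writer is updated."""
    for cache in caches:
        if cache is not writer:
            cache.blocks.pop(addr, None)  # drop (invalidate) the stale copy
    writer.blocks[addr] = value  # memory is left stale in this toy model
```

In this sketch the update variant touches every copy on each write, while the invalidate variant only removes the other copies, which mirrors the bandwidth trade-off described above.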
Hence, it would be desirable to provide a system designer with a microprocessor-cache architecture having uniquely flexible tools facilitating development of cache coherence protocols in multiprocessor computer systems.
A present-day designer who wishes to construct a multiprocessor system using a conventional microprocessor as a component must deal with the inflexibility of current microprocessor technology. Present-day microprocessors were built with specific cache protocols in mind and provide minimal flexibility to the external system designer. For example, one common problem is that a cache of a microprocessor is designed so that a movement of a data block out of a cache automatically sets the cache state for the block to a predetermined state. This does not give a designer of a multiprocessor system the flexibility to set the cache to any state in order to implement a desired cache protocol. Because of this, significant complexity is necessarily added to the design of a cache protocol.
In accordance with the present invention, cache coherence is maintained in a multiprocessor system having a plurality of caches and a main memory. To modify a block of one of the caches, a request to modify the block, which corresponds to a coherence state of the block, is sent, typically to a memory management system. The request is a set-dirty request to set the coherence state of the block of the cache to dirty. An acknowledgment responsive to the request is received, typically by the processor local to the cache, the acknowledgment indicating either a grant or denial of permission to modify the block. The block is then modified if the acknowledgment grants permission.
Preferably, the set-dirty request is sent to a controller managing access to the cache and the acknowledgment is received from the controller. This controller is either the processor associated with the cache (internal acknowledgment in a uniprocessor system) or a memory management system for managing access to the plurality of caches (external acknowledgment in a multiprocessor system).
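The request and acknowledgment exchange described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `Block` class, the `try_write` function, and the modeling of the controller as a callable returning `True` (grant) or `False` (deny) are all assumptions made for the example.

```python
from enum import Enum


class State(Enum):
    CLEAN = "clean"
    DIRTY = "dirty"


class Block:
    """Toy cache block carrying data and a coherence state."""

    def __init__(self, data, state=State.CLEAN):
        self.data = data
        self.state = state


def internal_ack(block):
    """Hypothetical internal controller (uniprocessor case): always grants."""
    return True


def try_write(block, new_data, controller):
    """Send a set-dirty request; modify the block only if permission is granted."""
    granted = controller(block)  # acknowledgment: True = grant, False = deny
    if granted:
        block.data = new_data
        block.state = State.DIRTY
    return granted
```

Passing `internal_ack` models the internal acknowledgment case; passing a callable backed by a memory management system models the external case, in which a denial leaves the block unmodified.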
The memory management system may alternately be referred to in the computer arts by other names, such as a memory controller. The memory management system determines the acknowledgment based on contents of the plurality of caches and the main memory. The contents include any data or cache state information located in the caches or the main memory and generally known as the state of the memory. The determination of the acknowledgment is carried out according to the particular cache protocol of the multiprocessor system.
In accordance with another aspect, the present invention includes a cache and an external unit. Typically, the cache is divided into a plurality of blocks with each block having a coherence state. To modify one of the blocks of the cache, the external unit generates a set-dirty request as a function of the coherence state of that block. The external unit modifies the block of the cache only if an acknowledgment granting permission is received responsive to the request.
In a further aspect, a memory management system manages a plurality of caches of a multiprocessor system. The memory management system receives the set-dirty request from the external unit. The memory management system sends the acknowledgment to the external unit in response to the set-dirty request. The acknowledgment is determined to be either a grant of permission or denial of permission to set the cache block state to dirty based on a state of the plurality of caches.
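As a rough sketch of how a memory management system might derive the acknowledgment from the state of the plurality of caches, consider the following. The policy shown is an assumption chosen for illustration (grant only if the requester still holds a copy, then invalidate the other copies); the text leaves the actual determination to the particular cache protocol. The class name, method name, and state strings are likewise hypothetical.

```python
class MemoryManagementSystem:
    """Toy model tracking, per cache, which blocks are held and in what state."""

    def __init__(self, caches):
        # caches: cache id -> {block address -> state string}
        self.caches = caches

    def set_dirty(self, requester, addr):
        """Return True (grant) or False (deny) for a set-dirty request."""
        if addr not in self.caches[requester]:
            # Assumed policy: the requester's copy was already invalidated, so deny.
            return False
        for cache_id, blocks in self.caches.items():
            if cache_id != requester:
                blocks.pop(addr, None)  # invalidate any other copy
        self.caches[requester][addr] = "dirty"
        return True
```

A different protocol could reach the opposite decision for the same cache contents, which is the flexibility the acknowledgment mechanism is intended to leave to the system designer.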
In a further aspect of the present invention, the request corresponds to a coherence state of the block of the cache. Thus, preferably, a system designer may control which cache modifications require external acknowledgment and which cache modifications require internal acknowledgment. Accordingly, in a further aspect, the set-dirty request is acknowledged internally by the processor independent of the cache state. That is, the controller is the processor and the set-dirty request is sent internally to the processor requesting permission to modify the block of the cache independent of the coherence state of the cache. Typically, this internal acknowledgment mode is useful in the situation where the processor is part of a uniprocessor system.
In a further aspect, the set-dirty request is sent to the memory management system to request permission to modify the block of the cache only if the coherence state of the block is clean. In a still further aspect, the set-dirty request is sent to the memory management system to request permission to modify the block of the cache only if the coherence state of the block is clean/shared. In yet another aspect, the set-dirty request is sent to the memory management system to request permission to modify the block of the cache only if the coherence state of the block is one of clean/shared and clean. In a further aspect, the set-dirty request is sent to the memory management system to request permission to modify the block of the cache only if the coherence state of the block is dirty/shared. In a further aspect, the set-dirty request is sent to the memory management system to request permission to modify the block of the cache only if the coherence state of the block is one of dirty/shared and clean. In a further aspect, the set-dirty request is sent to the memory management system to request permission to modify the block of the cache only if the coherence state of the block is shared. In a further aspect, the set-dirty request is sent to the memory management system to request permission to modify the block of the cache independent of the cache state.
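The alternative aspects above amount to a selectable rule for which coherence states route the set-dirty request to the memory management system. A minimal sketch of such a rule table follows; the mode names are hypothetical, and treating "shared" as covering both clean/shared and dirty/shared is an assumption made for this example.

```python
from enum import Enum


class State(Enum):
    CLEAN = "clean"
    CLEAN_SHARED = "clean/shared"
    DIRTY = "dirty"
    DIRTY_SHARED = "dirty/shared"


# Hypothetical mode names -> states whose set-dirty request goes to the
# memory management system (external acknowledgment).
EXTERNAL_ACK_STATES = {
    "internal_only": frozenset(),
    "clean": frozenset({State.CLEAN}),
    "clean_shared": frozenset({State.CLEAN_SHARED}),
    "clean_or_clean_shared": frozenset({State.CLEAN, State.CLEAN_SHARED}),
    "dirty_shared": frozenset({State.DIRTY_SHARED}),
    "clean_or_dirty_shared": frozenset({State.CLEAN, State.DIRTY_SHARED}),
    # Assumption: "shared" means either shared state.
    "shared": frozenset({State.CLEAN_SHARED, State.DIRTY_SHARED}),
    "always_external": frozenset(State),
}


def needs_external_ack(mode, state):
    """True if, under the given mode, this state requires external acknowledgment."""
    return state in EXTERNAL_ACK_STATES[mode]
```

Under this sketch, a state that is not listed for the selected mode would be acknowledged internally by the processor.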
Objects, advantages, and novel features of the present invention will become apparent to those skilled in the art from this disclosure, including the following detailed description, as well as by practice of the invention. While the invention is described below with reference to one or more preferred embodiments, it should be understood that the invention is not limited thereto. Those of ordinary skill in the art having access to the teachings herein will recognize additional implementations, modifications, and embodiments, as well as other fields of use, which are within the scope of the invention as disclosed and claimed herein and with respect to which the invention could be of significant utility.