This invention relates generally to computer systems and more specifically to writing data displaced from a cache memory back to a main memory subsystem of a computer system.
As is known in the art, a multiprocessor computer system includes multiple central processing units (CPUs), a main memory and system control logic. Each CPU typically includes a cache for storing data elements that are accessed most frequently. The system control logic provides a communication interconnect for data and commands sent between the CPUs and between the CPUs and main memory. The system control logic often includes an arbitration unit and is coupled to a duplicate tag store. The duplicate tag store holds cached data status information, remote from the CPUs, which is used for maintaining cache coherency in the computer system. The arbitration unit determines the order in which commands are processed in the system control logic.
When a CPU requires a data element that is not stored in its cache, it issues a command to the system control logic. The command is generally referred to as a "readmiss" command, which causes the system control logic to retrieve the data element from another CPU if the data has been modified by that CPU, or else from main memory.
At the same time, that CPU uses a portion of the data element's address to determine the location in its cache where the requested data element will be placed. When the requested data element will be placed in the same location as a data element that is already stored in the cache, the stored data element must be displaced to make room for the new data. The displaced data element is referred to as a "victim" data element. Typically, if the victim data element has been modified, it is the only valid copy of the data in the computer system and therefore must be written back to main memory. Accordingly, the CPU issues a "victim" command to the system control logic (i.e. a command to write the victim data back to main memory) at the same time that the readmiss command is issued. These victim and readmiss commands constitute a readmiss/victim command pair.
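The victim selection described above can be sketched as follows. This is a minimal illustration only, assuming a direct-mapped cache; the block size, the line count, and the names `CacheLine`, `split_address` and `commands_for_miss` are hypothetical and are not taken from any particular system.

```python
from dataclasses import dataclass
from typing import List, Tuple

BLOCK_BYTES = 64    # assumed block size
NUM_LINES = 1024    # assumed number of lines in a direct-mapped cache


@dataclass
class CacheLine:
    valid: bool = False
    dirty: bool = False   # True when the cached copy has been modified
    tag: int = 0


def split_address(addr: int) -> Tuple[int, int]:
    """Return (index, tag); the index portion selects the cache location."""
    block = addr // BLOCK_BYTES
    return block % NUM_LINES, block // NUM_LINES


def commands_for_miss(cache: List[CacheLine], addr: int) -> List[Tuple[str, int]]:
    """Commands issued for a miss on addr (assumes the caller detected the miss)."""
    index, tag = split_address(addr)
    line = cache[index]
    commands = [("readmiss", addr)]
    # A valid, modified line holding a different block is the only valid copy
    # of that block, so a victim command accompanies the readmiss command.
    if line.valid and line.dirty and line.tag != tag:
        victim_addr = (line.tag * NUM_LINES + index) * BLOCK_BYTES
        commands.append(("victim", victim_addr))
    return commands
```

In this sketch an unmodified displaced line produces only the readmiss command, since main memory already holds a valid copy of that data.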
When a readmiss command is received by the system control logic, it is input to the arbitration unit to arbitrate for access to the duplicate tag store and main memory. When access is granted, the system control logic performs duplicate tag store lookup and update operations. Simultaneously, the system control logic accesses the version of the requested data that is stored in main memory.
The results of a duplicate tag lookup operation associated with a readmiss command indicate to the system control logic the location of the most up-to-date copy of a requested data element. The most up-to-date copy may reside in main memory or in another CPU's cache. The duplicate tag store update operation modifies the duplicate tag store entry associated with the requested data element to indicate that the requested data element is stored in the requesting CPU's cache.
If, in response to a readmiss command, a duplicate tag store lookup operation indicates that the copy of a data element in main memory is the most up-to-date, then the system control logic will return the data from memory to the requesting CPU by placing a fill message on its fill queue. However, if the duplicate tag store lookup operation indicates that the most up-to-date copy is in another CPU's cache, then the system control logic issues a request, referred to as a "probe message", to the CPU that has the most up-to-date copy stored in its cache. When that CPU confirms that the requested data is stored in its cache, it initiates a probe response which indicates to the system control logic that the data is ready to be accessed. Subsequently, the system control logic obtains a copy of the data and incorporates it in a fill message which is issued to the requesting CPU thereby providing the requested data.
If a victim command is issued with a readmiss command, then the duplicate tag store update operation associated with that readmiss/victim command pair modifies the appropriate duplicate tag store entry to indicate that the associated victim data is no longer an element in the requesting CPU's cache, and to indicate that the requested data element is stored in the requesting CPU's cache. Also, the system control logic lengthens the main memory access associated with the readmiss command to include a victim write cycle for writing the victim data to main memory.
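The lookup and update operations described in the preceding paragraphs can be sketched as a single function. This is a simplified model under stated assumptions: the duplicate tag store is represented as a per-CPU dictionary mapping an address to a "modified" flag, the newly filled block is assumed to arrive unmodified, and the change of ownership at the probed CPU is not modeled. The names `DupTags` and `process_readmiss` are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical model: dup_tags[cpu][addr] is True when that CPU's cached
# copy of addr is modified (i.e. newer than the copy in main memory).
DupTags = Dict[int, Dict[int, bool]]


def process_readmiss(dup_tags: DupTags, requester: int, addr: int,
                     victim_addr: Optional[int] = None) -> Tuple[str, Optional[int]]:
    """Perform the lookup and update for a readmiss or readmiss/victim pair."""
    # Lookup: does another CPU hold the most up-to-date copy?
    owner = None
    for cpu, tags in dup_tags.items():
        if cpu != requester and tags.get(addr, False):
            owner = cpu
            break

    # Update: the victim (if any) leaves the requester's cache and the
    # requested block enters it (assumed unmodified on arrival).
    own_tags = dup_tags.setdefault(requester, {})
    if victim_addr is not None:
        own_tags.pop(victim_addr, None)
    own_tags[addr] = False

    if owner is None:
        return ("fill_from_memory", None)   # memory copy is up to date
    return ("probe", owner)                 # probe message goes to the owner
```

For example, with `tags = {0: {0x100: True}, 1: {0x200: True}}`, calling `process_readmiss(tags, 1, 0x100, victim_addr=0x200)` returns `("probe", 0)` and leaves CPU 1's entry holding `0x100` but no longer `0x200`.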
Sometimes, a first CPU of the computer system issues a readmiss/victim command pair while, concurrently, a second CPU issues a readmiss command for the victim data element of that pair. If the readmiss/victim command pair wins arbitration in the system control logic before the readmiss command, the duplicate tag store is updated to indicate that the first CPU no longer has a copy of the victim data stored in its cache. Subsequently, when the readmiss command wins arbitration, it will therefore be satisfied from main memory. If, on the other hand, the readmiss command wins arbitration before the readmiss/victim command pair, the results of the duplicate tag store lookup associated with the readmiss command will indicate that the most up-to-date copy of the data requested by the second CPU is stored in the first CPU. The system control logic will responsively issue a probe message to the first CPU. In this situation it is essential that the first CPU be able to provide copies of the modified victim data in response to both the victim command and the probe message, to maintain proper system operation.
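The two arbitration outcomes just described can be illustrated with a small sketch. It is deliberately reduced to one block, here called `"V"`, which the first CPU holds in modified form; the command names and the function `source_for_second_cpu` are hypothetical, and the duplicate tag store is collapsed into a single "modified owner" map.

```python
from typing import Dict, List, Optional


def source_for_second_cpu(arbitration_order: List[str]) -> Optional[str]:
    """Return where CPU 2's readmiss for block V is satisfied from."""
    # Simplified duplicate tag state: CPU 1 holds the only valid copy of V.
    modified_owner: Dict[str, int] = {"V": 1}
    source = None
    for command in arbitration_order:
        if command == "pair_from_cpu1":
            # Update for the readmiss/victim pair: V is no longer in CPU 1's
            # cache, and the victim write places V in main memory.
            modified_owner.pop("V", None)
        elif command == "readmiss_from_cpu2":
            owner = modified_owner.get("V")
            source = "main memory" if owner is None else f"probe to CPU {owner}"
    return source


# Pair wins arbitration first: the readmiss is satisfied from main memory.
first = source_for_second_cpu(["pair_from_cpu1", "readmiss_from_cpu2"])
# Readmiss wins arbitration first: a probe message is sent to CPU 1.
second = source_for_second_cpu(["readmiss_from_cpu2", "pair_from_cpu1"])
```

In the second ordering the first CPU must supply the victim data twice, once for the probe and once for the victim write, which is the requirement stated above.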
Providing data in response to readmiss/victim command pairs and probe messages is further complicated by the interactions between fill and probe messages. In many prior art systems, fill messages and probe messages travel in different queues between the arbitration unit of the system control logic and the targeted CPU. These queues, which progress at different speeds, are referred to as "probe" and "fill" queues. Because of the difference in progress speeds, the situation can arise wherein a fill message returns data, targeted by the readmiss command portion of a readmiss/victim command pair, to the issuing CPU before a probe message, issued by the system control logic prior to the issuance of the readmiss/victim command, reaches the top of an associated probe queue. This fill will overwrite the copy of the victim data element in the issuing CPU's cache. If the probe message requires access to the victim data element associated with the readmiss/victim command pair, the CPU and/or system control logic must therefore provide a copy of this data from a source other than the cache.
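The ordering hazard between the two queues can be sketched as a toy simulation. The model below is an assumption-laden illustration: both queues drain one message per step, the probe queue is made "slower" simply by placing a backlog of unrelated probes ahead of the probe for the victim block `"V"`, and the optional saved copy stands in for any source other than the cache. The function name `simulate_queues` is hypothetical.

```python
from collections import deque
from typing import Optional


def simulate_queues(probe_backlog: int, keep_copy: bool) -> Optional[str]:
    """Return the source that satisfies the probe for victim block V."""
    cache_line = "V"                           # victim block is in the cache
    saved_copy = "V" if keep_copy else None    # copy taken before issuing the pair
    # The probe for V was issued first but sits behind unrelated probes.
    probe_queue = deque(["other"] * probe_backlog + ["V"])
    fill_queue = deque(["A"])                  # fill for the readmiss command
    source = None
    while probe_queue or fill_queue:
        if probe_queue:
            target = probe_queue.popleft()
            if target == "V":
                if cache_line == "V":
                    source = "cache"
                elif saved_copy == "V":
                    source = "saved copy"
                else:
                    source = "unavailable"     # the only valid copy was overwritten
        if fill_queue:
            cache_line = fill_queue.popleft()  # fill overwrites the victim
    return source
```

With no backlog the probe is serviced from the cache. With a backlog of two probes the fill arrives first, so `simulate_queues(2, keep_copy=False)` returns `"unavailable"` while `simulate_queues(2, keep_copy=True)` returns `"saved copy"`.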
Typically, CPUs include victim data buffers to solve this problem. When a CPU determines that a requested data element will displace another data element (the victim data element) from cache, a victim data buffer is loaded with a copy of the victim data element prior to issuing the readmiss/victim command pair to the system control logic. That copy of the victim data is kept in the victim data buffer until the system control logic determines that pending probe messages that require a copy of the victim data have been satisfied and that the main memory victim write operation has been satisfied.
The above-mentioned determination is made using a three-step process. The first step involves a determination of whether every probe message in the system that requires data stored in the victim data buffer has had an "address comparison" performed. As used herein, the term "address comparison" is a comparison of the target address of each probe message against the address of the victim data, to indicate whether the probe message actually requires a copy of the victim data. The second step involves determining, in the case where the address of at least one probe matched the address of the victim data buffer element, that a copy of the victim data has been transferred to the system in response to the last probe that required the data. The third step involves monitoring the victim write operation that writes the victim data to main memory and monitoring each probe that requires access to the data stored in the victim data buffer to determine when all have been serviced.
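The three-step determination can be expressed as a single predicate. This is a sketch only; the `Probe` and `VictimBuffer` records, their flag names, and the function `can_release` are hypothetical, and the probes are assumed to be listed in the order in which they were issued.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Probe:
    target: int
    compared: bool = False   # address comparison has been performed
    serviced: bool = False   # data has been transferred for this probe


@dataclass
class VictimBuffer:
    addr: int
    victim_write_done: bool = False   # victim data written to main memory


def can_release(buf: VictimBuffer, probes: List[Probe]) -> bool:
    """True when the victim data buffer no longer needs to hold its copy."""
    # Step 1: every pending probe has had its address comparison performed.
    if not all(p.compared for p in probes):
        return False
    matching = [p for p in probes if p.target == buf.addr]
    # Step 2: if any probe matched, the last matching probe has received data.
    if matching and not matching[-1].serviced:
        return False
    # Step 3: the victim write and every matching probe have been serviced.
    return buf.victim_write_done and all(p.serviced for p in matching)
```

A buffer with no pending probes is releasable in this model as soon as its victim write completes.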
Prior art systems have dealt with the second step in this process by implementing a set of "probe buffers", in addition to a set of victim data buffers, in each central processing unit. In computer systems implementing such a solution, the victim data buffers are used exclusively for storing victim data elements until they are written into main memory. The probe buffers are used exclusively for storing victim data elements that are targeted by probe messages pending on an associated probe queue. When such a pending probe message targets a data element that is stored in a victim data buffer, a copy of the data is transferred from that victim data buffer to a probe buffer. Accordingly, since a copy of the victim data element remains in the probe buffer, the data can be written to main memory and the victim data buffer deallocated before all pending probes that target that data have been serviced.
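The dual-buffer arrangement can be sketched as two separate maps inside a CPU model. The class `DualBufferCpu` and its method names are hypothetical, and deallocation of the probe buffers themselves is left out of the sketch.

```python
from typing import Dict, Optional


class DualBufferCpu:
    """Toy model of a CPU with separate victim data buffers and probe buffers."""

    def __init__(self) -> None:
        self.victim_buffers: Dict[int, str] = {}   # data awaiting the victim write
        self.probe_buffers: Dict[int, str] = {}    # data held for pending probes

    def displace(self, addr: int, data: str) -> None:
        """Load a victim data buffer when a modified block is displaced."""
        self.victim_buffers[addr] = data

    def probe_pending(self, addr: int) -> None:
        """A pending probe targets addr: copy the data to a probe buffer."""
        if addr in self.victim_buffers:
            self.probe_buffers[addr] = self.victim_buffers[addr]

    def complete_victim_write(self, addr: int) -> str:
        """Return the data written to memory and deallocate the victim buffer."""
        return self.victim_buffers.pop(addr)

    def service_probe(self, addr: int) -> Optional[str]:
        """Supply probe data from the probe buffer (None if no copy is held)."""
        return self.probe_buffers.get(addr)
```

Because the probe buffer keeps its own copy, `complete_victim_write` can free the victim data buffer before `service_probe` is called, at the cost of holding the same data in two places.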
While such a dual-buffer arrangement, with sets of victim data buffers and probe buffers in the CPUs, is generally suited to its intended purpose, it introduces complexity and requires what are, in a sense, redundant sets of buffers to hold the same victim data.