FIG. 1 shows a computer system 10. The computer system 10 has one or more processors 11-1, 11-2, . . . , 11-n connected via an associated cache memory 13-1, 13-2, . . . , 13-n to a system bus 16. A shared memory 14 and an I/O bridge 18 are also connected to the system bus 16. The function of each of these devices in the computer system 10 is described below.
The shared memory 14 includes an array of storage locations for storing fixed length data, e.g., eight bit long or byte long data. Each storage location has a unique identifier, or address, which is used in data access, i.e., read and write, commands to specify the particular storage location from which data should be read or into which data should be written. Illustratively, the storage locations are further organized into data line storage locations for storing fixed length (e.g., thirty-two byte long), non-overlapping, contiguous blocks of data called data lines. Each data line storage location has a unique line address, similar to the aforementioned addresses, for specifying a particular data line storage location to read a data line from or to write a data line into.
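The line organization described above can be sketched as a simple address computation (a hypothetical Python illustration; the thirty-two byte line size is the illustrative figure given above, and the function names are invented for clarity):

```python
# Hypothetical sketch of the data line addressing described above.
# LINE_SIZE is the illustrative thirty-two byte line length.
LINE_SIZE = 32

def line_address(byte_address: int) -> int:
    """Return the line address of the data line containing byte_address."""
    return byte_address // LINE_SIZE

def line_offset(byte_address: int) -> int:
    """Return the byte's offset within its data line."""
    return byte_address % LINE_SIZE
```

For example, byte addresses 0 through 31 all map to line address 0, so an access to any one of them identifies the same data line storage location.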
The system bus 16 is for transferring data, addresses and commands between the devices, i.e., the processors 11-1 to 11-n, cache memories 13-1 to 13-n, shared memory 14 and I/O bridge 18, connected thereto. As shown, the system bus 16 includes a data bus 16-1 for transferring data, a command bus 16-2 for transferring commands and addresses, and an arbitration bus 16-3. The arbitration bus 16-3 is used for allocating the system bus 16 to the devices connected thereto. Illustratively only a limited number of devices can transmit on the system bus 16 at one time. For instance, only one device may transmit a command and address on the command bus 16-2 and only one device may transmit data on the data bus 16-1 at one time (although it may be possible for both devices to use their respective busses simultaneously). Illustratively, the computer system 10 has an elaborate arbitration protocol for allocating the system bus 16 to the devices of the computer system 10 in a fair and orderly manner.
The processors 11-1 to 11-n are for executing program instructions. In the course of executing these instructions, the processors may issue data access, i.e., data read and data write, commands. Furthermore, the program instructions themselves are stored as data in the shared memory 14.
The cache memories 13-1 to 13-n are small high speed memories for maintaining a duplicate copy of data of the shared memory 14. Despite their relatively small size in comparison to the shared memory 14, the cache memories dramatically reduce the number of data accesses to the shared memory 14. This is because the cache memories 13-1 to 13-n exploit the temporal and spatial locality of reference properties of processor data accesses. Temporal locality of reference is the tendency of the processors 11-1 to 11-n to access the same data over and over. This property arises from program flow control instructions such as loops, branches and subroutines which cause the processors 11-1 to 11-n to repeat execution of certain recently executed instructions. Spatial locality of reference refers to the tendency of processors to access data having addresses near the addresses of other recently accessed data. This property arises from the sequential nature of program instruction execution, i.e., the processor tends to execute instructions in the sequential order in which they are stored as data. In order to exploit this property, cache memories typically store the entire data line corresponding to recently accessed data. (Herein, a data line is said to correspond to particular data if the data line, or its counterpart in the shared memory, includes at least part of the particular data in question.) Thus, the likelihood increases that the cache memories 13-1 to 13-n can satisfy future accesses to data not yet accessed (assuming that future accesses will be to other data corresponding to the data lines already stored in the cache memories 13-1 to 13-n).
The cache memories 13-1 to 13-n work as follows. When the corresponding processor, e.g., the processor 11-1, issues a data access command, the associated cache memory 13-1 determines if it contains the accessed data. If so, a read or write (depending on whether the processor issued a read or write command) hit is said to occur and the cache memory 13-1 satisfies the processor data access using the copy of the data therein. If the cache memory 13-1 does not contain the accessed data, a read or write miss is said to occur. In the event of a read or write miss, the cache memory 13-1 issues a command for reading the data line corresponding to the accessed data from the shared memory 14. The cache memory 13-1 receives and stores a copy of the data line. The cache memory 13-1 may then utilize the copy of the data line stored therein to satisfy the data access command.
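The hit/miss behavior described above can be sketched as follows (a hypothetical Python model, not the actual hardware; the class and attribute names are invented, and replacement policy and write handling are omitted for brevity):

```python
# Hypothetical sketch of the read hit/miss behavior described above.
# The shared memory is modeled as a dict mapping line addresses to bytes.
LINE_SIZE = 32  # illustrative thirty-two byte line

class ToyCache:
    def __init__(self, shared_memory):
        self.shared_memory = shared_memory  # line_address -> bytes
        self.lines = {}                     # locally cached line copies
        self.hits = 0
        self.misses = 0

    def read(self, byte_address):
        line, offset = divmod(byte_address, LINE_SIZE)
        if line in self.lines:
            self.hits += 1    # read hit: satisfied from the local copy
        else:
            self.misses += 1  # read miss: fetch the entire data line
            self.lines[line] = bytearray(self.shared_memory[line])
        return self.lines[line][offset]
```

A read of byte 3 misses and fetches the whole line; a later read of the neighboring byte 4 then hits, which is the spatial locality benefit described above.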
The cache memories 13-1 to 13-n must maintain the consistency of the data in the shared memory 14. That is, while a cache memory 13-1 to 13-n may modify its copy of the data, the counterpart copy of the cache memory's data in the shared memory 14 must eventually be modified accordingly. According to one consistency-maintaining manner of operating a cache memory (e.g., the cache memory 13-1), called "write through," the cache memory 13-1 immediately attempts to update the counterpart copy in the shared memory 14 whenever the processor 11-1 modifies the cache memory's 13-1 copy of the data. This manner of operating the cache memory 13-1 is disadvantageous because the cache memory 13-1 must use the system bus 16 to access the shared memory 14 each time the associated processor 11-1 modifies the data.
In order to reduce the demands on the slow shared memory 14 and system bus 16, the cache memories 13-1 to 13-n operate in a manner called "write back." According to this manner of operation, each cache memory 13-1 to 13-n defers updating or writing back the modified data line until a later time. For instance, if the cache memory, e.g., the cache memory 13-1, runs out of storage space, the cache memory 13-1 may write back a modified data line to provide an available storage space for an incoming data line. Alternatively, as described in greater detail below, the cache memory 13-1 may write back a data line when another device attempts to read that data line.
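The difference between the write through and write back policies can be sketched as follows (a hypothetical Python model with invented names; `bus_writes` stands in for shared-memory updates carried over the system bus 16):

```python
# Hypothetical sketch contrasting write through and write back.
# shared_memory maps line addresses to bytearrays.
LINE_SIZE = 32  # illustrative thirty-two byte line

class WritePolicyCache:
    def __init__(self, shared_memory, write_back=True):
        self.shared_memory = shared_memory
        self.write_back = write_back
        self.lines = {}       # line_address -> local bytearray copy
        self.dirty = set()    # lines modified but not yet written back
        self.bus_writes = 0   # shared-memory updates over the bus

    def write(self, byte_address, value):
        line, offset = divmod(byte_address, LINE_SIZE)
        if line not in self.lines:
            self.lines[line] = bytearray(self.shared_memory[line])
        self.lines[line][offset] = value
        if self.write_back:
            self.dirty.add(line)  # defer the shared-memory update
        else:
            self.shared_memory[line][offset] = value  # update immediately
            self.bus_writes += 1

    def flush(self, line):
        """Write back one modified line, e.g., on eviction or snoop."""
        if line in self.dirty:
            self.shared_memory[line][:] = self.lines[line]
            self.dirty.discard(line)
            self.bus_writes += 1
```

Under write through, n writes to the same line cost n bus accesses; under write back the same n writes cost a single deferred write back.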
The I/O bridge 18 interconnects the system bus 16 and an I/O expansion bus 20. One or more I/O devices 22, such as Ethernet interfaces, FDDI interfaces, disk drives, etc., are connected to the I/O expansion bus 20.
The purpose of the I/O bridge 18 is to "decouple" the system bus 16 and the I/O expansion bus 20. Typically, data is transmitted in different formats and at different speeds on these two busses 16 and 20. For instance, data may be transmitted in sixteen byte packets on the system bus 16 at 33 MHz while data is transmitted in four byte groups at 8 MHz on the I/O expansion bus 20. The I/O bridge 18 may receive data packets from a device, e.g., the processor 11-1, connected to the system bus 16, and temporarily store the data of these packets therein. The I/O bridge 18 then transmits the received, "depacketized" data in four byte groups to an I/O device 22 on the I/O expansion bus 20. Likewise, the I/O bridge 18 may receive and temporarily store data from an I/O device 22 via the I/O expansion bus 20. The I/O bridge 18 then transmits the received data in packets to a device, e.g., the shared memory 14, connected to the system bus 16.
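The buffering and regrouping performed by the bridge can be sketched as follows (a hypothetical Python model with invented names; the sixteen byte packet and four byte group sizes are the illustrative figures given above):

```python
# Hypothetical sketch of the bridge decoupling described above:
# sixteen byte packets arrive from the system bus, are buffered,
# and leave as four byte groups on the I/O expansion bus.
from collections import deque

class ToyBridge:
    SYSTEM_PACKET = 16  # bytes per system-bus packet (illustrative)
    IO_GROUP = 4        # bytes per I/O-expansion-bus group (illustrative)

    def __init__(self):
        self.buffer = deque()

    def receive_packet(self, packet: bytes):
        """Accept one system-bus packet into the bridge's buffer."""
        assert len(packet) == self.SYSTEM_PACKET
        self.buffer.extend(packet)

    def transmit_groups(self):
        """Drain the buffer as a list of four byte groups for the I/O bus."""
        groups = []
        while len(self.buffer) >= self.IO_GROUP:
            groups.append(bytes(self.buffer.popleft()
                                for _ in range(self.IO_GROUP)))
        return groups
```

One sixteen byte packet thus leaves the bridge as four separate four byte groups, each of which can be transmitted at the slower I/O expansion bus rate.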
The processors 11-1 to 11-n, the cache memories 13-1 to 13-n and the I/O bridge 18 must operate in a manner which maintains the consistency of the data in the shared memory 14. For instance, suppose a first cache memory 13-1 modifies a copy of a data line of the shared memory 14 but does not write the data line back. If a second cache memory 13-2 issues a command to read the same data line, the second cache memory 13-2 should receive the copy of the modified data line in the first cache memory 13-1, not the stale copy stored in the shared memory 14.
To this end, the devices of the computer system 10 implement an ownership protocol. Before a device may access particular data, the device must successfully "claim ownership" in the corresponding data line. A device which does not successfully claim ownership in a data line cannot access the data corresponding thereto.
Illustratively, the ownership protocol is implemented as follows. Suppose the I/O bridge 18 desires to access a particular data line. For instance, when an I/O device 22 desires to write data to the shared memory 14, the I/O bridge 18 must claim ownership in the data lines stored at the destination addresses of the data to be written by the I/O device 22. (In fact, before the I/O bridge 18 can receive the data to be written from the I/O device 22 to the shared memory 14, the I/O bridge 18 must own the corresponding data lines.) The I/O bridge 18 first issues a command for claiming ownership in the particular data line on the system bus 16. This ownership claiming command may simply be a command to read or write the particular data line. Each device monitors or "snoops" the system bus 16 for ownership claiming commands. After issuing the ownership claiming command, the I/O bridge 18 also monitors the bus for a specified period. If another device currently owns the data line for which the I/O bridge 18 issued the ownership claim, this device may issue a response as described below. If, during the specified period, the I/O bridge 18 does not detect a response from another device indicating that the other device already owns the data line, the I/O bridge 18 successfully claims ownership in the data line.
Suppose at the time the I/O bridge 18 issues the ownership claiming command, a cache memory 13-2 already owns, but has not modified, the data line. Illustratively, the cache memory 13-2 detects the command issued by the I/O bridge 18. In response, the cache memory 13-2 illustratively concedes ownership of the data line to the I/O bridge 18. To that end, the cache memory 13-2 simply marks its copy of the data line invalid. At a later time, if the cache memory 13-2 desires to access data corresponding to this data line, the cache memory 13-2 must first claim ownership in the data line.
Alternatively, the cache memory 13-2 may mark the data line shared if the I/O bridge 18 indicates, in its ownership claiming command, that it does not desire to modify the data. Furthermore, the cache memory 13-2 issues a command to the I/O bridge 18 indicating that the data line is shared. Two or more devices can share ownership in a data line provided that none of the sharing devices has any intention of modifying the data line (that is, each sharing device wishes to read the data but not write the data). If one of the sharing devices later wishes to modify the data, that device issues an ownership claiming command which causes the other sharing devices to concede exclusive ownership to the device issuing the ownership claim.
Suppose at the time the I/O bridge 18 issues the ownership claim, the cache memory 13-2 already owns, has modified, but has not yet written back the data line in which the I/O bridge 18 attempts to claim ownership. In this case, the cache memory 13-2 first issues an intervention command on the system bus 16. The cache memory 13-2 then writes back its modified copy of the data line to the shared memory 14.
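The snoop responses described above, for each possible state of the snooped line, can be sketched as follows (a hypothetical Python model with invented state names; the intervention signaling and the write back itself are simplified to a single returned signal):

```python
# Hypothetical sketch of a device's response to a snooped ownership claim.
# States: the line is not cached, cached shared, cached owned-but-clean,
# or cached and modified (not yet written back).
INVALID, SHARED, OWNED_CLEAN, MODIFIED = (
    "invalid", "shared", "owned-clean", "modified")

def snoop_ownership_claim(line_state, claimer_will_modify):
    """Return (new local state, response signal) for a snooped claim."""
    if line_state == MODIFIED:
        # Owner signals CDM# and must write the modified line back
        # (the write back itself is not modeled here).
        return INVALID, "CDM#"
    if line_state in (SHARED, OWNED_CLEAN):
        if claimer_will_modify:
            return INVALID, None  # concede exclusive ownership
        return SHARED, "CDS#"     # keep a copy; claimer marks line shared
    return INVALID, None          # nothing cached: stay silent
```

Usage: a snooped claim to a modified line triggers the intervention response, while a read-only claim to an owned-but-clean line merely downgrades it to shared.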
In response to detecting the intervention command, the I/O bridge 18 can do a number of things. These alternatives are illustrated in FIGS. 2 and 3. In FIGS. 2 and 3:
C indicates the issuance of commands by the cache memory 13-2,
I/O indicates the issuance of commands by the I/O bridge 18,
CS# (Command Strobe) indicates when a valid command is transmitted on the command bus 16-2,
CMD is the command signal transmitted on the command bus 16-2,
DS# (Data Strobe) indicates when valid data is transmitted on the data bus 16-1,
ADDR is the address transmitted with the command signal,
DATA is the data returned by the device (e.g., the shared memory 14) in response to the command,
SLD# (Selected) is a signal transmitted by the selected recipient device of the command signal upon receiving the command,
CDS# (Cache Data Shared) is a signal instructing an ownership claim command issuer to mark the data line shared. This signal may be transmitted by a device which snoops an ownership claim to an unmodified data line stored therein,
CDM# (Cache Data Modified) is a signal informing an ownership claim command issuer that another device already owns and has modified the data. This signal is transmitted by a device which snoops an ownership claim to a modified data line stored therein,
CAN# (Command Acknowledge Negative) is a signal indicating that the recipient of the command is busy serving a previous command and that the command issuer should retry its command later,
ORD# is a signal indicating that a memory subsystem intends to transfer data to some processor,
CAE# (Command Address Error) is a signal indicating that either the command or address issued by the command issuer had a protocol, encoding or parity error,
DPE# (Data Parity Error) indicates that data received from the command issuer had a parity error, and
OOR# (Out of Order Response) is a signal instructing a read command issuer that the data requested by the issued read command will be transmitted out of order, i.e., in an arbitrary order in relation to the read command, on the data bus 16-1.
1. Data integrity is assured even in the event one or more receiving agents is busy or detects an error in the transmission of commands, addresses or data during the write back.
2. The write back agent need only re-write back data until the memory subsystem agent successfully receives a copy of the data. This enables the write back agent and other snarfing agents to resume other processing sooner.
3. The memory reflection scheme works even if the computer system has conventional snarfing agents. While conventional snarfing agents can still slow down the memory reflection scheme (by causing the write back agent to re-write back the data), the snarfing agents adapted according to the present invention will not. Thus, on average, the inventive snarfing agents will still speed up the memory reflection scheme (considering that any agent can experience an error or busy condition).
FIG. 2 is a timing diagram showing various signals generated during a first alternative memory transfer scheme. In FIG. 2, during cycle one of the system clock SCLK, the I/O bridge 18 issues a command for claiming ownership in a data line. This command is detected by the cache memory 13-2 which issues, on cycle four, the signals CDM# and CAN# indicating that it already owns, has modified, but has not yet written back the data in which the I/O bridge 18 attempted to claim ownership. (The shared memory 14 also responds with the SLD# signal to indicate it received the command. However, this event is insignificant as the CDM# and CAN# signals cause the shared memory 14 to abort transmitting data to the I/O bridge 18). The cache memory 13-2 then issues a write command on cycle six and writes back the modified data line on cycles nine to twelve.
Meanwhile, in response to the CAN# signal, the I/O bridge 18 illustratively reissues its ownership claim on cycle six. The cache memory 13-2 detects this command and issues the CAN# signal on cycle nine to "negatively acknowledge" the command of the I/O bridge 18, indicating that the command was not accepted and should be retried. Concurrently, the cache memory 13-2 issues its write command on cycle eight and writes back the data to the shared memory 14 via the data bus 16-1 on cycles nine to twelve. Finally, on cycle eleven, the I/O bridge 18 successfully issues its ownership claiming command. Assuming the I/O bridge 18 issues a read command, the data is returned to the I/O bridge 18 via the data bus 16-1 on cycles seventeen to twenty (not shown).
In the process illustrated in FIG. 2, the I/O bridge 18 must wait until after the cache memory 13-2 writes back the data to the shared memory 14. Then, the I/O bridge 18 can successfully re-issue its ownership claiming command to claim ownership in the data, e.g., read the data from the shared memory 14. This process is disadvantageous because many cycles are utilized to transfer ownership of the data line to the I/O bridge 18. Furthermore, the system bus 16 is utilized twice; once to transfer the modified data from the cache memory 13-2 to the shared memory 14 and once to transfer the data from the shared memory 14 to the I/O bridge 18.
FIG. 3 illustrates an alternative transfer scheme called "memory reflection." As before, the I/O bridge 18 issues its ownership claiming command on cycle one. Likewise, the cache memory 13-2 responds on cycle four to indicate that it already owns a modified copy of the data line in which the I/O bridge 18 has attempted to claim ownership. Furthermore, the cache memory 13-2 issues a write command on cycle six and writes back the modified data line to the shared memory 14 on cycles seven to ten. This is possible because the I/O bridge 18 does not re-issue its command for claiming ownership in the data line on cycle six. Rather, the I/O bridge 18 enters a tracking mode in which the I/O bridge 18 monitors the command bus 16-2 for the write command issued by the cache memory 13-2. Thus, on cycle six, the I/O bridge 18 can detect the cache memory's 13-2 command and address for writing back the data line in which the I/O bridge 18 unsuccessfully claimed ownership. When the cache memory 13-2 transfers the data to the shared memory 14 on cycles seven to ten, the I/O bridge 18 "snarfs," or receives, the data from the data bus 16-1 at the same time as the shared memory 14.
Stated more generally, the memory reflection scheme is utilized by a "write back agent," a "memory subsystem agent" and one or more "snarfing agents." A "write back agent" is a device, such as the cache memory 13-2, which writes back a modified data line. A "memory subsystem agent" is a device, such as the shared memory 14, in which the integrity of data must be maintained. A "snarfing agent" is a device, such as the I/O bridge 18, which attempts to claim ownership in the data line. When the write back agent writes back the data line to the memory subsystem agent, each snarfing agent snarfs the data. The memory reflection scheme requires approximately one half the time of the process of FIG. 2. Moreover, the memory reflection scheme utilizes only one data transfer on the system bus 16 to transfer data to two destinations contemporaneously.
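The single-transfer property of the memory reflection scheme can be sketched as follows (a hypothetical Python model with invented names; one write back transfer on the bus is observed simultaneously by every listening agent):

```python
# Hypothetical sketch of memory reflection: a single write back data
# transfer is received by the memory subsystem agent and snarfed by
# the waiting snarfing agent at the same time.
class Agent:
    def __init__(self, name):
        self.name = name
        self.lines = {}  # line_address -> data received from the bus

    def receive(self, line_address, data):
        self.lines[line_address] = data

def write_back_with_reflection(bus_listeners, line_address, data):
    """Model one data-bus transfer observed by every listening agent."""
    for agent in bus_listeners:
        agent.receive(line_address, data)  # one transfer, many recipients
    return 1                               # bus data transfers consumed

memory = Agent("shared memory 14")
snarfer = Agent("I/O bridge 18")
transfers = write_back_with_reflection([memory, snarfer],
                                       0x40, b"modified line")
```

A single bus transfer leaves both agents holding identical copies of the line, in contrast to the two transfers required by the process of FIG. 2.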
The memory reflection scheme described above presumes that neither the snarfing agent 18 nor the memory subsystem agent 14 detects an error in the data written back from the write back agent 13-2. The system bus 16 operates according to the XA-MP protocol. XA-MP does not provide any manner for agents to specify a "broadcast" type of command or data transmission in which multiple receiving agents can be specified. This is problematic if a snarfing or memory subsystem agent generates an error/busy signal DPE#, CAE# or CAN# on the system bus 16 in the memory reflection scheme described above. In response to detecting a CAN# signal indicating that a device which received a data access command is busy, the device which issued the command repeatedly re-issues the command until the recipient device accepts the command. Such a CAN# signal can also be issued by one or more snarfing agents 13-2 or by the memory subsystem agent 14 in response to a command issued by a write back agent for writing back a modified data line. Similarly, if a recipient, i.e., a snarfing or memory subsystem agent, of a command and address or of data detects an error in the received transmission, the recipient may issue the appropriate error signal CAE# or DPE#, respectively. Such errors may be caused by noise, electromagnetic interference and clock skew, and tend to occur more frequently when more than one agent receives commands and data from the system bus 16.
Consider the case where only some, but not all, of the recipient agents, i.e., the snarfing agents or the memory subsystem agent, issue an error/busy signal in response to the write back agent writing back its modified data line. In order to guarantee the integrity of data in the computer system 10, the write back agent 13-2 must reissue the modified data line and/or the write back command. As a consequence, the write back agent does not relinquish ownership in the data line. Thus, even the agents which successfully receive the data line must suspend further processing of the received data line until the write back agent successfully writes back the data line. Table 1 summarizes the action taken by each of the three classes of agents. Each column shows one of three scenarios: an error/busy signal is issued by the memory subsystem agent 14 only, an error/busy signal is issued by a snarfing agent only, and an error/busy signal is issued by both the memory subsystem agent and a snarfing agent simultaneously:
TABLE 1
______________________________________________________________________
Error location →     error/busy          error/busy         error/busy
                     incurred at         incurred at        incurred at memory
Agent ↓              memory subsystem    snarfing           subsystem agent &
                     agent only          agent(s) only      snarfing agent
______________________________________________________________________
snarfing agent       suspend processing  re-snarf data      re-snarf data
                     or re-snarf data
memory subsystem     re-receive data     re-receive data    re-receive data
agent
write back agent     re-write back data  re-write back      re-write back data
                     and command         data and command   and command
______________________________________________________________________
As can be seen in Table 1, the write back agent 13-2 must re-issue the data and command in each scenario regardless of how many or which agents received correct data. Furthermore, both the memory subsystem and snarfing agents must suspend processing and wait for the write back agent to successfully re-issue its data and command to all snarfing and memory subsystem agents regardless of how many or which agents issued an error/busy signal.
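The retry behavior summarized in Table 1 can be sketched as follows (a hypothetical Python model with invented names; each recipient agent is modeled as a callable that reports success or an error/busy condition):

```python
# Hypothetical sketch of the Table 1 retry behavior: the write back agent
# re-issues its data and command to all recipients until no recipient
# reports an error/busy signal.
def write_back_until_accepted(recipients, max_attempts=10):
    """recipients: callables returning True on success, False on
    error/busy. Returns the number of write back attempts used."""
    for attempt in range(1, max_attempts + 1):
        # The data and command go to every agent on each attempt; a
        # single error/busy response forces a full re-issue to all.
        responses = [accept() for accept in recipients]
        if all(responses):
            return attempt
    raise RuntimeError("write back never accepted")
```

Note that one busy recipient forces even the agents which received correct data to observe a second, redundant transfer, which is the inefficiency the table illustrates.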
It is therefore an object of the present invention to overcome the disadvantages of the prior art.