The present invention relates to an apparatus and method for an improved system of cache coherency in a multiple agent system.
In the electronic arts, a processing system may include a plurality of agents that perform coordinated computing tasks. The agents often share one or more main memory units designed to store addressable data for the use of all agents. The agents communicate with the main memory unit and each other over a communications bus during bus transactions. A typical system is shown in FIG. 1. FIG. 1 illustrates a plurality of N agents 10, 20, 30, 40 in communication with each other over an external communications bus 50. Data is exchanged among the agents 10, 20, 30 and the main memory unit 40 in a bus transaction. “Agents” include one or more processors, memory units, and devices that may communicate over the communications bus 50.
In order to improve performance, an agent may include a plurality of tiered internal caches that store and alter data on a temporary basis. In such multiple agent systems, several agents may operate on data from a single address at the same time. Multiple copies of data from a single memory address may be stored in multiple agents. Oftentimes when a first agent must operate on data at an address, a second agent may store a copy of the data that is more current in its internal cache than the copy resident in the main memory unit 40. In order to maintain “cache coherency,” the first agent should read the data from the second agent rather than from the main memory unit 40. Without a means to coordinate among agents, an agent may perform a data operation on a copy of data that is stale.
Along with each unit of data, an internal cache may store additional information, which may include the data's address in the main memory unit 40, the length of the data unit, and/or an indicator as to whether the data has been modified by the agent since being retrieved from main memory. This indicator is known as the “state” of the data. To monitor the states of cache lines, a technique well known in the art such as a “MESI” procedure may be used, where each cache line is marked with one of four states. These states are the “modified,” “exclusive,” “shared,” and “invalid” states (MESI states). The “modified” state indicates that the cache line has an updated version of the data. The “exclusive” state indicates that the cache line is the only one with a copy of the data, i.e., it owns the right to update the data. The “shared” state indicates that other agent(s) may have a copy of the data in the cache line, and there is only a right to read the data. The “invalid” state indicates that the cache line does not have a valid copy of the data.
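By way of illustration only, the four MESI states and the rights they imply may be sketched as follows. The names MesiState, CacheLine, and the three helper methods are hypothetical and are introduced solely for this sketch; they do not describe any particular implementation.

```python
from dataclasses import dataclass
from enum import Enum


class MesiState(Enum):
    """The four MESI states a cache line may be marked with."""
    MODIFIED = "M"   # line holds an updated version of the data
    EXCLUSIVE = "E"  # line is the only cached copy; it owns the right to update
    SHARED = "S"     # other agents may hold a copy; read-only
    INVALID = "I"    # line does not hold a valid copy of the data


@dataclass
class CacheLine:
    """Hypothetical record: a unit of data plus its address and state."""
    address: int
    data: bytes
    state: MesiState = MesiState.INVALID

    def holds_valid_data(self) -> bool:
        # Any state other than "invalid" means the data may be read.
        return self.state is not MesiState.INVALID

    def owns_right_to_update(self) -> bool:
        # Only "modified" and "exclusive" lines may be updated in place.
        return self.state in (MesiState.MODIFIED, MesiState.EXCLUSIVE)

    def is_newer_than_main_memory(self) -> bool:
        # Only a "modified" line differs from the copy in main memory.
        return self.state is MesiState.MODIFIED
```

For example, a line constructed as CacheLine(0x40, bytes(64), MesiState.SHARED) holds valid data but does not own the right to update it.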
In some agents, modified or exclusive data may be returned to main memory as part of a writeback transaction. In an explicit writeback, an agent generates a bus transaction to write the modified data to external memory in order to make room in the cache for newly requested data. That is, the agent (e.g., 10 in FIG. 1) acquires ownership of the communications bus 50 and drives the modified data on the communications bus 50. The external memory (e.g., agent 40 in FIG. 1) retrieves the data from the communications bus 50 and stores it according to conventional techniques.
By contrast, an implicit writeback typically occurs as part of a transaction initiated by another agent. Consider an example where agent 10 stores a copy of data in modified state; the copy in agent 10 is more current than a copy stored in the main memory unit 40. If another agent 20 posts a request on the communications bus 50 and requests the data, an implicit writeback would cause agent 10, rather than the main memory unit 40, to provide the requested data to agent 20.
In an implicit writeback, when agent 20 posts the request, each of the other non-requesting agents performs an internal check (a snoop) to determine whether it possesses a modified or exclusive copy of the data at the requested address in its internal cache system. If a non-requesting agent (agent 10 in the example) does have a modified version of the requested data in its internal cache system, it so indicates in a cache coherency signal of the transaction. The agent 10 then drives the modified data on the external communications bus 50. The requesting agent 20 and the main memory unit 40 may read the data from the communications bus 50.
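The implicit writeback sequence described above may be modeled, in simplified form, as follows. The sketch is illustrative only: the names Agent, snoop_hit_modified, and bus_read are hypothetical, the communications bus is reduced to a function call, and the resulting cache states (both copies marked shared after the transfer, and a shared marking when the data comes from main memory) are assumptions of the sketch rather than requirements of the text.

```python
from enum import Enum


class MesiState(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


class Agent:
    """Hypothetical agent with an internal cache: address -> (state, data)."""

    def __init__(self, name):
        self.name = name
        self.cache = {}

    def snoop_hit_modified(self, address):
        # Internal check (snoop): does this agent hold a modified copy?
        entry = self.cache.get(address)
        return entry is not None and entry[0] is MesiState.MODIFIED


def bus_read(requester, address, agents, main_memory):
    """Model one read transaction; return (data, name of the supplier)."""
    for agent in agents:
        if agent is requester:
            continue  # only non-requesting agents snoop
        if agent.snoop_hit_modified(address):
            _, data = agent.cache[address]
            # The snooping agent drives the modified data on the bus;
            # both the requester and main memory read it from there.
            main_memory[address] = data
            # Assumed post-transfer states: both copies become shared.
            agent.cache[address] = (MesiState.SHARED, data)
            requester.cache[address] = (MesiState.SHARED, data)
            return data, agent.name
    # No modified copy was found, so main memory supplies the data.
    data = main_memory[address]
    requester.cache[address] = (MesiState.SHARED, data)  # simplification
    return data, "main memory"
```

In this model, if agent 10 holds address 0x100 in modified state and agent 20 calls bus_read for that address, the data returned is the copy from agent 10, and the entry for 0x100 in main_memory is updated as a side effect of the same transaction.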
In conventional systems, when a cache line or a portion of the cache line is updated at an agent, the agent reads the data associated with the entire cache line as well as other information such as a tag associated with the cache line. This access may be a main memory access or a cache access. The updated data is stored in a buffer associated with the agent receiving the data. The data is then stored into the cache line and is written back to another agent or another level of memory such as another level of cache. If all bytes of the cache line are updated, the conventional updating process is inefficient because it involves two separate data accesses, only one of which is actually useful. In the first access the data is read but never used, since the second access overwrites all of it with the updated data. The unnecessary read wastes resources such as power, and it increases latency and bandwidth utilization, since it occupies processor resources that could otherwise be used for other processes.
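The inefficiency of the conventional read-then-write update may be illustrated with a small model that counts accesses. The sketch is illustrative only: the 64-byte line size and the names LINE_SIZE, CountingStore, and conventional_update are assumptions introduced for the example.

```python
LINE_SIZE = 64  # assumed cache line size in bytes


class CountingStore:
    """Hypothetical backing store that counts line reads and writes."""

    def __init__(self):
        self.lines = {}
        self.reads = 0
        self.writes = 0

    def read_line(self, address):
        self.reads += 1
        return self.lines.get(address, bytes(LINE_SIZE))

    def write_line(self, address, data):
        if len(data) != LINE_SIZE:
            raise ValueError("a full cache line must be written")
        self.writes += 1
        self.lines[address] = data


def conventional_update(store, address, offset, new_bytes):
    """Read the whole line, merge the update, write the whole line back.

    Returns the number of bytes from the first access (the read) that
    survive into the written line, i.e. how much of the read was useful.
    """
    if offset < 0 or offset + len(new_bytes) > LINE_SIZE:
        raise ValueError("update does not fit within the cache line")
    old = store.read_line(address)            # first access
    merged = old[:offset] + new_bytes + old[offset + len(new_bytes):]
    store.write_line(address, merged)         # second access
    return LINE_SIZE - len(new_bytes)
```

When only 4 bytes of the line are updated, 60 bytes of the read are carried into the written line, so the read is needed. When all 64 bytes are updated, the function returns 0: the read still counts as an access, yet none of the data it fetched is used.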