The performance of a computer system can be enhanced by the use of a memory hierarchy. For example, a three-tiered memory can be constructed from low-, medium-, and high-speed memories. A low-speed memory may be a magnetic disk for low-cost bulk storage of data. A medium-speed memory may be constructed from DRAMs for use as the computer's main or system memory. A high-speed memory may employ SRAMs for use as a processor cache memory. The theory behind a memory hierarchy is to group the instructions to be executed, and the data to be used, by the system processor in the highest-speed memory. Such high-speed memory is typically the most expensive memory available, so economics dictate that it be relatively small. System memory consisting of DRAMs is denser and less expensive than a cache memory built from SRAMs, and can therefore be significantly larger than the cache memory.
In many computer systems, typically large systems, the system memory (DRAM) may be connected to multiple processors, each having its own cache. During operation, each processor transfers instructions and data from system memory to its cache in order to have quick access to the variables of the currently executing program.
As data in a cache is modified by a processor, the data may either be immediately updated in the system memory or be updated later. A "write-through" cache writes data to system memory when the data is updated, whereas a "write-back" cache writes updated data to system memory only when directed. A computer system using only write-through caches always has the most recent data in system memory. In contrast, in a system having write-back caches, the most recent data may be in a cache and not in the system memory. Thus, whenever a processor wishes to access data from system memory, the write-back caches must be checked for the data. If a write-back cache "owns" the data and the data has been modified, the modified data is typically written to system memory so that it can be read by the requesting processor.
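The difference between the two write policies described above can be illustrated with a minimal sketch. The `Cache` class, its `write` and `flush` methods, and the use of a dictionary as system memory are all hypothetical names chosen for illustration; they do not describe any particular hardware.

```python
class Cache:
    """Toy cache; `write_back` selects the write policy (illustrative only)."""

    def __init__(self, memory, write_back):
        self.memory = memory          # shared dict standing in for system memory
        self.write_back = write_back
        self.lines = {}               # address -> cached value
        self.dirty = set()            # addresses modified but not yet written to memory

    def write(self, addr, value):
        self.lines[addr] = value
        if self.write_back:
            self.dirty.add(addr)          # write-back: defer the memory update
        else:
            self.memory[addr] = value     # write-through: update memory immediately

    def flush(self, addr):
        """Write a modified line to system memory when directed to do so."""
        if addr in self.dirty:
            self.memory[addr] = self.lines[addr]
            self.dirty.discard(addr)


memory = {0x10: 1}

wt = Cache(memory, write_back=False)
wt.write(0x10, 2)
through_value = memory[0x10]      # system memory already holds the new value

wb = Cache(memory, write_back=True)
wb.write(0x10, 3)
stale_value = memory[0x10]        # system memory still holds the older value
wb.flush(0x10)
flushed_value = memory[0x10]      # system memory is current only after the flush
```

In this sketch the write-back cache leaves system memory stale until `flush` is called, which is why other processors would need to check such a cache before trusting the copy in memory.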
For example, consider a system memory that stores multiple-byte data elements, one of which has been transferred to the cache of a first processor. If a second processor wishes to read all or part of the data element, it must first be determined that the data element is "owned" by the first processor. This is usually done while attempting to read the data element from memory (a first memory access). If the element has been modified, it is retrieved from the first processor and written into system memory (a second memory access). The data element is then read from memory (a third memory access) and transferred to the second processor. Thus, three memory accesses are required to obtain valid data.
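The three-access sequence described above can be traced with a short sketch. The function `read_with_ownership_check`, the dictionary layout of the owning cache, and the access log are assumptions made purely for illustration.

```python
def read_with_ownership_check(addr, memory, owner_cache, access_log):
    """Read `addr` for a requesting processor, counting system memory accesses."""
    # First access: attempt the read; ownership is detected during this attempt.
    access_log.append("read attempt / ownership check")
    value = memory[addr]
    if addr in owner_cache["modified"]:
        # Second access: the modified element is written from the owner to memory.
        memory[addr] = owner_cache["lines"][addr]
        owner_cache["modified"].discard(addr)
        access_log.append("write-back from owning cache")
        # Third access: the now-valid element is read from memory.
        value = memory[addr]
        access_log.append("re-read of valid data")
    return value


memory = {0x40: b"old!"}                                   # stale copy in system memory
first_cache = {"lines": {0x40: b"new!"}, "modified": {0x40}}
log = []
result = read_with_ownership_check(0x40, memory, first_cache, log)
```

Under these assumptions, `log` ends up with three entries, one per memory access, and `result` holds the data modified by the first processor.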
Similar delays are encountered when attempting to overwrite, in system memory, one or more bytes of a data element owned by another processor. For example, suppose one or more bytes of a data element are transferred from a first processor to system memory. If the data element is owned by a second processor, the second processor's modified data bytes, or the entire data element, are then written to system memory. If the modified data bytes from the second processor overwrite those from the first processor, the data bytes from the first processor must be written to memory again.
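The overwrite scenario above can be sketched as follows, assuming a four-byte data element and an owner that writes back the entire element. The function name, byte values, and element size are hypothetical and chosen only to make the repeated write visible.

```python
def partial_overwrite(element, owner_copy, new_bytes, offset):
    """Trace writes to system memory when bytes of an owned element are overwritten."""
    writes = 0
    end = offset + len(new_bytes)

    # The first processor writes its bytes into the element in system memory.
    element[offset:end] = new_bytes
    writes += 1

    # The owning (second) processor writes back its entire modified element,
    # which overwrites the bytes just written by the first processor.
    element[:] = owner_copy
    writes += 1
    after_writeback = bytes(element)

    # The first processor's bytes must therefore be written to memory again.
    element[offset:end] = new_bytes
    writes += 1
    return after_writeback, bytes(element), writes


element = bytearray(b"AAAA")        # data element as stored in system memory
owner_copy = bytearray(b"AbbA")     # second processor's modified cached copy
after_writeback, final, writes = partial_overwrite(element, owner_copy, b"XY", 0)
```

In this sketch the first processor's bytes are lost after the owner's write-back and survive only because of the third write, mirroring the extra delay the paragraph describes.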