The performance of a computer system can be enhanced by the use of a memory hierarchy. For example, a three-tiered memory can be constructed from low, medium, and high speed memories. A low speed memory may be a magnetic disk for low cost, bulk storage of data. A medium speed memory may be constructed from DRAMs for use as the computer system's main memory. A high speed memory may employ SRAMs for use as a processor cache memory. The theory behind a memory hierarchy is to group code (instructions) and other data to be processed by the system processor in the highest speed memory. Since high speed memory is typically the most expensive memory available, economics dictate that it be relatively small. Main memory consisting of DRAMs is denser and less expensive than a cache memory with SRAMs, and can therefore be significantly larger than the cache memory.
During operation, instructions and other data are transferred from system memory to the cache memory in order to have quick access to the variables of the currently executing program. As additional data, not in the cache, is required, such data is transferred from the main memory by replacing selected data in the cache. Various replacement algorithms are utilized to determine which data is replaced.
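As an illustration of the replacement step described above, the following is a minimal sketch of one common replacement algorithm, least recently used (LRU). The text does not say which algorithm is employed, so LRU is an assumption chosen only for illustration, and the class name `LRUCache`, the dictionary standing in for main memory, and the two-entry capacity are all hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical fully associative cache with least-recently-used replacement."""

    def __init__(self, capacity, main_memory):
        self.capacity = capacity        # number of entries the cache can hold
        self.main_memory = main_memory  # dict standing in for the DRAM main memory
        self.entries = OrderedDict()    # address -> data, least recently used first

    def read(self, address):
        """Return (data, hit) for the requested address."""
        if address in self.entries:
            self.entries.move_to_end(address)   # mark as most recently used
            return self.entries[address], True  # hit: data was in the cache
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)    # replace the least recently used entry
        data = self.main_memory[address]        # miss: transfer from main memory
        self.entries[address] = data
        return data, False


# Assumed example data: eight memory locations and a two-entry cache.
memory = {addr: addr * 10 for addr in range(8)}
cache = LRUCache(capacity=2, main_memory=memory)
results = [cache.read(a)[1] for a in (0, 1, 0, 2, 1)]
```

In this sketch only the third access (the repeated read of address 0) is a hit; the read of address 2 replaces address 1, so the final read of address 1 misses again.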
By definition, an efficiently operating cache architecture is one which exhibits a high ratio of "hits" to accesses. A "hit" occurs when data requested is in the cache. A number of factors influence the hit ratio. The dominant factor is the locality of reference of the code being executed. In other words, if the code is located in proximate physical locations in memory, the hit ratio will be higher than if the code is widely distributed throughout memory. Another factor influencing the hit ratio of a cache is the number of devices having access to the memory. If only a single bus master, such as the system processor, has access to the memory, the data stored in the cache can be controlled to achieve a reasonably high hit ratio. However, when more than a single bus master has access to the memory through the same cache, the cache can bounce back and forth between requests from the bus masters, greatly reducing the hit ratio. In other words, the cache is non-discriminatory, with the demands of the system processor and other bus masters affecting the cache equally. One operation can significantly impact the data make-up of the cache. For example, data cached in response to memory accesses from a non-host CPU bus master will overwrite data needed by the host processor.
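The effect of a second bus master on the hit ratio can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of any particular system: the cache is assumed to be direct mapped with four one-word lines, the host is assumed to loop over four addresses, and the second bus master is assumed to stream through forty distinct addresses. The function name `hit_ratio` and all address values are hypothetical.

```python
def hit_ratio(accesses, num_lines):
    """Return hits / accesses for an assumed direct-mapped cache of one-word lines."""
    lines = [None] * num_lines          # address currently held in each cache line
    hits = 0
    for address in accesses:
        index = address % num_lines     # the single line this address may occupy
        if lines[index] == address:
            hits += 1                   # requested data is in the cache
        else:
            lines[index] = address      # miss: replace whatever was there
    return hits / len(accesses)


# Assumed host workload: a tight loop over four addresses, repeated ten times.
host = [0, 1, 2, 3] * 10
# Assumed second bus master: a stream of forty distinct addresses.
other = list(range(100, 140))
# Both bus masters access memory through the same cache, alternating requests.
shared = [address for pair in zip(host, other) for address in pair]

host_only_ratio = hit_ratio(host, num_lines=4)
shared_ratio = hit_ratio(shared, num_lines=4)
```

Under these assumptions the host alone hits on 36 of its 40 accesses, while the interleaved stream never hits, because each access by the second bus master overwrites the line the host needs next.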
Another factor affecting the hit ratio relates to the fact that both code and non-code data are cached. Blocks of data in the system memory are mapped into different physical locations in the cache. If each block of data in system memory may be mapped to only a single location, the cache is known as a direct mapped cache. Set associative mapping involves each block of data being mapped to more than a single location. For example, if each block of data may be mapped to either of two locations, the cache is known as two-way set associative. Irrespective of the number of locations available for a system memory block, when both code and non-code data are being cached, there will be overlap in their respective mappings. As a result, significant thrashing can take place as code and non-code data repeatedly replace one another in response to memory accesses.
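The two mapping schemes can be sketched as follows. The modulo placement rule, the eight-line cache size, and the function names `direct_mapped_index` and `set_associative_locations` are assumptions made for illustration; the text does not specify how blocks are assigned to locations.

```python
def direct_mapped_index(block_number, num_lines):
    """Direct mapped: each system memory block maps to exactly one cache line."""
    return block_number % num_lines


def set_associative_locations(block_number, num_lines, ways):
    """Set associative: each block maps to any of `ways` lines within one set."""
    num_sets = num_lines // ways
    set_index = block_number % num_sets
    return [set_index * ways + way for way in range(ways)]


# Assumed example: an 8-line cache, a code block 3 and a non-code data block 11.
code_block, data_block = 3, 11

code_line = direct_mapped_index(code_block, num_lines=8)
data_line = direct_mapped_index(data_block, num_lines=8)

code_lines = set_associative_locations(code_block, num_lines=8, ways=2)
data_lines = set_associative_locations(data_block, num_lines=8, ways=2)
```

In the direct mapped case both blocks compete for line 3, so each access to one replaces the other. In the two-way case both blocks map to the pair of lines 6 and 7 and can reside together, but a third block mapping to the same set would still force a replacement, which is the overlap the paragraph above describes.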
An issue related to cache hits is whether or not a data element in a cache is valid and/or dirty. A data element is valid as long as no corresponding element in the system memory is more current. If the data element in the cache contains multiple data bytes, the concept of validity may extend down to the byte level. In other words, it is possible for certain selected bytes of a data element to be valid while other bytes in the same element are invalid. A data element may become wholly or partially invalid if all or some of its data bytes, respectively, are written to the system memory from a bus master while the data element resides in the cache. A data element in a cache is dirty if it is more current than a corresponding element in the system memory. A data element becomes dirty when a bus master writes the element to a cache and not to system memory.
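The valid and dirty conditions defined above can be sketched as a small data structure. This is an illustrative model only: the class name `CacheElement`, its method names, and the use of one validity flag per byte are assumptions, not a description of any particular hardware.

```python
from dataclasses import dataclass, field


@dataclass
class CacheElement:
    """Hypothetical cache data element with byte-level validity and a dirty flag."""
    data: bytearray
    valid: list = field(default_factory=list)  # one flag per data byte
    dirty: bool = False

    def __post_init__(self):
        if not self.valid:
            self.valid = [True] * len(self.data)  # assume all bytes start valid

    def write_to_cache_only(self, offset, payload):
        """A bus master writes to the cache and not to system memory."""
        self.data[offset:offset + len(payload)] = payload
        for i in range(offset, offset + len(payload)):
            self.valid[i] = True
        self.dirty = True  # cache copy is now more current than system memory

    def system_memory_written(self, offset, length):
        """A bus master writes these bytes to system memory while the element is cached."""
        for i in range(offset, offset + length):
            self.valid[i] = False  # system memory is now more current for these bytes


element = CacheElement(bytearray(4))
element.write_to_cache_only(0, b"\x01\x02")  # bytes 0-1 written in the cache only
element.system_memory_written(2, 2)          # bytes 2-3 written in system memory
```

After these two operations the element is dirty and only partially valid: bytes 0 and 1 are valid, while bytes 2 and 3 are invalid.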
A cache "overflow" is the condition created when a dirty and valid data element in a cache is overwritten by a bus master. To prevent the dirty and valid element from becoming lost when overwritten, the element must be written immediately into the system memory from the cache. (The efficient handling of a cache overflow is described in copending U.S. patent application Ser. No. 563,220 entitled "Computer Memory System and Method for Enhancing Performance on Cache Overflows".) An overflow write can force several individual writes, depending on the line size of the cache.
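To illustrate why one overflow can force several individual writes, the sketch below writes a dirty and valid cache line back to system memory one bus transfer at a time. The 16-byte line size, the 4-byte transfer width, the address 32, and the function name `write_back_on_overflow` are all assumed values for illustration; the text gives no specific sizes.

```python
def write_back_on_overflow(line_address, line_data, system_memory, bus_width):
    """Copy a dirty and valid line into system memory, one bus-width write at a time.

    Returns the number of individual writes that were needed.
    """
    writes = 0
    for offset in range(0, len(line_data), bus_width):
        chunk = line_data[offset:offset + bus_width]
        start = line_address + offset
        system_memory[start:start + len(chunk)] = chunk  # one individual write
        writes += 1
    return writes


# Assumed example: 64 bytes of system memory, a 16-byte line, 4-byte transfers.
system_memory = bytearray(64)
dirty_line = bytes(range(16))
write_count = write_back_on_overflow(32, dirty_line, system_memory, bus_width=4)
```

With these assumed sizes the single overflow requires four individual writes; a larger line size with the same transfer width would require proportionally more.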
In prior memory systems, the act of cleaning, that is, writing dirty and valid elements from the cache back to the system memory, requires accesses to the cache to retrieve those elements. This results in a wait state in which the bus master attempting to write to the subject cache location must wait until the data element being cleaned is first retrieved and written to the system memory. Such prior systems require additional logic to prioritize and efficiently locate and retrieve all dirty and valid elements residing in the cache.