The present invention relates to a cache memory data replacement strategy and, more particularly, to a data replacement strategy for an n-way set associative cache memory.
Memory caching is a widespread technique used to improve data access speed in computers and other digital systems. Cache memories are small, fast memories holding recently accessed data and instructions. Caching relies on a property of memory access known as temporal locality. Temporal locality states that information recently accessed from memory is likely to be accessed again soon. When an item stored in main memory is required, the processor first checks the cache to determine if the required data or instruction is there. If so, the data is loaded directly from the cache instead of from the slower main memory. Due to temporal locality, a relatively small cache memory can significantly speed up memory accesses for most programs.
FIG. 1 illustrates a processing system 100 in which the system memory 110 is composed of both a fast cache memory 120 and a slower main memory 130. When processor 140 requires data from the system memory 110, the processor first checks the cache memory 120. Only if the memory item is not found in the cache memory 120 is the data retrieved from the main memory 130. Thus, data which was previously stored in the cache memory 120 can be accessed quickly, without accessing the slow main memory 130.
Cache memories must also handle the problem of ensuring that both the cache memory and the main memory are kept current when changes are made to data values that are stored in the cache memory. Cache memories commonly use one of two methods, write-through and copy-back, to ensure that the data in the system memory is current and that the processor always operates upon the most recent value. The write-through method updates the main memory whenever data is written to the cache memory. With the write-through method, the main memory always contains the most up-to-date data values. The write-through method, however, places a significant load on the data buses, since every data update to the cache memory requires updating the main memory as well. The copy-back method, on the other hand, updates the main memory only when modified data in the cache memory is replaced. When data stored in the cache memory is modified by a connected processor, the processor updates only the data in the cache memory. When cached data is replaced, the main memory value is changed only if the data was modified while cached, that is, if the main memory and cache memory values differ. The cache memory commonly stores an indicator, known as the dirty bit, for each location in the cache memory. The dirty bit indicates whether the data in that location should be written back to main memory when the data is replaced in the cache. Copy-back caching saves the system from performing many unnecessary write cycles to the main memory, which can lead to noticeably faster execution.
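The copy-back behavior described above can be illustrated with a minimal sketch. The cache and main memory are modeled here as plain Python dictionaries, and the class and method names are illustrative only, not drawn from any particular implementation:

```python
# Minimal sketch of copy-back (write-back) caching with a dirty bit.
# The cache and main memory are modeled as plain dictionaries mapping
# addresses to values; all names here are illustrative assumptions.

class CopyBackCache:
    def __init__(self, main_memory):
        self.main = main_memory          # address -> value
        self.lines = {}                  # address -> (value, dirty_bit)

    def write(self, addr, value):
        # Copy-back: update only the cache and mark the line dirty.
        self.lines[addr] = (value, True)

    def evict(self, addr):
        # On replacement, write back to main memory only if dirty.
        value, dirty = self.lines.pop(addr)
        if dirty:
            self.main[addr] = value

main = {0x10: 1}
cache = CopyBackCache(main)
cache.write(0x10, 42)      # main memory still holds the old value
cache.evict(0x10)          # the dirty line is copied back on eviction
```

Note that a write-through cache would instead update `self.main` inside `write` on every store, which is exactly the extra bus traffic the copy-back method avoids.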
There are currently three prevalent mapping strategies for cache memories: the direct mapped cache, the fully associative cache, and the n-way set associative cache. In the direct mapped cache, a portion of the main memory address of the data, known as the index, completely determines the location in which the data is cached. The remaining portion of the address, known as the tag, is stored in the cache along with the data. To check whether required data is stored in the cache memory, the processor compares the main memory address of the required data to the main memory address of the cached data. As the skilled person will appreciate, the main memory address of the cached data is generally determined from the tag stored in the location indicated by the index of the required data. If a correspondence is found, the data is retrieved from the cache memory, and a main memory access is avoided. Otherwise, the data is accessed from the main memory. The drawback of the direct mapped cache is that the data replacement rate in the cache is generally high, thus reducing the effectiveness of the cache.
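The tag/index address split described above may be sketched as follows. The number of index bits, which fixes the number of cache rows, is an illustrative assumption:

```python
# Sketch of the tag/index split in a direct-mapped cache lookup.
# INDEX_BITS is an assumed field width: 8 bits give a 256-row cache.

INDEX_BITS = 8

def split_address(addr):
    index = addr & ((1 << INDEX_BITS) - 1)   # low bits select the row
    tag = addr >> INDEX_BITS                 # remaining bits are stored as the tag
    return tag, index

def lookup(cache_rows, addr):
    # cache_rows: list of (tag, data) tuples or None, indexed by row number.
    tag, index = split_address(addr)
    entry = cache_rows[index]
    if entry is not None and entry[0] == tag:
        return entry[1]              # hit: data served from the cache
    return None                      # miss: data must come from main memory
```

Because the index alone chooses the row, two addresses that share the same low bits (for instance 0x1234 and 0x1434 with an 8-bit index) contend for the same location; this contention is the source of the high replacement rate noted above.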
The opposite policy is implemented by the fully associative cache, in which cached information can be stored in any row. The fully associative cache alleviates the problem of contention for cache locations, since data need only be replaced when the whole cache is full. In the fully associative cache, however, when the processor checks the cache memory for required data, every row of the cache must be checked against the address of the data. To minimize the time required for this operation, all rows are checked in parallel, requiring a significant amount of extra hardware.
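A fully associative lookup can be sketched in the same style. In hardware all the comparisons occur in parallel, as noted above; the sequential loop here is only a functional model:

```python
# Sketch of a fully associative lookup: any row may hold any address,
# so the full address of every row must be compared against the
# required address. (Hardware performs these comparisons in parallel;
# this loop models only the function, not the timing.)

def fully_associative_lookup(rows, addr):
    for entry in rows:               # entry: (address, data) or None
        if entry is not None and entry[0] == addr:
            return entry[1]          # hit in some row
    return None                      # miss: whole cache checked
```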
The n-way set associative cache memory is a compromise between the direct mapped cache and the fully associative cache. Like the direct mapped cache, in a set-associative cache the index of the address is used to select a row of the cache memory. However, in the n-way set associative cache each row contains n separate ways, each one of which can store the tag, data, and any other required indicators. In an n-way set associative cache, the main memory address of the required data is checked against the address associated with the data in each of the n ways of the selected row, to determine if the data is cached. The n-way set associative cache reduces the data replacement rate (as compared to the direct mapped cache) and requires only a moderate increase in hardware.
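The n-way lookup described above combines the two previous schemes: the index selects one row, and the tag is then compared against all n ways of that row. A minimal sketch, with illustrative field widths:

```python
# Sketch of a lookup in an n-way set associative cache: the index
# selects a row, and the tag is compared against each of the n ways
# of that row (in hardware, the n comparisons occur in parallel).

def lookup_set_associative(rows, addr, index_bits):
    index = addr & ((1 << index_bits) - 1)
    tag = addr >> index_bits
    for way in rows[index]:          # each way: (tag, data) or None
        if way is not None and way[0] == tag:
            return way[1]            # hit in one of the n ways
    return None                      # miss in all n ways
```

With n ways per row, n addresses sharing the same index can be cached simultaneously, which is why the replacement rate falls relative to the direct mapped cache at the cost of only n comparators per lookup.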
When an n-way set associative cache is used, a replacement strategy must be selected for determining which of the n ways is replaced when data is written to a row where all of the ways contain valid data. The way to be replaced may be selected randomly, but it has been found that in many cases a replacement scheme known as LRU (least recently used) is more effective. LRU takes advantage of the temporal locality property, and assumes that the data that is least likely to be needed in the future is the data that has not been used (i.e. read from or written to in the cache) for the longest period of time. With LRU, the cache memory records the relative order in which the data in each of the ways was used, for every row in the cache memory. When data is replaced in a given row, the processor uses this information to determine which of the ways contains the least recently used data, and replaces the data accordingly.
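The LRU bookkeeping for a single row can be sketched as follows. An ordered list of way numbers stands in for the hardware ordering bits; the class name is illustrative:

```python
# Minimal LRU bookkeeping for one row of an n-way cache, as described
# above: the way whose data has gone longest without use is replaced.
# An ordered list of way numbers models the hardware ordering bits.

class LRURow:
    def __init__(self, n_ways):
        self.order = list(range(n_ways))   # front = least recently used

    def touch(self, way):
        # Record a read or write to `way`: move it to the MRU position.
        self.order.remove(way)
        self.order.append(way)

    def victim(self):
        # The least recently used way is the one selected for replacement.
        return self.order[0]

row = LRURow(4)
for way in (0, 1, 2, 3, 1):
    row.touch(way)
# Way 0 has now gone longest without use, so victim() selects it.
```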
In many cases LRU is an effective replacement strategy. However, LRU fails to account for certain factors that may influence the time required for data replacement. One of these factors is the type of memory hardware used for the main memory. Some memory chips, for example dynamic random access memories (DRAMs), have a page structure. The time required for a memory read/write operation depends on the page to which the operation is directed. In a DRAM, read and write operations can only be performed on the currently open page. When a read or write operation is required to a closed page, the required page is first opened by latching it to a page buffer in an activate cycle. Read/write operations can then be performed to the open page. In order to subsequently perform a read/write to a different, currently closed, page, the open page is first closed with a precharge cycle, and the new page is then latched to the page buffer with a second activate cycle. It is therefore more efficient to read and write data to the currently open page than to a new page. The time required for each read/write operation to a new, currently closed, page is three cycles, as opposed to a single cycle for a read/write operation to the currently open page.
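The page-dependent cost described above can be captured in a rough model. The cycle counts follow the three-versus-one figure stated above, and the initially open page and the access sequence are illustrative assumptions:

```python
# Rough cost model for the DRAM timing described above: one cycle for a
# read/write to the currently open page; otherwise a precharge cycle,
# an activate cycle, and the access itself, i.e. three cycles in total.

def access_cost(open_page, page):
    return 1 if page == open_page else 3

cycles = 0
open_page = 5                        # assume page 5 starts out open
for page in (5, 5, 7, 7, 5):         # illustrative access sequence
    cycles += access_cost(open_page, page)
    open_page = page                 # the accessed page is now the open one
```

Under this model the sequence above costs 1 + 1 + 3 + 1 + 3 = 9 cycles, whereas the same five accesses to a single page would cost 5; a replacement strategy aware of the open page could exploit exactly this difference.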
Many replacement policies have been proposed; however, they do not take the main memory status into consideration when optimizing the data replacement strategy. For example, Wickeraad et al., in U.S. Pat. No. 6,490,654, present a method and apparatus for cache line (way) replacement which associates a count entry with each cache line. The count entry defines a replacement class for the line. Data which is likely to be needed soon is assigned a higher replacement class, while data that is more speculative and less likely to be needed soon is assigned a lower replacement class. When the cache memory becomes full, the replacement algorithm selects for replacement those cache lines having the lowest replacement class. Accordingly, the cache lines selected for replacement contain the most speculative data in the cache. Wickeraad et al. base their replacement strategy only upon a predicted likelihood that the cached information will be needed, without consideration of main memory constraints.
A prior art replacement strategy based on the status of the cache memory is presented in Egan, U.S. Pat. No. 6,327,643, which provides a system and method for cache line replacement that differentiates between cache lines that have been written back to memory and those that have not. The replacement algorithm first attempts to replace cache lines that have already been written back to memory. If no such cache lines are available, the algorithm attempts to replace cache lines that are currently on page and on bank within the cache memory. The invention considers only the state of the cache memory, not the state of the main memory.
Lesartre, in U.S. Pat. No. 6,405,287, provides a cache line replacement strategy that uses cached data status to bias way selection, and to determine which way of an n-way set associative cache should be filled with replacement data when all of the ways contain valid data. According to Lesartre's method, a first choice for way selection and at least one additional choice for way selection are generated. If the status of the way corresponding to the first choice differs from a bias status, a way corresponding to one of the additional choices is designated as the way to be filled with replacement data. Otherwise, the way corresponding to the first choice is designated as the way to be filled with replacement data. Status information for a given way may include any data which is maintained on a cache-line-by-cache-line basis, but is preferably data which is maintained for purposes other than way selection. Cache line status information is defined as any information which is tracked in conjunction with the maintenance of a cache line. For example, status information might include indications as to whether a cache line is shared or private, or clean or dirty. The above method takes the way status into consideration, but does not account for factors that are not specific to the cache line, such as the ongoing status of the main memory.
There is thus a widely recognized need for, and it would be highly advantageous to have, a cache memory replacement strategy devoid of the above limitations.