The performance of a main memory can often be improved by providing the main memory with a cache. A cache is a memory that is typically both faster and smaller than the main memory. When a data item is requested from a memory address of the main memory, i.e., in a read request, the data item returned by the main memory is also stored in a data element of the cache. Associated with the stored data item, the cache also stores a representation of the corresponding memory address. When a request is later made for the same memory address, the cache can return the data element without the need of consulting the main memory. Since the cache memory is faster than the main memory, this improves the perceived response time of the memory system formed by the combination of the main memory and the cache.
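The read path described above can be sketched as a minimal model, assuming a dictionary-backed main memory; the class and attribute names here are illustrative, not taken from any particular design:

```python
# Minimal model of a read cache: a hit is served from the cache's own
# entries without consulting the (slower) main memory.

class ReadCache:
    def __init__(self, main_memory):
        self.main_memory = main_memory   # models the slow backing store
        self.entries = {}                # memory address -> cached data element

    def read(self, address):
        if address in self.entries:      # cache hit: main memory not consulted
            return self.entries[address]
        data = self.main_memory[address] # cache miss: consult main memory
        self.entries[address] = data     # keep a copy for later requests
        return data

memory = {0x10: "a", 0x14: "b"}
cache = ReadCache(memory)
print(cache.read(0x10))  # miss: fetched from main memory, prints "a"
print(cache.read(0x10))  # hit: served from the cache, prints "a"
```

The second read returns the cached copy even if main memory has changed in the meantime; that divergence is exactly the coherence problem the later paragraphs on dirty lines address.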
One may also take advantage of a cache to improve the performance of write requests. Typically, when a request is made to the memory to store a data element at a memory address, i.e., a write request, the write request is intercepted by the cache. The cache stores the data element together with the memory address. If a read request or write request for the same memory address is made later, the cache can respond without consulting the main memory. Again, the perceived response time for a read request is decreased, since the read request can be handled from the cache without consulting the main memory. Moreover, a cache decreases the number of read and write requests made on the main memory, as the cache typically combines multiple requests into a single large request, e.g., a request for an entire cache line. A modern main memory handles a request for an entire cache line in a burst more efficiently than a series of individual requests.
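The combining of several write requests into one large request can be sketched as follows; this is a simplified, hypothetical write-combining buffer, with assumed names and an assumed line size of four data elements:

```python
# Sketch of write combining: several writes falling in the same cache
# line are merged and sent to main memory as a single burst.

LINE_SIZE = 4  # data elements per line (an assumed value)

class WriteCombiningBuffer:
    def __init__(self):
        self.lines = {}   # line base address -> {offset: value}
        self.bursts = 0   # number of main-memory transactions issued

    def write(self, address, value):
        base = address - address % LINE_SIZE
        self.lines.setdefault(base, {})[address - base] = value

    def flush(self, memory):
        for base, updates in self.lines.items():
            self.bursts += 1               # one burst per cache line
            for offset, value in updates.items():
                memory[base + offset] = value
        self.lines.clear()

buf = WriteCombiningBuffer()
for addr in (8, 9, 10, 11):   # four writes, all within one line
    buf.write(addr, addr * 2)
memory = {}
buf.flush(memory)
print(buf.bursts)  # prints 1: the four writes became a single burst
```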
A cache is usually organized in a number of cache lines. Each cache line stores a number of data elements corresponding to a same number of main memory addresses. Typically, the main memory addresses are logically consecutive. Each time such a cache accesses the main memory for reading, a complete cache line is read and stored in the cache. Sometimes, this may result in the cache requesting data elements from the main memory that are not yet needed, i.e., that have not yet been requested.
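Because a line covers a run of logically consecutive addresses, an address splits into the base address of its line and an offset within that line. A small sketch of this arithmetic, assuming a typical but here arbitrary line size of 64 bytes:

```python
# Splitting an address into a cache-line base address and an offset,
# assuming 64-byte cache lines covering consecutive addresses.

LINE_SIZE = 64  # bytes per cache line (an assumed, typical value)

def line_address(address):
    # Address of the first byte of the cache line containing `address`.
    return address - (address % LINE_SIZE)

def line_offset(address):
    # Position of `address` within its cache line.
    return address % LINE_SIZE

# A read of address 0x1234 causes the whole line 0x1200..0x123F to be
# fetched, including elements that have not yet been requested.
print(hex(line_address(0x1234)))  # prints 0x1200
print(line_offset(0x1234))        # prints 52
```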
Some basic operations on cache lines can be distinguished, typical in the design of a data cache:
- fetch: transfer data from the main memory to the cache.
- invalidate: mark the cache space as free, i.e., available for use; after invalidation, it does not contain valid data.
- copy-back: transfer the data from the cache to the main memory.
- pre-fetch: transfer a cache line from the main memory to the cache before any part of the cache line is required by the processor, to avoid stalling of the processor when the data is required.
Typical status information which may be associated with a cache line includes:
- in-use: an indication that the cache line contains valid data; set after a fetch or after a write operation.
- dirty bit: an indication that the cache line is modified compared to the contents of the main memory; typically set at a write operation.
- dirty mask: a mask indicating which data elements in the cache line are modified compared to the main memory.
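The operations and status fields above can be combined into one small sketch of a cache line. The representation (a fixed number of elements per line, a tag holding the line's base address) and all names are assumptions made for illustration:

```python
# Sketch of a cache line carrying in-use, dirty-bit and dirty-mask
# status, and supporting fetch, write, copy-back and invalidate.

LINE_ELEMENTS = 4  # data elements per line (an assumed value)

class CacheLine:
    def __init__(self):
        self.in_use = False
        self.dirty = False
        self.dirty_mask = [False] * LINE_ELEMENTS
        self.tag = None                   # base main-memory address of the line
        self.data = [None] * LINE_ELEMENTS

    def fetch(self, memory, base):        # memory -> cache; sets in-use
        self.tag = base
        self.data = [memory[base + i] for i in range(LINE_ELEMENTS)]
        self.in_use = True
        self.dirty = False
        self.dirty_mask = [False] * LINE_ELEMENTS

    def write(self, offset, value):       # a write sets the dirty state
        self.data[offset] = value
        self.dirty = True
        self.dirty_mask[offset] = True

    def copy_back(self, memory):          # cache -> memory; line becomes clean
        for i, modified in enumerate(self.dirty_mask):
            if modified:                  # only modified elements are written
                memory[self.tag + i] = self.data[i]
        self.dirty = False
        self.dirty_mask = [False] * LINE_ELEMENTS

    def invalidate(self):                 # mark the line free; data no longer valid
        self.in_use = False
        self.dirty = False
```

Note how the dirty mask lets copy-back transfer only the elements that actually differ from main memory.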
A problem with using the cache for write requests is that the main memory and the cache at some point in time store different data elements for the same memory address. A cache line is called 'dirty' if it stores at least one data element whose value differs from the data element stored in the main memory at the associated memory address. A dirty cache line may be 'cleaned' by copying the content of the cache line to the main memory, thereby resolving the differences between the cache line and the corresponding content of the main memory.
Data is said 'to be cached' when a copy of the data from the main memory resides in the cache, or when data in the cache is intended for later storage in the main memory.
An algorithm that determines the transfer of the data content of a cache line to main memory is called a write policy. A write policy typically decides which cache lines to write back to memory, and when to do so.
During use, a cache may become 'full'. For example, after reading a large number of data elements from the main memory, at some point all memory of the cache will be filled with cached copies of those data elements, and no new data elements can be stored in the cache. To resolve this, at some point a cache line is reassigned to new main-memory data elements. A cache line selected in this way is called a 'victim', or 'is victimized'. If a dirty cache line is victimized, it must typically first be cleaned before the cache line can be re-used for a new memory line. If the content of the main memory corresponding to a cache line is identical to the content of the cache line, e.g., as a result of a write-back, then the data content of the cache line can be discarded without losing data. Such a cache line may be re-used immediately for caching new data. Instead of immediate re-use, a cache line may also be marked as 'free', so that it is available later.
For efficiency reasons, when a free cache line is needed, only a limited part of the cache may be searched for such a free cache line. A cache with such a limited search strategy is also called full if the search cannot find a free cache line within that limited part of the cache. Such a limited search strategy is, for example, applied in a so-called direct-mapped cache, and also in an N-way set-associative cache, such as a 2-way set-associative cache. In an N-way set-associative cache, the cache will only search through N cache lines before deciding whether a cache line is available. If no cache line is available among those N cache lines, the cache is considered full, at least for the current operation.
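This limited search can be sketched as follows: an address maps to exactly one set, and only the N lines of that set are candidates. The geometry (8 sets, 2 ways, 64-byte lines) and all names are assumptions chosen for illustration:

```python
# Sketch of the limited search in an N-way set-associative cache: only
# the N lines of one set are examined before the cache is declared full.

NUM_SETS = 8     # assumed geometry
WAYS = 2         # a 2-way set-associative cache
LINE_SIZE = 64

def set_index(address):
    # Each address maps to exactly one set of WAYS candidate lines.
    return (address // LINE_SIZE) % NUM_SETS

class SetAssociativeCache:
    def __init__(self):
        self.sets = [[] for _ in range(NUM_SETS)]  # set -> list of line tags

    def insert(self, address):
        base = address - address % LINE_SIZE
        ways = self.sets[set_index(address)]
        if base in ways:
            return "hit"
        if len(ways) < WAYS:      # a free line exists among the N candidates
            ways.append(base)
            return "allocated"
        return "full"             # full for this operation: must victimize

cache = SetAssociativeCache()
# Addresses 0, 512 and 1024 all map to set 0 with this geometry.
print(cache.insert(0))      # prints "allocated"
print(cache.insert(512))    # prints "allocated"
print(cache.insert(1024))   # prints "full" (set 0 has only 2 ways)
```

Note that the cache may be "full" for one address while other sets still contain free lines.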
An algorithm that determines when to victimize which cache lines is called a replacement policy or replacement strategy. A typical replacement policy is the 'least-recently-used policy', the LRU policy. The LRU policy selects for victimization the cache line which was used least recently. Some replacement policies aim to keep some pre-determined number of cache lines free.
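The LRU policy can be sketched with an ordered dictionary whose first entry is always the least recently used line; the class and method names are illustrative:

```python
# Minimal LRU replacement: the front of the OrderedDict is the least
# recently used entry and is victimized when the cache is full.

from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()   # least recently used entry is first

    def access(self, address, data):
        victim = None
        if address in self.entries:
            self.entries.move_to_end(address)   # becomes most recently used
        else:
            if len(self.entries) >= self.capacity:
                victim, _ = self.entries.popitem(last=False)  # victimize LRU line
            self.entries[address] = data
        return victim

cache = LRUCache(2)
cache.access("A", 1)
cache.access("B", 2)
cache.access("A", 1)           # "A" is refreshed, so "B" is now LRU
print(cache.access("C", 3))    # prints B: the least recently used victim
```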
A write policy and a replacement policy will both be referred to as a ‘cache management policy’.
The situation wherein a read or write request is made, but the corresponding data element is not in the cache, is called a 'cache miss'. If many cache misses occur, the efficiency of the memory system is reduced. The write policy and the replacement policy are important factors in the performance contribution of the cache. How well a cache management policy works depends on the access pattern to the main memory. For example, if the replacement policy victimizes a line which is thereafter requested, a cache miss occurs. Other factors in the performance contribution of the cache include: the size of the cache, the size of the cache lines, the memory access pattern of an application using the cache, etc.
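The dependence on the access pattern can be made concrete with a small, self-contained experiment: a repeated sequential sweep that is one line larger than an LRU cache victimizes each line just before it is requested again, so every access misses:

```python
# Illustrative miss counter for an LRU cache of a given capacity.

from collections import OrderedDict

def count_misses(accesses, capacity):
    cache, misses = OrderedDict(), 0
    for addr in accesses:
        if addr in cache:
            cache.move_to_end(addr)        # hit: refresh recency
        else:
            misses += 1                    # miss
            if len(cache) >= capacity:
                cache.popitem(last=False)  # victimize the LRU line
            cache[addr] = True
    return misses

sweep = [0, 1, 2, 3] * 3        # repeatedly sweep over 4 lines
print(count_misses(sweep, 4))   # prints 4: only the initial cold misses
print(count_misses(sweep, 3))   # prints 12: every single access misses
```

The same access sequence thus yields radically different miss rates depending on how the cache size relates to the working set.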
U.S. Pat. No. 6,327,643 describes a cache connected to a memory and a method of replacing a cache line in the cache with a new line from memory. The memory comprises multiple pages, each page comprising multiple banks. At any time one of the pages is 'current'. When a cache line must be replaced, it is first determined whether there exist cache lines which are not dirty. If so, the one which was least recently used is replaced. If all cache lines are dirty, it is determined whether there exists a cache line which corresponds to a part of the current page, and if so, the one which was least recently used is replaced. Finally, if all cache lines are dirty and none corresponds to a part of the current page, the oldest cache line is replaced.
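The three-step replacement order summarized above can be sketched as follows. This is only an illustration of the described priority, not the patent's implementation; the line representation and field names are assumptions:

```python
# Hedged sketch of the replacement order summarized above: prefer the
# least recently used clean line; otherwise the LRU dirty line in the
# current page; otherwise the oldest line overall.

def choose_victim(lines, current_page):
    # `lines` is ordered from least to most recently used; each line is
    # a dict with 'dirty', 'page' and 'age' fields (illustrative schema).
    clean = [l for l in lines if not l["dirty"]]
    if clean:
        return clean[0]                        # LRU line that is not dirty
    in_page = [l for l in lines if l["page"] == current_page]
    if in_page:
        return in_page[0]                      # LRU dirty line in current page
    return max(lines, key=lambda l: l["age"])  # all dirty, wrong page: oldest

lines = [
    {"dirty": True,  "page": 2, "age": 7},   # least recently used
    {"dirty": True,  "page": 1, "age": 4},   # most recently used
]
print(choose_victim(lines, current_page=1))  # the dirty line in page 1
```

Preferring a dirty line in the current page keeps the write-back within the memory page that is already open, which is what makes the scheme attractive for paged memories.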