Field of the Invention
The present invention relates to replacement policies and methods of using direct-mapped caches.
Background of the Related Art
Computers continue to get faster and more efficient to meet a heavy demand for processing many different types of tasks. Cache memory makes a limited amount of data rapidly accessible to a processor. To facilitate the access, the cache memory may be physically closer to the processor than main memory. In fact, a processor (CPU) cache (or L1 cache) may be physically located on the same chip as the processor and may be dedicated to a single processor core on a multi-core chip.
Data is transferred between main memory and cache in blocks of fixed size, called cache lines. When a cache line is copied from memory into the cache, a cache entry is created. The cache entry will include the copied data as well as the requested memory location.
When the processor needs to read or write a location in main memory, the processor first checks for a corresponding entry in the cache. The cache checks for the contents of the requested memory location in any cache lines that might contain that address. If the processor finds that the memory location is in the cache, a cache hit has occurred. However, if the processor does not find the memory location in the cache, a cache miss has occurred. In the case of a cache hit, the processor immediately reads or writes the data in the cache line. For a cache miss, the cache allocates a new entry and copies in the requested data from main memory, then the request is fulfilled from the contents of the cache. The proportion of accesses that result in a cache hit is known as the hit rate, and can be a measure of the effectiveness of the cache for a given program or algorithm.
Read misses delay processor execution because the processor must wait for the requested data to be transferred from memory, which is much slower than reading from the cache. Write misses may occur without such delay, since the processor can continue execution while data is copied to main memory in the background.
In order to make room for the new entry on a cache miss, the cache may have to evict one of the existing entries. The heuristic that is used to choose the entry to evict and the conditions of the eviction is called the replacement policy. The fundamental problem with any replacement policy is that it must predict what data will be requested in the future.
FIG. 1 is a diagram illustrating the use of a direct-mapped cache. Each main memory block can be mapped to only one location in a direct-mapped cache. This mapping limitation has the advantage of making it quick and easy to determine whether a cache hit has occurred. The disadvantage is that two or more active main memory segments or blocks may map to the same direct-mapped cache entry. Because the entry can hold only one of those blocks at a time, each reference to one block may displace another, which produces a low cache hit rate and lower performance.
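The effect of such conflicts on the hit rate can be illustrated with a minimal sketch. The modulo mapping from block number to cache line, the eight-line cache size, and the function name hit_rate are assumptions chosen for illustration and are not taken from the figures.

```python
def hit_rate(block_refs, num_lines):
    """Simulate a direct-mapped cache and return the fraction of hits.

    Assumption: a block maps to line (block number mod num_lines).
    """
    lines = [None] * num_lines          # None marks an empty line
    hits = 0
    for block in block_refs:
        slot = block % num_lines        # the only line this block may occupy
        if lines[slot] == block:
            hits += 1                   # cache hit
        else:
            lines[slot] = block         # cache miss: replace the current entry
    return hits / len(block_refs)

# Blocks 0 and 8 both map to line 0 of an 8-line cache and evict each other.
conflicting = hit_rate([0, 8] * 4, num_lines=8)

# Blocks 0 and 1 map to different lines and miss only on first use.
non_conflicting = hit_rate([0, 1] * 4, num_lines=8)
```

Under these assumptions, the alternating references to blocks 0 and 8 never hit, while the references to blocks 0 and 1 hit on six of eight accesses.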
Furthermore, a portion of the main memory address is used to directly map to a cache entry. In this example, a main memory address has four bits labeled A, B, C and D, wherein bits C and D (serving as an index) are used to determine the proper cache entry or line. The other main memory address bits A and B (serving as a tag) are stored in the cache directory so that it is known which main memory block is stored in the cache line. When a main memory block is referenced, the cache is checked to see if it holds that block. This is done by using address bits C and D to determine which cache line to check and by comparing address bits A and B with the tag stored in the directory for that line. If they match, then this is a cache hit and the memory reference can be satisfied by the cache line, which is faster than accessing main memory. Accordingly, a CPU may read from, or write to, the referenced memory block in the cache line. If the memory reference is mapped to a line that does not have a matching tag (bits A and B), then this is a cache miss and the request must be fulfilled by the slower main memory. When a cache miss occurs in a conventional direct-mapped cache, the current cache entry is replaced by the requested main memory block.
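The four-bit example may be sketched as follows. The function names split_4bit and lookup, and the use of None to mark an empty line, are hypothetical and serve only to illustrate the index and tag comparison described above.

```python
NUM_LINES = 4  # two index bits (C and D) select one of four cache lines

def split_4bit(address):
    """Return (tag, index) for a 4-bit address ABCD."""
    tag = (address >> 2) & 0b11    # bits A and B
    index = address & 0b11         # bits C and D
    return tag, index

def lookup(directory, address):
    """Return True on a cache hit; on a miss, replace the entry and return False."""
    tag, index = split_4bit(address)
    if directory[index] == tag:
        return True                # stored tag matches bits A and B
    directory[index] = tag         # conventional policy: always replace on a miss
    return False

directory = [None] * NUM_LINES     # None marks an empty line (an assumption)

# 0110 and 1010 share index 10 but carry different tags (01 and 10).
results = [lookup(directory, a) for a in (0b0110, 0b0110, 0b1010, 0b0110)]
# results == [False, True, False, False]
```

The final reference to 0110 misses because the intervening reference to 1010 replaced the entry at index 10.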
FIG. 2 is a diagram of a conventional direct-mapped cache illustrating how a direct-mapped cache is organized, and how the cache functions to distinguish between a cache hit and a cache miss. Each memory address may be considered to have three parts. The lowest address bits are called the byte offset. Since caches work with cache lines that are typically 64 or 128 bytes, the cache has no need to address anything smaller than a cache line. Therefore, the byte offset bits, which select a particular byte within a cache line, are not used in the cache itself.
The next portion of the memory address is the index. The address bits of the index are used to determine the particular cache line being addressed. The rest of the memory address is referred to as the tag. The address bits of the tag are stored in the cache directory and used to keep track of the address of the block that is stored in a cache line. A comparator is used to compare the tag of the memory address to the tag stored in the cache directory to determine whether there is a cache hit or a cache miss.
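The three-part division of a memory address may be sketched as follows. The 64-byte line size is one of the typical values mentioned above; the 256-line cache size and the names split_address and is_hit are assumptions made only for illustration.

```python
LINE_SIZE = 64      # bytes per cache line
NUM_LINES = 256     # assumed number of cache lines (a power of two)

OFFSET_BITS = LINE_SIZE.bit_length() - 1   # 6 bits of byte offset
INDEX_BITS = NUM_LINES.bit_length() - 1    # 8 bits of index

def split_address(address):
    """Return (tag, index, byte_offset) for a memory address."""
    byte_offset = address & (LINE_SIZE - 1)
    index = (address >> OFFSET_BITS) & (NUM_LINES - 1)
    tag = address >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, byte_offset

def is_hit(directory, address):
    """Model the comparator: compare the address tag with the stored tag."""
    tag, index, _ = split_address(address)
    return directory[index] == tag

parts = split_address(0x12345)   # (4, 141, 5)
```

Note that the byte offset is computed here only for completeness; the hit determination in is_hit ignores it.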
Along with the tag, each entry or line of the cache directory may have three more fields. The valid bit (VB) indicates whether the line is valid. An invalid line always produces a cache miss. The modified bit (MB) indicates whether this entry has been modified and may therefore differ from main memory. If an entry has been modified, it will need to be written back to main memory before it is replaced. Some caching algorithms do not allow modified entries, so this field is not present in all direct-mapped cache implementations. The final field is the data, which is a cache-line-sized portion of main memory. The purpose of the cache is to hold frequently used portions of main memory so that the processor can access them faster than it could from main memory.
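A directory entry with these fields, and the write-back that precedes replacement of a modified entry, may be sketched as follows. The Entry class, the dictionary used to model main memory, and the reconstruction of the block number from the tag and index are assumptions made only for illustration.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    valid: bool = False      # valid bit (VB)
    modified: bool = False   # modified bit (MB)
    tag: int = 0
    data: bytes = b""

def is_hit(entry, tag):
    """An invalid line always produces a cache miss."""
    return entry.valid and entry.tag == tag

def replace(entry, index, new_tag, memory, num_lines):
    """Replace an entry, writing a modified line back to memory first.

    Assumption: memory is a dict keyed by block number, and a block
    number is the tag followed by the index (tag * num_lines + index).
    """
    if entry.valid and entry.modified:
        memory[entry.tag * num_lines + index] = entry.data   # write-back
    entry.valid, entry.modified = True, False
    entry.tag = new_tag
    entry.data = memory[new_tag * num_lines + index]

memory = {0: b"old0", 4: b"blk4"}    # hypothetical blocks 0 and 4
entry = Entry()                       # line 0 of a 4-line cache, initially invalid
replace(entry, 0, 0, memory, 4)       # miss: load block 0
entry.data, entry.modified = b"new0", True   # processor writes to the line
replace(entry, 0, 1, memory, 4)       # block 4 evicts block 0 after write-back
```

In this sketch the modified contents of block 0 reach main memory only when block 4 displaces them, which corresponds to the write-back behavior described above.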