Microprocessors perform computational tasks in a wide variety of applications, including portable electronic devices. Maximizing processor performance can be desirable, to permit additional functions and features to be implemented in portable electronic devices and other applications.
Due to the spatial and temporal locality characteristics common in computer programs, the instructions and data being processed at any given time are statistically likely to be needed in the very near future, and may be retained in a high-speed cache memory, where they are quickly available. Memory access instructions present a portion of a (virtual or physical) address, known as a tag, to a CAM structure in the cache, where it is compared against the corresponding address bits of instructions and/or data stored in the cache's RAM (the CAM and RAM contents together are referred to herein as a cache entry). If the tag matches, or “hits,” a cache line is returned that contains the desired instructions and/or data (or, in the case of set-associative caches, a plurality of cache lines are returned, and one is selected by a different portion of the address known as the index). If the tag fails to match any CAM entry, it “misses,” and the memory access instruction proceeds to access the desired instruction or data from main memory.
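The tag comparison described above may be illustrated by a minimal software sketch. It is an illustration only and does not represent any particular hardware implementation. The names `CacheEntry`, `lookup`, and `LINE_BYTES`, the 32-byte line size, and the derivation of the tag by dividing the address by the line size are all assumptions made for the example. The sketch models a fully associative cache, in which every entry's tag is compared against the presented tag.

```python
from dataclasses import dataclass
from typing import List, Optional

LINE_BYTES = 32  # assumed cache line size, chosen only for illustration


@dataclass
class CacheEntry:
    tag: int            # address bits held in the CAM portion of the entry
    line: bytes         # instructions and/or data held in the RAM portion
    valid: bool = True


def lookup(entries: List[CacheEntry], address: int) -> Optional[bytes]:
    """Return the cache line on a hit, or None on a miss."""
    tag = address // LINE_BYTES  # assumed: tag = address bits above the line offset
    for entry in entries:        # hardware compares all entries in parallel
        if entry.valid and entry.tag == tag:
            return entry.line    # hit
    return None                  # miss: the access proceeds to main memory
```

In this sketch, a return value of `None` stands for the miss case, in which the memory access instruction would go on to access main memory.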
It is commonly desirable to maximize the overall cache hit rate, thus minimizing off-chip accesses to main memory, which can incur latency, stall the pipeline, and consume additional power. Additionally, in some applications, a few critical or often used instructions and/or data may be known, and this information may be maintained in the cache, regardless of the cache's overall performance. Some instruction set architectures provide the means to “lock” a cache entry, so that the entry is not replaced during normal cache miss processing. In some cases, the selection of cache entries to lock is explicit and precise, so as to minimize the impact of locked entries on normal cache allocation algorithms.
FIG. 1 is a functional block diagram depicting one representative means of locking cache entries. The cache 1, which may comprise an instruction cache, a data cache, or a unified cache, includes n entries, numbered from 0 to n−1. A FLOOR register 2 holds the entry number that represents the “floor” of the cache, or the lowest cache entry available for normal allocation. Cache entries below the floor are not available for replacement, and are hence “locked.” If no entries are locked, the FLOOR register contains a 0, and the cache replacement algorithm operates throughout the cache. If, as depicted in FIG. 1, the bottom three entries are locked, the processor will increment the FLOOR register to three, the first cache entry available for reallocation. The normal cache reallocation algorithm in this case operates in the portion of the cache from the “floor,” or three, to the top of the cache, n−1.
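The relationship between the FLOOR register and the locked region may be sketched in software as follows. This is a hypothetical model for illustration. The class `FloorCache` and its methods are assumed names and do not correspond to any instruction set architecture's actual locking interface.

```python
class FloorCache:
    """Models only the entry numbering of an n-entry cache with a FLOOR register."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.floor = 0  # 0 means no entries are locked

    def lock_bottom(self, count: int) -> None:
        """Lock entries 0..count-1 by raising the floor to `count`."""
        if not 0 <= count <= self.n:
            raise ValueError("floor must lie between 0 and n")
        self.floor = count

    def is_locked(self, entry: int) -> bool:
        # Entries below the floor are not available for replacement.
        return entry < self.floor

    def replaceable(self) -> range:
        # Normal allocation operates from the floor to the top entry, n-1.
        return range(self.floor, self.n)
```

For example, after `lock_bottom(3)` on an eight-entry cache, entries 0 through 2 are reported as locked and `replaceable()` covers entries 3 through 7, corresponding to the arrangement depicted in FIG. 1.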
Grouping the locked cache entries in one place simplifies the replacement algorithm. For example, if cache entries are replaced on a round-robin basis, only the “rollover” point is affected by the locked entries (i.e., when incrementing past n−1, the next entry is that pointed to by the FLOOR register 2 rather than 0). There are no non-contiguous, locked entries scattered across the cache space that must be “skipped over” by a round-robin allocation. Note that the FLOOR method of grouping and locking cache entries is representative only, and is not limiting. Cache entries may be grouped together and locked against reallocation according to a broad variety of methods.
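The effect of the floor on the round-robin rollover point may be sketched as follows. The function name `next_victim` and its parameters are assumptions made for the example, and the sketch is not the replacement logic of any specific processor.

```python
def next_victim(current: int, floor: int, n: int) -> int:
    """Return the entry that follows `current` in round-robin order.

    When incrementing past the top entry (n-1), the next entry is the one
    pointed to by the floor rather than entry 0.
    """
    nxt = current + 1
    return floor if nxt > n - 1 else nxt
```

With `n = 6` and `floor = 3`, successive calls starting from entry 3 visit entries 4, 5, 3, 4, and so on, so entries 0 through 2 are never selected. With `floor = 0` the function reduces to ordinary round-robin over the whole cache.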
Interrupts are generally generated by events outside of the processor, and may be non-deterministic in nature. Thus, interrupts may occur during the execution of code that attempts to carefully arrange locked cache entries. The interrupt-handling code may include memory access instructions that are likely to miss in the cache, causing accesses to main memory. These memory accesses will normally generate allocations in the cache. That is, instructions and data fetched to service the interrupt will replace some cache lines. If the interrupts occur after locked cache entry processing has begun, but before the locked entries are established and arranged, cache entries that were meant to be locked may be reallocated. Additionally, non-locked cache entries may be allocated in an area intended for locked entries, such as below the FLOOR register. This may result in non-contiguous locked entries, imposing a significant burden on some cache replacement algorithms, such as round-robin.
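The hazard described above may be illustrated with a small simulation. It is a sketch under stated assumptions, not a model of any actual processor. It assumes that critical lines are loaded through ordinary round-robin allocation and that the FLOOR register is raised only after all of them are in place. The names `RoundRobinCache`, `setup_locked`, and `interrupt_after`, and the placeholder line name `"isr"`, are hypothetical.

```python
from typing import List, Optional


class RoundRobinCache:
    def __init__(self, n: int) -> None:
        self.lines: List[Optional[str]] = [None] * n
        self.floor = 0   # FLOOR register; 0 means nothing is locked yet
        self.next = 0    # round-robin allocation pointer

    def allocate(self, name: str) -> int:
        """Place a line in the next round-robin entry and return its number."""
        slot = self.next
        self.lines[slot] = name
        # Roll over to the floor after the top entry.
        self.next = self.floor if slot + 1 >= len(self.lines) else slot + 1
        return slot


def setup_locked(cache: RoundRobinCache, critical: List[str],
                 interrupt_after: Optional[int] = None) -> List[Optional[str]]:
    """Load critical lines, then lock them by raising the floor.

    If `interrupt_after` is i, a simulated interrupt handler misses in the
    cache right after critical line i is loaded and allocates its own line.
    Returns the contents of the locked region.
    """
    for i, name in enumerate(critical):
        cache.allocate(name)
        if interrupt_after == i:
            cache.allocate("isr")  # assumed: the handler's miss allocates normally
    cache.floor = len(critical)
    return cache.lines[:cache.floor]
```

Without an interrupt, loading lines A, B, and C leaves all three below the floor. If the simulated interrupt arrives after A is loaded, the handler's line occupies entry 1, the locked region holds A, the handler's line, and B, and C lands at entry 3, above the floor, where it remains subject to replacement.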