This invention relates to systems, apparatuses and methods employing and implementing cache memories. More specifically, this invention relates to systems, apparatuses and methods employing and implementing set-associative cache memories having replacement policies that support locking.
Cache memories generally comprise part of a memory system; the memory system in turn typically comprises part of a computing system, such as a personal computer or a television (TV) set-top box. The computing system further comprises a processing device. In the computing system, the memory system stores information which the processing device accesses in read and write operations. When accessing information, the processing device typically requires the information to be available on an essentially immediate basis. If the information is delayed, the processing device's operation can stop (a "stall"). Stalls range, in the user's experience, from unnoticeable to quite noticeable degradations of the computing system's performance. Moreover, a stall can cause the computing system to fail, such as where the processing device's unstalled performance is the foundation of a running application. Accordingly, the computing system's proper operation is a function not only of the processing device's performance, but also of the memory system's performance.
Ideal memory systems satisfy three properties: infinite speed, infinite capacity and low cost. Memory systems should be infinitely fast so that, for example, the processing device's operation is unhindered by information availability. Memory systems should be infinitely capacious so that, for example, they can provide any and all information that the processing device may need in its operations associated with applications. Memory systems should be inexpensive, for example, so as to minimize their cost penalty respecting the computing device.
As a general principle, however, no single memory technology can satisfy, at once, each of these properties. Rather, memory technologies typically can satisfy any two of these properties, while violating the remaining property. Responding to this constraint, memory systems generally are structured as a hierarchy. Hierarchical memory systems combine technologies, generally in physically distinct levels, so as to balance among speed, capacity and expense at each level and toward achieving, overall, both acceptable performance and economy.
At the lowest level of a hierarchical memory system typically are the registers of the system's processing device. These registers are limited in number, are extremely fast and are disposed physically adjacent to the logic blocks of the processing device (e.g., the arithmetic logic unit). However, their location makes the registers expensive relative to other memory technologies.
In the hierarchy's next higher level is the cache memory. The cache memory may itself occupy levels, including a first level that is resident as part of the processing device's integrated circuit ("on-chip"), and a second level that is not on-chip but may be inside the processing device's package or otherwise closely coupled to such device.
Also in the hierarchy are other, increasingly higher levels of memory. These levels typically include (i) a main memory, generally comprising volatile memory technology (e.g., random access memory in any of its forms) and (ii) more-permanent storage (e.g., compact disk, floppy, hard, and tape drives).
The cache memory generally is implemented, relative to higher levels of memory, using fast technologies. These technologies tend to be relatively expensive on a per-bit basis. However, because the cache memory typically is small in capacity, its overall cost remains acceptable in the computing system.
The cache memory's fast operation typically is buttressed by physically-close coupling to the processing device. In keeping with its speed and coupling, the cache memory generally is implemented so as to hold the information that the processing device is deemed most likely to access in the immediate future. In that regard, the processing device will first seek access to particular information via the cache memory because, if the information (e.g., data, instructions, or both) is found in the cache memory (a cache "hit"), the information can be provided at great speed to the device. If the information is not found in the cache memory (a cache "miss"), the processing device accesses the information via one of the next, higher levels of the memory system. These next-level accesses typically engender, relative to a hit, increasingly larger delays in the information's availability (the "miss penalty") to the processing device. Moreover, these next-level accesses also typically trigger an update of the cache memory's contents, generally by duplicating or copying information from the main memory into the cache memory. Both cache misses and the attendant updating procedures tend to stall the processing device.
Accordingly, cache memory is engineered in the computing system, typically via a combination of hardware and software, not only to store duplicate copies of information, but also to continually manage the cache memory's contents, all so that cache misses and attendant updates are minimized. To do so, cache memories typically exploit the principle of locality of reference. This principle holds that certain information has a higher probability of being accessed in the immediate future than other information. This principle has two component properties: spatial locality and temporal locality. Spatial locality holds that accessing information having a specific memory address makes probable a near-term access of information in neighboring memory addresses. Accordingly, spatial locality is generally used to determine the scope of information that is brought into the cache. In an update, for example, spatial locality directs duplicating in the cache memory both accessed and neighboring information.
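As a rough illustration of how spatial locality sets the scope of an update, the following Python sketch duplicates a whole aligned block into the cache on a miss, so that a later access to a neighboring address is a hit. The 16-byte block size, the dictionary standing in for the cache, and the names `BLOCK_SIZE`, `block_base` and `access` are assumptions made for this illustration only; they are not taken from any particular cache design.

```python
BLOCK_SIZE = 16  # assumed block size in bytes; illustrative only


def block_base(address: int) -> int:
    """Return the first address of the aligned block containing `address`."""
    return address - (address % BLOCK_SIZE)


def access(cache: dict, memory: bytes, address: int) -> tuple:
    """Read one byte through the cache; return (value, hit).

    On a miss, the accessed byte and its neighbors in the same block are
    duplicated from `memory` into `cache`, keyed by the block's base address.
    """
    base = block_base(address)
    hit = base in cache
    if not hit:
        cache[base] = memory[base:base + BLOCK_SIZE]
    return cache[base][address - base], hit
```

With `memory = bytes(range(64))` and an empty `cache`, an access to address 20 misses and brings in addresses 16 through 31; a subsequent access to address 21 then hits without any further update.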
Temporal locality holds that an access to certain information makes probable a near-term re-access to that information. Temporal locality generally is used to determine a replacement policy, i.e. to determine what, if any, information is replaced in a cache update. For example, a miss generally triggers an update procedure that will replace information corresponding, in size, to the amount of information being stored in the update. One replacement policy is to replace information which, as of the update, was the least recently used: such information being deemed the least likely to be used in the near-term and, therefore, replaceable.
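The least-recently-used policy just described can be sketched, under simplifying assumptions, for a single set of a set-associative cache. In the Python sketch below, the class name `LRUSet`, the `ways` parameter and the `load` callback (which stands in for a next-level access) are hypothetical names chosen for illustration; hardware implementations typically track recency with a few status bits per set rather than an ordered container.

```python
from collections import OrderedDict


class LRUSet:
    """One set of a set-associative cache with least-recently-used replacement."""

    def __init__(self, ways: int):
        self.ways = ways            # number of lines the set can hold
        self.lines = OrderedDict()  # tag -> data, least recently used first

    def access(self, tag, load):
        """Return (data, hit); `load(tag)` models the next-level access on a miss."""
        if tag in self.lines:
            self.lines.move_to_end(tag)      # mark as most recently used
            return self.lines[tag], True
        if len(self.lines) >= self.ways:
            self.lines.popitem(last=False)   # replace the least recently used line
        self.lines[tag] = load(tag)
        return self.lines[tag], False
```

In a two-way set, accessing tags A, B, A and then C replaces B rather than A, because A was re-accessed more recently than B as of the update.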
Although it comports with the principle of locality, the Least Recently Used (LRU) replacement policy's unfettered operation may be undesirable. For example, certain information may not be recently used, but nevertheless should be retained in the cache memory because it is deemed critical to the processing device's operation and/or the proper overall function of the computing system.
One approach to retaining critical information is known as cache locking. Cache locking arrangements have been proposed, but generally have shortcomings. For example, the arrangements tend to lock the cache memory in fixed, undesirably large increments (e.g., large "granularity"). Large-granularity locking arrangements are undesirable because they tend to retain information in the cache memory that is neither critical nor satisfying of the principle of locality. Retaining that information wastes the cache memory's capacity. This waste effectively reduces the size of the cache memory, which reduction, in turn, generally degrades the performance of the cache memory (a cache memory's hit frequency tends to decrease as the cache memory size decreases).
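The capacity wasted by coarse locking increments can be estimated with simple arithmetic. The Python sketch below assumes, for illustration only, that the critical information is contiguous and aligned to the locking increment; the function names and the example sizes are hypothetical and do not describe any particular proposed arrangement.

```python
def locked_capacity(critical_bytes: int, granule_bytes: int) -> int:
    """Bytes that must be locked to cover `critical_bytes` in whole increments."""
    granules = (critical_bytes + granule_bytes - 1) // granule_bytes  # round up
    return granules * granule_bytes


def wasted_capacity(critical_bytes: int, granule_bytes: int) -> int:
    """Locked bytes that do not hold critical information."""
    return locked_capacity(critical_bytes, granule_bytes) - critical_bytes
```

Under these assumptions, retaining 5000 bytes of critical information with a 4096-byte increment locks 8192 bytes and wastes 3192 of them, whereas a 64-byte increment locks 5056 bytes and wastes only 56.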
In addition, proposed locking arrangements tend to use replacement policies that do not comport with the principle of locality. Using such replacement policies tends to degrade the performance of the cache memory. For example, such replacement policies tend to increase the probability for replacement of unlocked, active information, particularly as the amount of locked information increases. Replacement of active information tends to increase the frequency of cache misses.
Accordingly, it is desirable to provide cache locking that precludes replacement of critical, inactive information while also providing a replacement policy that comports with the principle of locality. It is also desirable to enable locking of critical information in a cache memory using an optimally-sized granularity while also enforcing the principle of locality in the replacement policy applicable to unlocked information.
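By way of a generic illustration of these two aims taken together, and not as a description of any particular implementation, the Python sketch below locks individual lines of one cache set and applies least-recently-used replacement only among the unlocked lines. The class name `LockableLRUSet`, its methods, the `load` callback, and the choice to serve a miss without caching when every line is locked are all assumptions made for this sketch.

```python
from collections import OrderedDict


class LockableLRUSet:
    """One cache set with per-line locking and LRU replacement of unlocked lines."""

    def __init__(self, ways: int):
        self.ways = ways
        self.lines = OrderedDict()  # tag -> data, least recently used first
        self.locked = set()         # tags that must not be replaced

    def lock(self, tag) -> bool:
        """Lock a resident line; return False if the line is not in the set."""
        if tag not in self.lines:
            return False
        self.locked.add(tag)
        return True

    def unlock(self, tag) -> None:
        self.locked.discard(tag)

    def access(self, tag, load):
        """Return (data, hit); `load(tag)` models the next-level access on a miss."""
        if tag in self.lines:
            self.lines.move_to_end(tag)  # mark as most recently used
            return self.lines[tag], True
        data = load(tag)
        if len(self.lines) >= self.ways:
            # Victim: the least recently used line that is not locked.
            victim = next((t for t in self.lines if t not in self.locked), None)
            if victim is None:
                return data, False       # every line locked: do not cache
            del self.lines[victim]
        self.lines[tag] = data
        return data, False
```

In a two-way set, if A is accessed and locked and B is then accessed, a miss on C replaces B even though A is the less recently used line; the locked, inactive line is retained while the unlocked lines continue to follow the least-recently-used order.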