Cache memory is used in computer systems in order to increase performance by alleviating the need for a processor to fetch data (“data” shall be used throughout to mean either computer instructions or operands upon which computer instructions operate) from main system memory sources, such as dynamic random-access memory (DRAM). DRAM and other main memory sources may require longer access times due to the paging and memory cell access speed of such memory sources, which can cause the processor to incur wait-states and degrade computer system performance.
Cache memory, on the other hand, provides the processor with a way to fetch data quickly without incurring the wait-states associated with main memory sources, such as DRAM. Using cache memory typically improves computer system performance by making commonly-used data available to the processor in a memory architecture that does not require paging cycles, that uses a relatively fast-access memory cell, and that places the cache in close proximity to the processor's local bus in order to reduce physical delay associated with bus structures.
The full performance benefits of using cache memory, however, can best be achieved by maintaining data within the cache memory that is most commonly used by the processor when executing a computer program. Therefore, data stored within a cache memory should constantly be monitored to determine when or if it should be replaced by data from a main memory source that is used more frequently.
Typically, cache memory is organized in “sets” or “ways” (hereafter collectively referred to as “ways”). A cache memory way typically comprises a number of cache memory entry locations that have a common address. A set-associative cache is a type of cache memory that organizes data in cache ways that are assigned, or “mapped,” to a particular location within a main memory source, such as DRAM. A cache memory way is re-mapped when data stored within that way is replaced by data from another location within main memory. Furthermore, cache ways may be mapped to the same main memory location in order to help maintain in cache the most current version of data associated with a particular main memory location.
Traditionally, the number of ways in a set-associative cache has been a power of two. For example, a 4-way set-associative cache memory contains four ways, which is equal to 2². Therefore, each cache memory way may be addressed by two bits. The cache memory's replacement policy may use a pseudo-LRU technique and a binary hierarchy encoding scheme, such as the one illustrated in FIG. 1.
FIG. 1 illustrates a cache memory encoding hierarchy comprising nodes 105 and leaves 110. The leaves of the hierarchy represent individual cache ways. Each node represents one bit of a vector that identifies a cache way. In particular, node values are either 0 or 1, which represent a path along the left or right branch from a node, respectively. In FIG. 1, circles, labeled with “L,” indicate bits of the encoding hierarchy, and squares, labeled with “W,” indicate cache ways.
When a way is accessed in a cache memory architecture, such as that illustrated in FIG. 1, each node on the path traversed from the top of the hierarchy to the accessed way is updated, such that its bit indicates the branch that was not followed from that node. For example, if W2 is accessed in FIG. 1, bits L0 and L2 are updated to 0 and 1, respectively. L1, in this example, is not updated and therefore retains an access “history” from a previous access. By updating nodes of the cache memory hierarchy in the manner described, a pseudo-LRU technique and binary hierarchy encoding scheme can be used to indicate which cache memory ways are least-recently used and are therefore candidates for replacement.
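The update rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation of any particular cache. Because FIG. 1 is not reproduced here, the sketch assumes that L0 is the top node, that L1 covers W0 and W1, and that L2 covers W2 and W3, a layout consistent with the W2 example in the text. The class name TreePLRU4 and its methods are hypothetical.

```python
class TreePLRU4:
    """Pseudo-LRU state for one 4-way set, held in three bits L0, L1, L2.

    Assumed layout (not confirmed by the source figure):
    L0 is the top node, L1 covers W0/W1, L2 covers W2/W3.
    A bit value of 0 points to the left branch, 1 to the right branch.
    """

    def __init__(self):
        self.bits = [0, 0, 0]  # [L0, L1, L2]

    def access(self, way):
        """Record an access: each bit on the path points away from `way`."""
        if not 0 <= way < 4:
            raise ValueError("way must be in the range 0..3")
        upper = way >> 1  # 0: left subtree (W0, W1); 1: right subtree (W2, W3)
        lower = way & 1   # 0: left leaf of the pair; 1: right leaf
        self.bits[0] = 1 - upper          # L0 points to the subtree not followed
        self.bits[1 + upper] = 1 - lower  # L1 or L2 points to the leaf not accessed

    def victim(self):
        """Follow the bits from the top node to the replacement candidate."""
        upper = self.bits[0]
        lower = self.bits[1 + upper]
        return (upper << 1) | lower
```

Under these assumptions, calling access(2) sets L0 to 0 and L2 to 1 and leaves L1 untouched, matching the W2 example. Only the bits on the accessed path are written, which is why the scheme approximates, rather than exactly tracks, least-recently-used order.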
The encoding structure illustrated in FIG. 1, however, results in an uneven distribution of way replacement in cache memories whose number of cache ways is not a power of two. FIG. 2 illustrates the encoding structure of FIG. 1 applied to such a cache memory.
The “unbalanced” hierarchy of the encoding structure illustrated in FIG. 2 may lead to an uneven distribution of cache way replacement when a pseudo-LRU cache way replacement technique is used to identify cache way replacement candidates. This is because cache ways that are not part of a pair at the lowest level, such as W2 205 and W5 210, are selected for replacement more often than those that are part of a pair. Uneven distribution of cache way replacement can result in cache ways being replaced that are not the least-recently used, thereby degrading performance of the computer system.
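The imbalance can be made concrete with a small simulation. The following Python sketch is illustrative only. FIG. 2 is not reproduced here, so the sketch assumes a six-way hierarchy in which W0/W1 and W3/W4 form pairs at the lowest level while W2 and W5 each hang alone one level higher. It also assumes a simplified workload consisting solely of cache misses, each of which replaces the current candidate and then marks it as accessed. The names Node, build_six_way, and count_replacements are hypothetical.

```python
from collections import Counter


class Node:
    """One bit of the encoding hierarchy; leaves are plain integers (way numbers)."""

    def __init__(self, left, right):
        self.bit = 0  # 0: next candidate is in the left branch; 1: right branch
        self.left = left
        self.right = right


def contains(tree, way):
    """Return True if `way` is a leaf somewhere under `tree`."""
    if isinstance(tree, int):
        return tree == way
    return contains(tree.left, way) or contains(tree.right, way)


def victim(tree):
    """Follow the bits from the top node down to the replacement candidate."""
    while not isinstance(tree, int):
        tree = tree.right if tree.bit else tree.left
    return tree


def access(tree, way):
    """Set each bit on the path to `way` so that it points to the other branch."""
    while not isinstance(tree, int):
        if contains(tree.left, way):
            tree.bit = 1
            tree = tree.left
        else:
            tree.bit = 0
            tree = tree.right


def build_six_way():
    """Assumed unbalanced layout: (W0, W1) paired with W2 alone; (W3, W4) with W5 alone."""
    return Node(Node(Node(0, 1), 2), Node(Node(3, 4), 5))


def count_replacements(misses):
    """Count how often each way is replaced over a stream consisting only of misses."""
    tree = build_six_way()
    counts = Counter()
    for _ in range(misses):
        way = victim(tree)
        counts[way] += 1
        access(tree, way)  # the newly filled way becomes most recently used
    return counts
```

Under these assumptions, count_replacements(800) replaces W2 and W5 200 times each and every paired way 100 times each, so the unpaired ways are evicted twice as often. Real access patterns that mix hits and misses would produce different ratios, but the structural bias toward the unpaired ways remains.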