Set Associative Organization and Replacement Algorithms
Data processing systems use relatively small but quickly accessible caching devices to hold frequently referenced information, reducing the time required to access that information when it is found in the caching device. Because the caching devices are small compared with system memory and can therefore hold only a subset of all referenced data, their effectiveness is determined by management policies, such as replacement, partitioning, and locking policies. Some example caching devices are the instruction caches, data caches, and TLBs (translation lookaside buffers) found in computer systems, and the texture map caches used in graphics systems.
There is a substantial body of prior art related to the implementation of replacement policies. Some of it is listed here in no particular order. The U.S. Pat. No. 4,334,289 describes using one bit to denote relative age, or reference order, between each pair of elements in a set. The U.S. Pat. No. 4,783,735 describes using content-addressable memory and relative age information for each element in a set to implement the LRU (least recently used) replacement policy. The U.S. Pat. Nos. 5,140,690, 5,325,511 and 5,845,320 describe using six bits per set to implement the LRU replacement policy for a 4-way set associative cache. The U.S. Pat. No. 5,717,916 describes a fully associative cache implementation using a pointer in each cache location that points to the next cache location, together with an LRU pointer and an MRU (most recently used) pointer. The U.S. Pat. No. 6,098,152 describes a not-MRU (not most recently used) replacement policy. The U.S. Pat. No. 6,205,519 describes managing a shared cache in multithreaded processors.
Caching devices are generally organized as set associative, as shown in FIG. 1, having M number of sets with N elements in each set. Such a device is said to use N-way set associative organization and provides N places within a set for an element to reside in the caching device. A direct mapped organization can be viewed as a degenerate case of set associative organization, in which there is only one way in each set. That is, N is 1. A fully associative organization is the most general form of set associative organization, in which there is only one set containing all elements. That is, M is 1.
When the caching device is accessed with a reference address, a portion of the address—known as the reference index—identifies one of the M sets. The index is used to read the N cache tags, as well as other cache management information, associated with the identified set. The tags are compared with another portion of the reference address—known as the reference tag—to determine whether the data associated with the reference address resides in the caching device. If the reference tag matches one of the N cache tags, a condition known as a hit, the data associated with the reference address is in the caching device. In addition, some cache management information, such as the relative reference order among the elements in the set, may be updated.
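The index and tag lookup described above can be sketched as follows. This is a minimal illustration, not an implementation from the disclosure: the sizes (M=256 sets, N=4 ways, 32-byte lines) and all names are hypothetical, chosen only to make the address split concrete.

```python
# Hypothetical geometry: M = 256 sets, N = 4 ways, 32-byte lines.
LINE_BYTES = 32
NUM_SETS = 256   # M
NUM_WAYS = 4     # N

def split_address(addr):
    """Split a reference address into tag, index, and line offset."""
    offset = addr % LINE_BYTES
    index = (addr // LINE_BYTES) % NUM_SETS   # reference index: selects one of M sets
    tag = addr // (LINE_BYTES * NUM_SETS)     # reference tag: compared against N tags
    return tag, index, offset

def lookup(cache, addr):
    """Return (hit, way); cache[index] holds the N tags of one set."""
    tag, index, _ = split_address(addr)
    for way, stored_tag in enumerate(cache[index]):
        if stored_tag == tag:
            return True, way    # hit: reference tag matched a cache tag
    return False, None          # miss: data is not in the caching device

# Install one line, then probe the cache.
cache = [[None] * NUM_WAYS for _ in range(NUM_SETS)]
tag, index, _ = split_address(0x12345)
cache[index][2] = tag           # place the element in way 2 of its set
```

A subsequent `lookup(cache, 0x12345)` reports a hit in way 2, while an address with the same index but a different tag misses.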
Empirical studies show that a set associative organization with a larger set size generally offers higher hit rates, and therefore better performance, than one with a smaller set size. For instance, a 4-way set associative organization offers higher hit rates than a direct mapped organization because it can keep up to four elements that are mapped to the same set while the direct mapped organization can keep only one of the four at any given time.
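The effect described above can be shown with a toy simulation. The trace and parameters here are hypothetical (8 sets, 16-byte lines): four addresses that map to the same set thrash a direct mapped cache, while a 4-way set retains all four.

```python
LINE = 16
SETS = 8

def simulate(ways, trace):
    """Count hits for an address trace; each set evicts its least
    recently used element when full (list order: LRU first, MRU last)."""
    sets = [[] for _ in range(SETS)]
    hits = 0
    for addr in trace:
        index = (addr // LINE) % SETS
        tag = addr // (LINE * SETS)
        s = sets[index]
        if tag in s:
            hits += 1
            s.remove(tag)      # will be re-appended as most recently used
        elif len(s) == ways:
            s.pop(0)           # evict the least recently used element
        s.append(tag)
    return hits

# Four lines that all map to set 0, each referenced twice.
conflicting = [0x000, 0x080, 0x100, 0x180] * 2
```

With `ways=1` every reference misses, because each new line evicts the previous one; with `ways=4` the second pass over the trace hits on all four lines.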
Hit rates are also subject to the replacement policy used to select which element in a set to evict to make room for a new element, since each set can hold only a finite number of elements at any given time. The random replacement policy randomly selects one way and evicts the element in it, requiring little hardware to implement. The round-robin replacement policy selects each way in a set in turn and evicts the element that happens to be in the selected way. The FIFO (first-in-first-out) policy selects the way whose element has been in the set the longest and evicts it. The LRU replacement policy selects the way whose element has not been referenced for the longest time and evicts it. Empirical studies also show that the LRU policy offers higher hit rates than the other policies. Implementing the true LRU algorithm is difficult, however, since it requires knowing the relative reference order of all elements in each set. Accordingly, what is needed is a replacement mechanism for set associative caches. The present invention addresses such a need.
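The victim-selection step of each policy above can be sketched as follows. The per-set state each function consumes (a round-robin counter, a fill-order list, a reference-order list) is assumed to be maintained elsewhere, and all names are illustrative, not from the disclosure.

```python
import random

def victim_random(ways):
    """Random policy: any way may be evicted; no per-set state needed."""
    return random.randrange(ways)

def victim_round_robin(ways, counter):
    """Round-robin policy: ways are selected in turn; the caller
    increments counter after each replacement."""
    return counter % ways

def victim_fifo(fill_order):
    """FIFO policy: fill_order lists ways from oldest fill to newest;
    evict the way whose element has been in the set the longest."""
    return fill_order[0]

def victim_lru(ref_order):
    """LRU policy: ref_order lists ways from least to most recently
    used. True LRU needs this full relative reference order for every
    set, which is what makes it costly to track in hardware."""
    return ref_order[0]
```

The contrast in required state is the point: random needs none, round-robin needs one counter per set, while FIFO and LRU each need an ordering over all N ways of every set.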