In computer architectures using mass storage devices, such as disk drives, time delays in memory access are imposed by considerations such as disk revolution times. It has been a challenge for system designers to find ways to reduce these access delays. A commonly used technique has been to provide one or more regions of high speed random access memory, called caches. Portions of the contents of the mass storage are copied into the cache as required by the processor, modified, and written back to the mass storage. Caches continue to be one of the most pervasive structures found in computer systems. They are used at every layer of the memory hierarchy. Several levels of cache are usually present in the processor. Primary memory management is traditionally based on a cache model, and files are cached in hierarchical storage management across secondary, tertiary, and networked storage.
While many variations have been developed over the years, the predominant principle in the management policies for these caches is a Least Recently Used replacement rule, typically applied uniformly across all pages of the cache. (For the purposes of the present invention, there are no limitations on the size of a page of a cache, or of the number of pages which can be in a cache.) If a memory request is directed to an address within a page of data which is not presently copied into the cache, then the page of data within the cache which was least recently used by the processor is copied back to the mass storage, and the page containing the newly requested address is copied from the mass storage into that portion of the cache. (Note that the least recently used page of data may be copied back to mass storage in advance of this new request, to improve performance by reducing access time.)
For the purpose of the discussion which follows, caches will be treated as fully associative. That is, for a given request to a given cache, the entire contents of the cache are searched for an exact match with the entire request. This is typical for database applications, in which the matching is done in software, and high speed is an important, but not critical, consideration.
By contrast, in a set associative cache arrangement, only a small region of the cache (the region is called a "set") is searched for a match with the request (such as an address match). This latter arrangement is more commonly used in processor applications, involving hardware implementation and requiring greater access speed. See also Smith, "Cache Memories," Computing Surveys, vol. 14, no. 3, September 1982, pp. 473-530, ACM 0010-4892/82/0900-0473, section 2.2, "Placement Algorithm."
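The difference between the two search arrangements can be illustrated with a minimal sketch. The function names, the list-based layout, and in particular the modulo mapping from page number to set are assumptions made for illustration only; they are not taken from the cited reference.

```python
def fully_associative_lookup(frames, page):
    """Search every frame of the cache for the requested page.

    Returns the index of the matching frame, or None if the page is absent.
    """
    for index, resident in enumerate(frames):
        if resident == page:
            return index
    return None


def set_associative_lookup(sets, page):
    """Search only the one set selected by the page number.

    Assumption: the set is chosen as page modulo the number of sets.
    Returns (set_index, way) for a match, or None if the page is absent.
    """
    set_index = page % len(sets)
    for way, resident in enumerate(sets[set_index]):
        if resident == page:
            return set_index, way
    return None


# Four frames searched as a whole, versus two sets of two frames each.
frames = [7, 12, 5, 9]
sets = [[12, 4], [5, 9]]
```

In the fully associative case the cost of a lookup grows with the size of the cache, whereas in the set associative case it is bounded by the size of one set.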
Replacement algorithms are discussed in general in Smith, "Cache Memories," at section 2.4, "Replacement Algorithm". A formal definition of Least Recently Used is given in Coffman and Denning, "Operating Systems Theory," Englewood Cliffs, N.J.: Prentice-Hall, Inc. (1973), p. 245.
The operation of an LRU cache is as follows. Consider a finite-size cache consisting of K page frames, buffers, or locations, with indexes denoted 1, 2, ..., K. If a request is made for page p, and the page p is found in position j, then the page p is moved to position 1 of the cache, the "most recently used position," and the pages formerly in positions 1, 2, ..., j-1 are pushed one position deeper into the cache. That is, the index of each of these pages is increased by 1. If the page p is not found in the cache, then it is brought from mass storage into the cache and stored in position 1. All of the pages formerly in positions 1, 2, ..., K-1 are pushed one position deeper into the cache, to positions 2, 3, ..., K, respectively, and the page at position K is removed from the cache and returned to mass storage.
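The procedure just described can be sketched in a few lines. This is an illustrative model only, assuming a Python list in which list index 0 stands for position 1; the class name `LRUCache` and the `evicted` log are hypothetical names introduced for the example.

```python
class LRUCache:
    """Fully associative LRU cache of K page frames.

    List index 0 corresponds to position 1, the most recently used position.
    """

    def __init__(self, k):
        if k < 1:
            raise ValueError("cache must have at least one page frame")
        self.k = k
        self.frames = []   # pages ordered from most to least recently used
        self.evicted = []  # hypothetical log of pages returned to mass storage

    def request(self, page):
        """Reference a page; return True if it was already in the cache."""
        if page in self.frames:
            # Found at position j: move to position 1. The pages formerly in
            # positions 1..j-1 each shift one position deeper.
            self.frames.remove(page)
            self.frames.insert(0, page)
            return True
        # Not found: if the cache is full, the page at position K is removed
        # and returned to mass storage.
        if len(self.frames) == self.k:
            self.evicted.append(self.frames.pop())
        # The new page is stored in position 1; all others shift deeper.
        self.frames.insert(0, page)
        return False
```

For example, with K = 3 and the request sequence a, b, c, a, d, the cache holds c, b, a after the first three requests; the second request for a moves it to position 1, and the request for d then removes b, the page at position K.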
An event in which the page p is requested and is found already to be in the cache is called a hit. An event in which the page p is requested and is not found in the cache is called a miss. The ratio of page requests which are hits to total page requests is called the hit ratio. Similarly, the ratio of page requests which are misses to total page requests is called the miss ratio. Since the I/O rate to the level of storage below the cache in the hierarchy is directly proportional to the miss ratio, it is desirable to make the miss ratio as small as possible, or, equivalently, to make the hit ratio as large as possible, in order to minimize the number of times that memory accesses are delayed because of the need to access mass storage.
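The hit ratio and miss ratio defined above can be measured for a given request sequence with a short simulation. The following is a sketch under stated assumptions: the function name `lru_ratios` is hypothetical, and the cache is modeled with Python's `collections.OrderedDict`, in which the last entry plays the role of the most recently used position.

```python
from collections import OrderedDict


def lru_ratios(requests, k):
    """Simulate a K-frame LRU cache and return (hit_ratio, miss_ratio)."""
    cache = OrderedDict()  # last entry = most recently used page
    hits = 0
    total = 0
    for page in requests:
        total += 1
        if page in cache:
            hits += 1
            cache.move_to_end(page)        # page becomes most recently used
        else:
            if len(cache) >= k:
                cache.popitem(last=False)  # remove the least recently used page
            cache[page] = None
    if total == 0:
        return 0.0, 0.0                    # no requests, so no ratio to report
    hit_ratio = hits / total
    return hit_ratio, 1.0 - hit_ratio


# Seven requests against a three-frame cache: two hits, five misses.
hit_ratio, miss_ratio = lru_ratios([1, 2, 3, 1, 4, 2, 1], 3)
```

Because every request is either a hit or a miss, the two ratios always sum to one, so reducing the miss ratio and increasing the hit ratio are the same objective.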