In high performance data processing systems, high speed data and/or instruction caches are often provided to minimize delays between the processor and the memory system. Whenever the processor accesses an operand, either data or instruction, that is not resident in the respective cache, the cache manager will allow the memory system to provide the operand to the processor. In parallel, the cache manager will select a location in the cache into which the operand will be loaded when provided by the memory system. Subsequent accesses to the same operand can then be serviced by the cache without resorting to the memory system. After valid operands have been loaded into all of the available locations in the cache, any subsequent operand load requires the replacement of one of the other valid operands already resident in the cache. Various algorithms have been devised to select which of the "old" operands should be replaced by the "new" operand. In general, if hardware complexity is not a limiting factor, a true Least Recently Used (LRU) replacement algorithm, in which the location which was accessed by the processor the "least recently" is selected for replacement, is most efficient and therefore preferred. For example, true LRU replacement mechanisms were implemented in the IBM System/360 Model 68 and the CDC STAR-100.
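By way of illustration only, the miss handling and true LRU selection described above may be sketched in software as follows. This is a minimal model, not a description of any of the named machines: the class name `LRUCache`, the `memory` mapping standing in for the memory system, and the fully associative organization are all assumptions made for the example.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fully associative cache with true LRU replacement (illustrative only)."""

    def __init__(self, num_locations, memory):
        self.num_locations = num_locations
        self.memory = memory        # hypothetical memory system: address -> operand
        self.lines = OrderedDict()  # address -> operand, least recently used first

    def access(self, address):
        """Return (operand, hit), loading the operand on a miss."""
        if address in self.lines:
            # Hit: this location becomes the most recently used.
            self.lines.move_to_end(address)
            return self.lines[address], True
        # Miss: the memory system provides the operand.
        operand = self.memory[address]
        if len(self.lines) >= self.num_locations:
            # All locations valid: replace the least recently used one.
            self.lines.popitem(last=False)
        self.lines[address] = operand
        return operand, False


memory = {a: a * 10 for a in range(8)}
cache = LRUCache(2, memory)
cache.access(0)
cache.access(1)
cache.access(0)   # hit; address 1 is now the least recently used
cache.access(2)   # miss; replaces address 1, not address 0
```

The ordered mapping plays the role that dedicated hardware would play in a true LRU mechanism: every access, hit or miss, reorders the record of use.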
On the other hand, if no hardware can be dedicated to this function, any of a number of replacement algorithms can be implemented in software, but only with a significant degradation of execution speed. One such scheme, referred to as "First-In-First-Out" (FIFO) replacement, maintains a pointer to each cache location in a "queue". When a given location is loaded, the pointer to that location is moved to the tail of the queue. When a location is needed to load a new operand, the location indicated by the pointer at the head of the queue is selected. Because the order of the queue reflects only when each location was loaded, and not how often or how recently it has been accessed, FIFO replacement can in some circumstances result in excessive replacement of frequently used operands.
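The queue mechanism may be sketched as follows. The sketch assumes the conventional form of FIFO, in which only a load moves a location's pointer to the tail of the queue and an access that hits leaves the queue unchanged; the class name `FIFOCache` and the `tags` list are hypothetical names chosen for the example.

```python
from collections import deque


class FIFOCache:
    """Toy cache with FIFO replacement (illustrative only)."""

    def __init__(self, num_locations):
        self.tags = [None] * num_locations        # address held by each location
        self.queue = deque(range(num_locations))  # pointers to locations, head at left

    def access(self, address):
        """Return (location, hit)."""
        if address in self.tags:
            # Hit: assumed not to reorder the queue.
            return self.tags.index(address), True
        # Miss: select the location at the head of the queue, load it,
        # and move its pointer to the tail.
        location = self.queue.popleft()
        self.tags[location] = address
        self.queue.append(location)
        return location, False


cache = FIFOCache(2)
cache.access("a")
cache.access("b")
cache.access("a")   # hit, but "a" remains the oldest load
cache.access("c")   # replaces "a" even though it was just used
```

The final access shows the weakness noted above: the operand "a" is replaced immediately after being used, simply because it was loaded first.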
A variation of FIFO, "First-In-Not-Used-First-Out" (FINUFO) replacement, associates with each location in the cache a "history" or "use" bit. Unlike FIFO, FINUFO assembles the pointers in a circular "loop" and maintains an "index" as an entry point into the loop. When a given location is used, the associated history bit is "set". Then, when a location is needed for loading a new operand, the history bit associated with the location indicated by the pointer at the index is examined. If the history bit is set, it is immediately "cleared". The index is then advanced to the next pointer in the loop, and the history bit associated with the respective location is then checked. This elimination process continues until the first location having a clear history bit is found and selected for loading. If all locations have the associated history bits set when the elimination process begins, the progressive clearing of the history bits assures that a location will be selected after the index has been advanced once around the full loop. However, a given location will rarely be selected for reuse as long as it is used at least once between successive operand loads. The FINUFO replacement algorithm was implemented, largely in software but with the history bits in hardware, in the MULTICS system for the GE 635/645 system, and in both the University of Michigan MultiProgramming Supervisor (MPS) and the original version of the CP-67 operating system for the IBM System/360 Model 67.
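The elimination process may be sketched as follows. Two points that the description leaves open are treated here as assumptions: loading a location is counted as a use, so that its history bit is set, and the index is left pointing at the selected location rather than being advanced past it. The class name `FINUFOCache` and its attribute names are likewise hypothetical.

```python
class FINUFOCache:
    """Toy cache with FINUFO replacement (illustrative only)."""

    def __init__(self, num_locations):
        self.tags = [None] * num_locations       # address held by each location
        self.history = [False] * num_locations   # one history ("use") bit per location
        self.index = 0                           # entry point into the circular loop

    def _select(self):
        """Clear set history bits until a location with a clear bit is found."""
        while self.history[self.index]:
            self.history[self.index] = False
            self.index = (self.index + 1) % len(self.tags)
        return self.index

    def access(self, address):
        """Return (location, hit)."""
        if address in self.tags:
            location = self.tags.index(address)
            self.history[location] = True   # a use sets the history bit
            return location, True
        location = self._select()
        self.tags[location] = address
        self.history[location] = True       # assumption: a load counts as a use
        return location, False


cache = FINUFOCache(2)
cache.access("a")
cache.access("b")
cache.access("a")   # hit; the history bit for location 0 stays set
cache.access("c")   # both bits set: the index goes once around, then selects location 1
```

In the final access every history bit is set when the search begins, so the index clears both bits and travels once around the loop before selecting the location at which it started.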
One disadvantage of the FINUFO algorithm is its reliance on software for loop management, which severely restricts performance. Performance is further reduced since FINUFO cannot even begin to search for a location until a load is actually pending; otherwise, the status of the history bit(s) of the most recently used locations may not be correct.