1. Field of the Invention
The present invention relates to cache memories in a computer system and more specifically to the translation lookaside buffers of the cache memories in such a computer system.
2. Art Background
In a computer system it is quite common for a central processing unit ("CPU") to have a cache memory to speed up memory access operations to main memory of the computer system. The cache memory is smaller than, but much faster than, main memory. It is placed operationally between the CPU and main memory. During the execution of a software program, the cache memory stores the most frequently utilized instructions and data. Whenever the processor needs to access information from main memory, the processor examines the cache first before accessing main memory. A cache miss occurs if the processor cannot find instructions or data in the cache memory and is required to access the slower main memory. Because most accesses are satisfied by the cache rather than by main memory, the cache memory reduces the average memory access time of the CPU. For further information on cache memories, please refer to Computer Architecture: A Quantitative Approach, by John L. Hennessy and David A. Patterson (Morgan Kaufmann Publishers, Inc., 1990).
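The effect on average access time can be illustrated with the standard relation: average access time equals hit time plus miss rate times miss penalty. The following is a minimal sketch only; the function name and the cycle counts are hypothetical figures chosen for illustration and do not come from any particular system.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average memory access time = hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * miss_penalty

# Hypothetical figures: a 1-cycle cache hit, a 5% miss rate, and a
# 20-cycle penalty for going to main memory on a miss.
with_cache = average_access_time(hit_time=1, miss_rate=0.05, miss_penalty=20)

# Without a cache, every access pays the main memory latency.
without_cache = 20

print(with_cache, without_cache)
```

Under these assumed figures the average access costs roughly 2 cycles instead of 20, even though one access in twenty still goes to main memory.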
In present day computing technology it is common to have a process executing only in main memory ("physical memory") while a programmer or user perceives a much larger memory which is allocated on an external disk ("virtual memory"). Virtual memory allows for very effective multi-programming and relieves the user of the unnecessarily tight constraint of main memory. To address the virtual memory, many processors contain a translator to translate virtual addresses in virtual memory to physical addresses in physical memory, and a translation lookaside buffer ("TLB"), which caches recently generated virtual-physical address pairs. The TLBs are essential because they allow faster access to main memory by skipping the mapping process when the translation pairs already exist. A TLB entry is like a cache entry where a tag holds portions of the virtual address and a data portion typically holds a physical page frame number, a protection field, a use bit and a dirty bit.
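The layout of a TLB entry described above can be sketched as a small record. This is an illustrative sketch only: the 12-bit page offset (4 KiB pages), the field names, and the helper split_virtual_address are assumptions, not details taken from any specific processor.

```python
from dataclasses import dataclass

PAGE_OFFSET_BITS = 12  # assumption: 4 KiB pages


@dataclass
class TLBEntry:
    tag: int              # portion of the virtual address (virtual page number)
    frame: int            # physical page frame number
    protection: int = 0   # protection field (encoding is hypothetical)
    use: bool = False     # use bit
    dirty: bool = False   # dirty bit


def split_virtual_address(vaddr):
    """Split a virtual address into (virtual page number, page offset)."""
    vpn = vaddr >> PAGE_OFFSET_BITS
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)
    return vpn, offset


vpn, offset = split_virtual_address(0x12345)
entry = TLBEntry(tag=vpn, frame=0x7A)
```

Only the virtual page number is translated; the page offset passes through unchanged and is appended to the physical page frame number.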
Referring to FIG. 1, the operation of a cache begins with the arrival of a virtual address 100 and the appropriate control signals. The virtual address 100 is passed to both the TLB 110 and cache memory 120. The TLB 110 accepts a virtual page number 101 and uses it to select a set of elements, which is then searched associatively for a match of the virtual address 100. If a match is found, the corresponding physical address 121 is passed to the comparator to determine whether the data is in the cache 120.
If the TLB 110 does not contain the virtual-physical address pair needed for translation, an address translator is invoked. The address translator typically uses the high order bits of the virtual address 100 as an entry into the segment and page tables 105 for the process, which may be in either the cache or main memory, and then returns the address pair to the TLB 110, replacing an existing TLB entry.
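The lookup and miss handling described above can be sketched in software as follows. This is a simplified model under stated assumptions: the set count, associativity, page size, and the contents of page_table are hypothetical, a single dictionary stands in for the segment and page tables, and the oldest entry of a full set is replaced merely as a placeholder policy.

```python
PAGE_OFFSET_BITS = 12   # assumption: 4 KiB pages
NUM_SETS = 4            # assumption: 4 sets
WAYS = 2                # assumption: 2 entries per set

# Each set is a list of (virtual page number, physical frame number) pairs.
tlb = [[] for _ in range(NUM_SETS)]

# Hypothetical page table for one process: virtual page -> physical frame.
page_table = {0x10: 0x7A, 0x11: 0x7B, 0x14: 0x03, 0x18: 0x55}


def translate(vaddr):
    """Return (physical address, hit flag) for a virtual address."""
    vpn = vaddr >> PAGE_OFFSET_BITS
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)
    entries = tlb[vpn % NUM_SETS]       # the page number selects a set
    for tag, frame in entries:          # associative search within the set
        if tag == vpn:
            return (frame << PAGE_OFFSET_BITS) | offset, True
    # Miss: consult the page table (raises KeyError for an unmapped page).
    frame = page_table[vpn]
    if len(entries) == WAYS:
        entries.pop(0)                  # replace an existing entry (oldest here)
    entries.append((vpn, frame))
    return (frame << PAGE_OFFSET_BITS) | offset, False


first = translate(0x10ABC)   # miss: the pair is fetched and cached
second = translate(0x10ABC)  # hit: the cached pair is reused
```

The second call returns the same physical address without consulting the page table, which is the saving the TLB provides.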
A valid bit attached to each entry in the TLB indicates whether the entry is valid, to ensure that the corresponding physical page has not been modified. Also, a use bit is set when an entry in the TLB is used to supply the mapping for virtual-physical addresses. When a new entry is stored in the TLB, the use bit is initially reset. During a TLB compare function, if the valid bit is reset, no compare takes place for that entry. If the valid bit is set, then a compare function can proceed. If the compare results in a match, the use bit is set to indicate the virtual-physical mapping is used. TLBs are typically content-addressable memories ("CAM") and thus are usually four times as large as their random access memory ("RAM") counterparts. This is due to the built-in associative logic in the transistors, which operates a "compare" function in addition to a "hold" function.
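The behavior of the valid and use bits during a compare can be sketched as follows. This is a software illustration only: the class and function names are hypothetical, and the sequential loop merely models a comparison that a CAM would perform across all entries in parallel.

```python
class Entry:
    """One TLB entry with its valid and use bits (names are illustrative)."""

    def __init__(self):
        self.valid = False
        self.use = False
        self.tag = 0
        self.frame = 0


def store(entry, tag, frame):
    """Store a new mapping; the use bit is initially reset."""
    entry.tag = tag
    entry.frame = frame
    entry.valid = True
    entry.use = False


def compare(entries, tag):
    """Return the matching frame number, or None if no valid entry matches."""
    for entry in entries:
        if not entry.valid:
            continue            # valid bit reset: no compare for this entry
        if entry.tag == tag:
            entry.use = True    # match: record that the mapping was used
            return entry.frame
    return None


entries = [Entry() for _ in range(4)]
store(entries[0], tag=0x12, frame=0x7A)
hit = compare(entries, 0x12)
miss = compare(entries, 0x00)  # invalid entries hold tag 0 but never match
```

Note that the lookup for tag 0 misses even though the unused entries hold a zero tag, because entries whose valid bit is reset take no part in the compare.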
When an entry is not found in the TLB, the TLB will be updated by replacing an entry with a newly generated virtual-physical mapping. There are two primary schemes employed for selecting an entry to replace. First, a random scheme is used to spread allocation uniformly by selecting the entry to replace at random from the candidate entries in the TLB. Some systems use a scheme for spreading data across a set of blocks in a pseudo-randomized manner to get reproducible behavior, which is particularly useful during hardware debugging. Second, a least-recently-used ("LRU") scheme can be used to reduce the chance of throwing out information that will be needed soon. The entry replaced is the one that has been unused for the longest time. As such, this scheme makes use of a corollary of the principle of temporal locality: if recently used blocks are likely to be used again, then the best candidate for disposal is the least recently used.
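The two replacement schemes can be contrasted with a minimal software sketch. The class names RandomTLB and LRUTLB are hypothetical, a fully associative buffer is assumed for simplicity, and the seeded generator stands in for the pseudo-randomized scheme that yields reproducible behavior.

```python
import random
from collections import OrderedDict


class RandomTLB:
    """Replaces a pseudo-randomly chosen entry when full."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.entries = {}
        self.rng = random.Random(seed)  # fixed seed: reproducible victims

    def insert(self, vpn, frame):
        victim = None
        if vpn not in self.entries and len(self.entries) >= self.capacity:
            victim = self.rng.choice(sorted(self.entries))
            del self.entries[victim]
        self.entries[vpn] = frame
        return victim


class LRUTLB:
    """Replaces the entry that has gone unused for the longest time."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # ordered from least to most recently used

    def lookup(self, vpn):
        if vpn in self.entries:
            self.entries.move_to_end(vpn)  # mark as most recently used
            return self.entries[vpn]
        return None

    def insert(self, vpn, frame):
        victim = None
        if vpn in self.entries:
            self.entries.move_to_end(vpn)
        elif len(self.entries) >= self.capacity:
            victim, _ = self.entries.popitem(last=False)  # evict the LRU entry
        self.entries[vpn] = frame
        return victim


lru = LRUTLB(capacity=2)
lru.insert(1, 0xA)
lru.insert(2, 0xB)
lru.lookup(1)                 # page 1 is now the most recently used
evicted = lru.insert(3, 0xC)  # page 2 is the least recently used
```

The LRU model must reorder its bookkeeping on every lookup, whereas the random model touches no state until a replacement is needed, which hints at the difference in hardware cost.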
A virtue of the random scheme is that it is simple to build in hardware. As the number of blocks to keep track of increases, LRU becomes increasingly expensive and is frequently only approximated.
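One common way to approximate LRU is to rely on a single use bit per entry rather than a full ordering. The sketch below shows such a not-recently-used style selection; it is offered only as an example of an approximation, the function name choose_victim is hypothetical, and a non-empty list of use bits is assumed.

```python
def choose_victim(use_bits):
    """Pick an index to replace, mutating use_bits in place.

    An entry whose use bit is clear has not been used recently and is
    chosen first. If every bit is set, all bits are cleared so that the
    bits start recording recent use again.
    """
    if all(use_bits):
        for i in range(len(use_bits)):
            use_bits[i] = False
    return use_bits.index(False)


bits = [True, False, True, True]
victim = choose_victim(bits)   # entry 1 has not been used recently

full = [True, True]
fallback = choose_victim(full)  # all used: bits are cleared, entry 0 chosen
```

This keeps one bit of state per entry regardless of the number of entries, at the price of not distinguishing among the entries whose use bits are clear.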
As will be described, the present invention discloses a method and apparatus for a TLB with a built-in replacement scheme without complicated decoding logic.