1. Field of the Invention
The invention relates to translation lookaside buffers for a computer system using virtual addressing, and more particularly to means for preventing damage to the translation lookaside buffer which might occur when a virtual address is found in more than one TLB entry.
2. Description of Related Art
LSI CPU chips or chip sets which use virtual addressing schemes require a page table for conversion of virtual addresses (VA) generated by the CPU to real addresses (RA) (also called physical addresses (PA)) usable by external devices (such as main memory or peripherals). The page table may be located in main memory or in separate storage, and may be made accessible to the hardware, the operating system, or both. In order to speed up the address conversion process, CPUs frequently include a translation lookaside buffer (TLB), a small cache memory that stores the several most recently used virtual addresses and their corresponding real addresses.
A general description of cache memories may be found in Strecker, "Cache Memories for PDP-11 Family Computers," in Bell, Computer Engineering (Digital Press), at 263-67. As can be seen, caches can be organized in several alternative ways. A direct mapped cache comprises a high speed data RAM and a parallel high speed tag RAM. The location address of each entry in the cache is the same as the low order portion of the main memory address to which the entry corresponds, the high order portion of the main memory address being stored in the tag RAM. Thus, if main memory is thought of as 2^m blocks of 2^n words each, the i'th word in the cache data vector will be a copy of the i'th word of one of the 2^m blocks in main memory. The identity of that block is stored in the i'th location in the tag vector. When the CPU requests data from memory, the low order portion of the address is supplied as an address to both the cache data and tag vectors. The tag for the selected cache entry is compared with the high order portion of the CPU's address and, if it matches, the data from the cache data vector is enabled onto the output bus. If the tag does not match the high order portion of the CPU's address, then the data is fetched from main memory. It is also placed in the cache for potential future use, overwriting the previous entry.
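The direct mapped lookup described above can be sketched in software as follows. This is an illustrative model only, not a description of any particular hardware; the class name `DirectMappedCache`, its parameters, and the use of a Python list as main memory are assumptions made for the example.

```python
class DirectMappedCache:
    """Toy direct mapped cache with 2**n one-word entries (hypothetical model)."""

    def __init__(self, n_index_bits, main_memory):
        self.n = n_index_bits
        self.size = 1 << n_index_bits
        self.tag = [None] * self.size    # tag vector: high order address bits
        self.data = [None] * self.size   # data vector: cached words
        self.memory = main_memory        # backing store, indexed by full address

    def read(self, address):
        """Return (word, hit) for the given main memory address."""
        index = address & (self.size - 1)   # low order portion selects the entry
        tag = address >> self.n             # high order portion identifies the block
        if self.tag[index] == tag:
            return self.data[index], True   # tag matches: hit
        word = self.memory[address]         # miss: fetch from main memory
        self.tag[index] = tag               # overwrite the previous entry
        self.data[index] = word
        return word, False
```

In this model, two addresses that share the same low order bits compete for the same entry, so reading one evicts the other.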
In the context of a TLB, the "main memory" being accessed is the page table; the "data" in the data vector is a real or physical address, and the "address" supplied by the CPU is a virtual address. Thus, for a direct mapped TLB, the low order portion of the virtual address is supplied as an address to both the TLB data vector (also called the real address (RA) or physical address (PA) vector), and the TLB tag vector. The tag in the selected TLB entry is compared with the high order portion of the virtual address from the CPU, and if it matches, the physical address in the PA vector is enabled onto the output bus for further use within the computer. If it does not match, the physical address is obtained from the full page table.
Another cache organization usable in the TLB context is called "two way set associative." In this organization, a second pair of tag and data vectors (tag and PA vectors) is placed alongside the first pair and accessed in parallel therewith. Thus, when the CPU provides a virtual address, the corresponding physical address may be found in either or both of the pairs. The determination can be made serially by a single match comparator, by comparing the high order portion of the virtual address to each of the two tags in sequence; or it can be made in parallel by two match comparators, each of which compares the high order portion of the virtual address to one of the two tags. In either case, if one of the tags matches, the corresponding physical address is enabled onto the output bus for further use within the computer. If neither matches, then the physical address is fetched from the page table. If both match, which should not occur in the normal operation of the computer, then some means is used to select one or the other and/or an error condition is signaled. The concept of set associativity can be extended to cover any number of tag/data (tag/PA) pairs, a type of organization referred to generically as "n-way set associativity."
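A two way set associative TLB lookup, including the case in which both tags match, might be modeled as below. The function name `lookup_two_way`, the status strings, and the choice to return the first matching way on a double match are assumptions for illustration; the text above leaves the selection means open.

```python
def lookup_two_way(virtual_address, ways, index_bits):
    """Look up a virtual address in a set associative TLB model.

    ways: list of (tag_vector, pa_vector) pairs, each of length 2**index_bits.
    Returns (physical_address, status), where status is "hit", "miss",
    or "multiple match".
    """
    index = virtual_address & ((1 << index_bits) - 1)  # low order portion
    tag = virtual_address >> index_bits                # high order portion
    # One comparison per way, as parallel match comparators would perform.
    matches = [w for w, (tags, _) in enumerate(ways) if tags[index] == tag]
    if not matches:
        return None, "miss"           # caller would consult the page table
    pa = ways[matches[0]][1][index]   # assumed policy: first matching way wins
    if len(matches) > 1:
        return pa, "multiple match"   # should not occur in normal operation
    return pa, "hit"
```

Because `ways` is a list, the same sketch covers the n-way case by passing more than two tag/PA pairs.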
Yet another cache organization usable in the TLB context is called "fully associative." This type of organization employs a single tag/data (tag/PA) vector pair, but the location of the tag/data (tag/PA) information in the vectors no longer has any correspondence with its location in main memory (page table). Rather, the information may be found in any location in the vector pair. No portion of the address from the CPU is used as an address to the vector pair; instead, the entire address is compared to the tags in the vector. As with n-way set associative caches, the comparison may be performed serially or in parallel (or by some combination of those two methods). If a match is found with one tag, the corresponding information in the data (PA) vector is enabled onto the output bus for further use within the system. If no match is found, the data (PA) is obtained from main memory (or the full page table). If more than one tag matches, which, again, should not ordinarily occur, then some means is used to select the data (PA) corresponding to one of the matching tags and/or an error condition is signaled.
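For comparison, a fully associative lookup can be sketched as follows. No part of the address indexes the vectors; the whole address is compared with every tag. The names `match_lines` and `lookup_fully_associative`, and the policy of returning the first matching entry, are hypothetical.

```python
def match_lines(virtual_address, tags):
    """One boolean per entry, as a bank of parallel comparators would produce."""
    return [t == virtual_address for t in tags]


def lookup_fully_associative(virtual_address, tags, pas):
    """Return (physical_address, status) for a fully associative TLB model."""
    hits = [i for i, m in enumerate(match_lines(virtual_address, tags)) if m]
    if not hits:
        return None, "miss"                  # obtain the PA from the page table
    if len(hits) > 1:
        return pas[hits[0]], "multiple match"  # assumed policy: first entry wins
    return pas[hits[0]], "hit"
```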
The invention relates specifically to cache and TLB organizations in which a given address can be found in more than one location in the cache or TLB, and more specifically to those organizations in which the match comparison is performed, at least in part, in parallel. Systems using these organizations run the risk that more than one match comparator operating in parallel will detect a match, and thereby enable more than one word of data (more than one physical address) onto the same output bus. If it happens that the different words of data (physical addresses) contain different information, excessive current flow could be created through the conflicting output transistors. This can cause, at best, loss of data, and at worst, physical damage to the chip.
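The conflict described above arises only on those bus lines where the simultaneously enabled words disagree. A minimal sketch that identifies such lines is given below; the function name `contended_bits` and the representation of each driven word as an integer are assumptions, and the model says nothing about actual currents.

```python
def contended_bits(driven_words, width):
    """Return a bit mask of bus lines on which the enabled drivers disagree.

    driven_words: the words (physical addresses) enabled onto the bus together.
    width: number of bus lines.
    """
    if not driven_words:
        return 0
    mask = (1 << width) - 1
    all_ones = mask   # bits that every driver drives high
    any_ones = 0      # bits that at least one driver drives high
    for word in driven_words:
        all_ones &= word
        any_ones |= word
    # A line is contended when some driver pulls it high and another pulls it low.
    return (any_ones & ~all_ones) & mask
```

When all enabled words are identical the mask is zero, which corresponds to the case in which a multiple match causes no conflict between output transistors.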
One solution to this problem might be to add logic between the parallel match comparator outputs and the enable inputs to ensure that only one word of data (only one physical address) is ever enabled onto the output bus at one time. This additional layer of logic adds unwanted delay, especially if it operates by a ripple effect.
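Such interposed logic could take the form of a priority chain, sketched below under the assumption that the lowest-numbered matching entry is the one allowed through. The name `ripple_select` is hypothetical. The `blocked` signal must pass through every stage in turn, which is the source of the ripple delay noted above.

```python
def ripple_select(match):
    """Convert raw match lines into enable lines with at most one asserted.

    match: list of booleans, one per comparator output.
    """
    enables = []
    blocked = False  # "an earlier entry already matched", rippling down the chain
    for m in match:
        enables.append(m and not blocked)
        blocked = blocked or m
    return enables
```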
Another solution, disclosed in U.S. Pat. No. 4,473,878 to Zolnowsky, might be to prevent the storage of conflicting data initially. This solution does not reduce delay because it merely moves the comparison step to the data storage portion of the cycle. Additionally, it does not handle the situation existing on power-up, in which the data in any memory is random.
Another solution to this problem might be merely to have the software ensure that conflicting data is never stored in the cache or TLB. This is undesirable, however, because it requires every system programmer to be aware of the risk and expend effort and time avoiding it. There is also the possibility that the software will contain errors which have not been detected prior to execution. Moreover, the software cannot control the contents of the cache on power-up, when the cache is typically filled with random data.
In U.S. Pat. No. 4,357,656 to Saltz, there is described a scheme for disabling part or all of a cache memory for the purpose of diagnostics on the cache. It comprises an ordinary direct mapped cache memory, with the addition of a cache control logic. Under microcode control, the cache control logic can be put into any of four modes: disable entire cache, disable none of the cache, disable top half, or disable bottom half. When diagnostics are to be performed, the cache control logic is first put into the appropriate mode. The cache control logic then compares each memory access address in conjunction with the mode stored therein, and forces, if appropriate, a "miss" condition regardless of the output of the match comparator.
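The forced-miss behavior of such a cache control logic can be illustrated as follows. The names `Mode` and `effective_hit` are hypothetical, and the interpretation of "top half" as the upper half of the entry index range is an assumption made for the sketch, not a detail taken from the Saltz patent.

```python
from enum import Enum


class Mode(Enum):
    DISABLE_ALL = "disable entire cache"
    DISABLE_NONE = "disable none of the cache"
    DISABLE_TOP = "disable top half"
    DISABLE_BOTTOM = "disable bottom half"


def effective_hit(mode, index, cache_size, comparator_match):
    """Return the hit signal after the control logic has had its say."""
    top_half = index >= cache_size // 2  # assumption: "top" = upper index half
    if mode is Mode.DISABLE_ALL:
        return False                     # forced miss everywhere
    if mode is Mode.DISABLE_TOP and top_half:
        return False                     # forced miss in the disabled half
    if mode is Mode.DISABLE_BOTTOM and not top_half:
        return False
    return comparator_match              # otherwise the comparator decides
```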