1. Field of the Invention
This invention relates to computer systems, and in particular, to a two-level cache memory system having separate primary instruction and data caches and a single secondary cache containing both instructions and data.
2. Description of the Prior Art
In computer systems it is desirable to obtain information for use by the central processor as rapidly as the central processor can execute its instructions. Typically, however, the central processing unit is capable of executing instructions much faster than those instructions and the necessary data may be retrieved from the main memory. As a result, cache memories have been developed for storing frequently used information.
Most current computer systems use virtual memory to provide protection, large address spaces and convenient allocation of physical memory. In virtual memory systems, addresses must be translated at some point between the processor and main memory. While most aspects of cache design are unrelated to virtual memory, whether translation occurs before or after the cache access is a crucial issue and one which is quite visible to the memory management software. If fast cache access were the only consideration, then a virtual cache would be attractive because it could be accessed without waiting for the translation. There are, however, several reasons why physical cache systems are simpler to build. Two of the most important are virtual address synonyms and the desire to build cache coherent multiprocessors. Certain architectures have eliminated the synonym problem by definition, but a constraint in many architectures is that synonyms are allowed both within a single address space and between multiple address spaces.
Virtual memory is usually implemented by page tables which translate every virtual page number into the appropriate physical page number. Because the page tables are large and because the format of a page table entry should not be tied to a particular hardware implementation, a common practice is to copy information from the page tables into a translation-lookaside buffer that is referenced by the hardware. In most respects, the TLB is just a special cache for the page tables.
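The relationship between the page tables and the TLB described above may be illustrated by the following minimal sketch. The page size, the TLB capacity, the oldest-entry replacement rule, and all names (PAGE_SIZE, TLB, translate) are assumptions made for illustration only and are not taken from any particular system.

```python
PAGE_SIZE = 4096  # assumed page size; the text does not specify one


class TLB:
    """Small translation cache in front of a page table (illustrative only)."""

    def __init__(self, page_table, capacity=4):
        self.page_table = page_table  # dict: virtual page number -> physical page number
        self.capacity = capacity      # assumed number of TLB entries
        self.entries = {}             # cached subset of the page table
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        """Translate a virtual address into a physical address."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.entries:
            self.hits += 1
        else:
            self.misses += 1
            if len(self.entries) >= self.capacity:
                # Evict the oldest inserted entry (dicts preserve insertion order).
                self.entries.pop(next(iter(self.entries)))
            # Copy the mapping from the page table into the TLB.
            self.entries[vpn] = self.page_table[vpn]
        return self.entries[vpn] * PAGE_SIZE + offset
```

In this sketch the page offset passes through unchanged, and only the virtual page number is looked up; a repeated reference to the same page is satisfied from the TLB without consulting the page table.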
A cache memory may contain instructions, data, or both, but the cache does not occupy a fixed set of addresses in the memory address space. Instead, the cache contains a duplicate of what is stored at a selected set of the main memory addresses. The entries in caches are termed lines, and each line contains an information field (instruction or data) and a tag field. A line can contain an arbitrary number of words. In operation, when the processor requires an access to memory, it checks the tag fields in the cache to determine if the requested address is present there. If the address is present in the cache (termed a hit), the access is performed on the cache rather than on the main memory. Because of the higher speed of the cache, the processor does not need to wait for the information to be retrieved from the main memory and may proceed with executing its instruction stream. On the other hand, if the information sought is not present in the cache (a miss), then the information must be retrieved from main memory.
There are several well known types of caches. The majority of computers use physical indexes and physical tags in their cache systems. In such architectures, the cache is located after a translation lookaside buffer (TLB) has translated a virtual address into a physical address. A portion of this physical address is used as an index to point to a physical address tag and the corresponding data. Computing systems employing this approach include the IBM 3090 series, the DEC VAX 11/780, and the MIPS RC3260. Unfortunately, such a physical index, physical tag cache is undesirably slow and the TLB undesirably large.
Another well known cache system is a virtual cache in which virtual indexes and virtual tags are employed. Sun Microsystems' 3/200 computer systems employ such a cache. Such caches, however, have difficulty dealing with circumstances in which the virtual addresses for two separate programs map to a single physical address. These two virtual addresses must be managed by the system to prevent their simultaneous presence in the cache.
A third well known cache system employs a virtual index and a physical tag. Such systems include the ELXSI 6400 system. In these systems the TLB operates in parallel with the data access to generate the physical address for comparison to the tag to determine if there is a hit. A significant disadvantage of these cache architectures is the space required for the TLB. Because a TLB read is required for every access, a large TLB is necessary. This makes incorporation of such a system on a single chip difficult.
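The virtual index, physical tag lookup described above may be sketched as follows. The sizes are assumptions chosen so that the index and line-offset bits together span exactly one page offset, which lets the physical page number serve as the tag; the function name vipt_lookup and the dictionary used as a TLB are hypothetical. Real hardware performs the TLB read and the cache indexing concurrently, whereas this sketch performs them sequentially.

```python
PAGE_SIZE = 4096   # assumed page size
LINE_BYTES = 16    # assumed line size
NUM_LINES = 256    # assumed; 256 lines * 16 bytes = one page


def vipt_lookup(vaddr, tlb, tags):
    """Return (hit, physical_address) for a virtually indexed, physically tagged cache.

    tlb  -- dict mapping virtual page number -> physical page number
    tags -- list of NUM_LINES physical page numbers (None if the line is empty)
    """
    vpn, page_offset = divmod(vaddr, PAGE_SIZE)
    # The index is taken from virtual address bits, so no translation is needed.
    index = (vaddr // LINE_BYTES) % NUM_LINES
    # The TLB read yields the physical page number (in parallel, in hardware).
    ppn = tlb[vpn]
    # A hit is declared when the stored physical tag matches the translation.
    hit = tags[index] == ppn
    return hit, ppn * PAGE_SIZE + page_offset
```

Because every access in this sketch reads the TLB, the sketch also reflects why such designs depend on a TLB large enough to hold the translation for each referenced page.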
A few two level cache architectures have been developed. One such system, the Silicon Graphics 4D-MP workstation, is described in "The 4D-MP Graphics Superworkstation: Computing+Graphics=40 MIPS+40 MFLOPS and 100,000 Lighted Polygons per Second," by Forest Baskett, Tom Jermoluk and Doug Solomon, 33rd IEEE Computer Society International Conference (Spring, 1988) IEEE Catalog No. 88CH2539-5, pp. 468-471. The system described there incorporates a MIPS R2000 CPU and R2010 FPU with a single instruction cache and a first and second data cache separated by a write buffer. The instruction cache is fed by a read buffer, while the data caches drive a write buffer. The first level data cache is always a subset of the second level data cache to maintain consistent data. In addition, all of the caches--both instruction and data--employ only physical addresses.
A direct-mapped cache is a cache in which a word from the main memory can be stored at only one place in the cache. Typically, for a physically indexed cache, the cache is indexed by the low order bits of the main memory address. In such caches the tag field of the selected line stores the high order bits of the address. Thus, when access to a particular main memory address is requested, the cache hardware employs the low order bits as an index to select the corresponding line in the cache, and then compares the tag for that line with the remaining address bits. If a hit occurs, it accesses the selected byte in the line. If a miss occurs, the requested address is not in the cache and the selected line is replaced with new information from the main memory.
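The direct-mapped lookup described above may be illustrated by the following minimal sketch. The line size, the number of lines, and the names (DirectMappedCache, read_byte) are assumptions made for illustration; main memory is modeled as a Python bytes object.

```python
LINE_BYTES = 16   # assumed line size in bytes
NUM_LINES = 256   # assumed number of lines in the cache


class DirectMappedCache:
    """Read-only direct-mapped cache over a bytes-like main memory."""

    def __init__(self, memory):
        self.memory = memory
        self.tags = [None] * NUM_LINES  # high order address bits per line
        self.data = [None] * NUM_LINES  # information field per line

    def read_byte(self, addr):
        """Return (byte_value, hit) for the given main memory address."""
        line_addr, offset = divmod(addr, LINE_BYTES)
        # Low order bits select the single line where this address may reside.
        tag, index = divmod(line_addr, NUM_LINES)
        hit = self.tags[index] == tag
        if not hit:
            # Miss: replace the selected line with the line from main memory.
            base = line_addr * LINE_BYTES
            self.data[index] = self.memory[base:base + LINE_BYTES]
            self.tags[index] = tag
        return self.data[index][offset], hit
```

With these assumed sizes, two addresses that differ by a multiple of 4096 bytes select the same line, so alternating between them causes a miss on every access.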
A cache is termed set-associative when a line from memory can be placed in more than one place in the cache; in other words, when the cache provides different frames where memory addresses with identical low order address bits may be placed. For example, a two-way set-associative cache allows two memory addresses with identical low order bits to be stored in the cache at the same time. In such a circumstance, when access to a particular main memory address is requested, the cache hardware uses the low order address bits to select the set of lines in the cache. The cache then compares all of the tags in the set (in this case two) with the high order address bits, and if a hit occurs, it accesses the selected byte in the corresponding information field. If a miss occurs, the requested address is not in the cache and one of the lines in the cache must be replaced, typically using a least recently used (LRU) based algorithm.
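The set-associative lookup and LRU replacement described above may be sketched as follows. The number of sets, the associativity, the line size, and the names (SetAssociativeCache, read_byte) are assumptions made for illustration; an ordered dictionary per set stands in for whatever LRU bookkeeping a hardware implementation would use.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Read-only set-associative cache with LRU replacement (illustrative only)."""

    def __init__(self, memory, num_sets=128, ways=2, line_bytes=16):
        self.memory = memory
        self.num_sets = num_sets
        self.ways = ways
        self.line_bytes = line_bytes
        # One ordered dict per set: tag -> line data, least recently used first.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def read_byte(self, addr):
        """Return (byte_value, hit) for the given main memory address."""
        line_addr, offset = divmod(addr, self.line_bytes)
        # Low order bits select the set; high order bits form the tag.
        tag, index = divmod(line_addr, self.num_sets)
        lines = self.sets[index]
        hit = tag in lines
        if hit:
            lines.move_to_end(tag)         # mark as most recently used
        else:
            if len(lines) >= self.ways:
                lines.popitem(last=False)  # evict the least recently used line
            base = line_addr * self.line_bytes
            lines[tag] = self.memory[base:base + self.line_bytes]
        return lines[tag][offset], hit
```

With two ways, two addresses having identical low order bits can be resident at the same time; a third such address displaces whichever of the two was used least recently.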
Cache memories must also employ a write policy to maintain consistent information in the cache and main memory. A write-back cache is one in which at the time information is written into the cache, the cache bytes are marked as being modified. Before replacing a line in the cache which contains modified bytes, the cache must write the modified bytes back to the main memory. In a write-through cache, when information in the cache is modified, it is immediately also written into the main memory.
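The difference between the two write policies may be illustrated by the following minimal sketch. For brevity it assumes a direct-mapped cache holding one word per line, a single modified flag per line rather than per byte, and allocation of a line on a write miss; the names (WriteCache, read, write) are hypothetical.

```python
class WriteCache:
    """One-word-per-line direct-mapped cache over a list-like main memory."""

    def __init__(self, memory, num_lines=4, write_back=True):
        self.memory = memory
        self.num_lines = num_lines
        self.write_back = write_back     # False selects write-through
        self.lines = [None] * num_lines  # each entry: [addr, value, modified]

    def _fill(self, addr):
        """Ensure addr is resident, writing back a modified victim first."""
        index = addr % self.num_lines
        line = self.lines[index]
        if line is None or line[0] != addr:
            if line is not None and line[2]:
                # Write-back: modified data reaches memory only on replacement.
                self.memory[line[0]] = line[1]
            line = [addr, self.memory[addr], False]
            self.lines[index] = line
        return line

    def read(self, addr):
        return self._fill(addr)[1]

    def write(self, addr, value):
        line = self._fill(addr)
        line[1] = value
        if self.write_back:
            line[2] = True              # mark as modified; memory is now stale
        else:
            self.memory[addr] = value   # write-through: update memory immediately
```

In write-back mode, main memory holds a stale value until the modified line is replaced; in write-through mode, main memory is updated on every write and no line is ever marked as modified.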