The present invention relates to microprocessors, and, more particularly, to providing microprocessors with high performance caches.
Microprocessors have attained widespread use throughout many industries. A goal of any microprocessor is to process information quickly. One technique used to increase the speed with which the microprocessor processes information is to provide the microprocessor with an architecture which includes a fast local memory called a cache.
A cache is used by the microprocessor to temporarily store instructions and data. A cache which stores both instructions and data is referred to as a unified cache; a cache which stores only instructions is an instruction cache, and a cache which stores only data is a data cache. Providing a microprocessor architecture with either a unified cache or separate instruction and data caches is a matter of design choice.
One microprocessor architecture that has gained widespread acceptance is the X86 architecture. This architecture, first introduced in the i386™ microprocessor, is also the basic architecture of both the i486™ microprocessor and the Pentium™ microprocessor, all available from Intel Corporation of Santa Clara, Calif. The X86 architecture provides for three distinct types of addresses: a logical (i.e., virtual) address, a linear address, and a physical address.
The logical address represents an offset from a segment base address. The segment base address is accessed via a selector. More specifically, the selector, which is stored in a segment register, is an index which points to a location in a global descriptor table (GDT). The GDT location stores the linear address corresponding to the segment base address.
The translation between logical and linear addresses depends on whether the microprocessor is in Real Mode or Protected Mode. When the microprocessor is in Real Mode, then a segmentation unit shifts the selector left four bits and adds the result to the offset to form the linear address. When the microprocessor is in Protected Mode, then the segmentation unit adds the linear base address pointed to by the selector to the offset to provide the linear address.
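The two translation modes described above can be sketched as follows. This is a minimal illustration, not an implementation of any actual segmentation unit: the table contents, the function names, and the treatment of the selector as a plain table index (as the text describes it) are assumptions made for the example.

```python
# Hypothetical descriptor table: selector index -> linear segment base address.
# The entries are invented for illustration only.
GDT = {1: 0x0010_0000, 2: 0x0040_0000}

MASK_32 = 0xFFFF_FFFF  # linear addresses are 32 bits wide


def real_mode_linear(selector: int, offset: int) -> int:
    """Real Mode: shift the 16-bit selector left four bits, then add the offset."""
    return ((selector & 0xFFFF) << 4) + (offset & 0xFFFF)


def protected_mode_linear(selector: int, offset: int, gdt=GDT) -> int:
    """Protected Mode: add the base address the selector points to and the offset."""
    base = gdt[selector]
    return (base + offset) & MASK_32
```

For instance, under these assumptions the Real Mode pair 0x1234:0x0010 yields the linear address 0x12350, while selector 1 with offset 0x2000 in Protected Mode yields 0x00102000.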
The physical address is the address which appears on the address pins of the microprocessor and is used to physically address external memory. The physical address does not necessarily correspond to the linear address. If paging is not enabled, then the 32-bit linear address corresponds to the physical address. If paging is enabled, then the linear address must be translated into the physical address. A paging unit, which is usually included as part of the microprocessor's memory management unit, performs this translation.
The paging unit uses two levels of tables to translate the linear address into a physical address. The first level table is a Page Directory and the second level table is a Page Table. The Page Directory includes a plurality of page directory entries; each entry includes the address of a Page Table and information about the Page Table. The upper 10 bits of the linear address (A22-A31) are used as an index to select a Page Directory Entry. The Page Table includes a plurality of Page Table entries; each Page Table entry includes a starting address of a page frame and statistical information about the page. Address bits A12-A21 of the linear address are used as an index to select one of the Page Table entries. The starting address of the page frame is concatenated with the lower 12 bits of the linear address to form the physical address.
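The two-level lookup described above can be sketched as follows. The dictionaries standing in for the Page Directory and Page Tables, and the function names, are hypothetical; real entries also carry status bits, which this sketch omits.

```python
def split_linear(linear: int):
    """Split a 32-bit linear address into its three fields."""
    dir_index = (linear >> 22) & 0x3FF    # A22-A31: Page Directory index
    table_index = (linear >> 12) & 0x3FF  # A12-A21: Page Table index
    offset = linear & 0xFFF               # A0-A11: offset within the page
    return dir_index, table_index, offset


def translate(linear: int, page_directory: dict, page_tables: dict) -> int:
    """Walk both table levels and return the physical address.

    page_directory maps a directory index to the address of a Page Table;
    page_tables maps that address to a dict of table index -> page frame base.
    Both structures are simplified stand-ins invented for this example.
    """
    dir_index, table_index, offset = split_linear(linear)
    table_address = page_directory[dir_index]
    frame_base = page_tables[table_address][table_index]
    # The frame base is 4 Kbyte aligned, so OR-ing in the offset concatenates them.
    return frame_base | offset
```

As a usage example, the linear address 0x00403ABC splits into directory index 1, table index 3, and offset 0xABC; if the selected Page Table entry holds the frame base 0x0009F000, the resulting physical address is 0x0009FABC.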
Because accessing two levels of tables for every memory operation substantially affects performance of the microprocessor, the memory management unit generally also includes a cache of the most recently accessed page table entries; this cache is called a translation lookaside buffer (TLB). The microprocessor uses the paging unit only when an entry is not in the TLB.
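The role of the TLB can be sketched as a small cache placed in front of the table walk. The class name, the capacity parameter, the least-recently-used replacement policy, and the `walk` callback are all assumptions made for illustration; the text does not specify how a TLB selects entries to replace.

```python
from collections import OrderedDict


class TLB:
    """Toy translation lookaside buffer keyed by linear page number."""

    def __init__(self, capacity: int, walk):
        self.capacity = capacity      # assumed to be at least 1
        self.walk = walk              # callable: linear page number -> frame base
        self.entries = OrderedDict()  # ordered from least to most recently used
        self.misses = 0

    def translate(self, linear: int) -> int:
        page, offset = linear >> 12, linear & 0xFFF
        if page in self.entries:
            # Hit: no table walk is needed; mark the entry as recently used.
            self.entries.move_to_end(page)
        else:
            # Miss: fall back to the (slow) paging unit and cache the result.
            self.misses += 1
            self.entries[page] = self.walk(page)
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict least recently used
        return self.entries[page] | offset
```

Repeated accesses to the same page then invoke `walk` only once, which is the saving the TLB is intended to provide.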
The first processor conforming to the X86 architecture which included a cache was the 486 processor, which included an 8 Kbyte unified cache. The Pentium™ processor includes separate 8 Kbyte instruction and data caches. The 486 processor cache and the Pentium™ processor caches are accessed via physical addresses; however, the functional units of these processors operate with logical addresses. Accordingly, when the functional units require access to these caches, the logical address must be converted to a linear address and then to a physical address.
In microprocessor architectures other than the X86 architecture, it is known to use virtually addressed caches to eliminate the address translation time from a cache hit. However, because input/output (I/O) devices use physical addresses, mapping is required for the I/O devices to interact with the cache. In these systems, there are generally only two levels of addressing, virtual and physical, and thus only a single translation is required for the physically addressed I/O devices to interact with the virtually addressed cache. Additionally, with a virtually addressed cache, every time a process is switched, the virtual addresses refer to different physical addresses, and thus the cache must be flushed because the virtually addressed cache entries are potentially invalid. Additionally, with a virtually addressed cache, it is possible for two different virtual addresses to correspond to the same physical address. These duplicate addresses are called aliases and could result in two locations in a virtual cache holding information from the same physical address, with the information in only one of the locations being modified.
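The alias and flush problems described above can be illustrated with a toy model. The class, the write-back behavior, and the address values are hypothetical and chosen only to make the effect visible; they do not describe any particular processor.

```python
class VirtualCache:
    """Toy write-back cache indexed by virtual address (illustrative only)."""

    def __init__(self, page_map: dict, memory: dict):
        self.page_map = page_map  # virtual address -> physical address
        self.memory = memory      # physical address -> stored value
        self.lines = {}           # virtual address -> cached value

    def read(self, vaddr: int):
        if vaddr not in self.lines:
            # Miss: translate and fetch from physical memory.
            self.lines[vaddr] = self.memory[self.page_map[vaddr]]
        return self.lines[vaddr]

    def write(self, vaddr: int, value) -> None:
        # Only the line for this virtual address is updated.
        self.lines[vaddr] = value

    def switch_process(self, new_page_map: dict) -> None:
        # The new process maps virtual addresses differently, so every
        # cached line is potentially invalid and the cache is flushed.
        self.lines.clear()
        self.page_map = new_page_map


# Two virtual addresses that alias the same physical address.
cache = VirtualCache({0x1000: 0x8000, 0x5000: 0x8000}, {0x8000: "old"})
cache.read(0x1000)
cache.read(0x5000)
cache.write(0x1000, "new")
stale = cache.read(0x5000)  # still the unmodified copy
```

After the write, the two cache locations disagree even though both refer to physical address 0x8000, which is the alias hazard the text describes.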