Computer systems often employ several different memory devices that are accessible to the system microprocessor. As such, the system microprocessor typically includes one or more memory management functions for managing the various memory devices. One memory management function that is implemented within the Pentium® Pro processor manufactured by Intel Corporation of Santa Clara, Calif., is known as paging. Paging provides a mechanism by which virtual memory addresses may be mapped into physical addresses corresponding to memory blocks, or "pages." A page of memory is set to be a fixed size, such as 4 kilobytes. Each of the pages may be stored in either a quick-access memory device, such as dynamic random access memory ("DRAM"), or on a slower-access mass storage device, such as a magnetic or optical disk.
FIG. 1 illustrates a block diagram of a prior art virtual-to-physical address translation. The virtual address 200 includes three fields that are used to translate the virtual address into a physical address within a page of memory. The directory field 202 is an index that points to an entry 211 within a page table directory 210. The page table directory entry 211 in turn points to a page table 220. Thus, there exists one page table for each entry within the page directory 210.
Once the appropriate page table 220 has been located, the table field 204 of the virtual address is used to index a particular entry 221 within the page table. This page table entry (PTE) 221 points to a page of physical memory 230. Thus, for every PTE within page table 220, there exists a page of physical memory. Using the PTE 221, the microprocessor checks to see if the page 230 is in system memory (e.g., DRAM). If not, the page is retrieved from the system disk and loaded into system memory.
Once the appropriate page of physical memory 230 has been loaded, the offset field 206 of the virtual address is used to index a particular address 231 within the page 230. Thus the physical memory address 231 is translated from the virtual address 200.
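The two-level translation described above can be sketched in C as follows. This is a minimal illustration rather than the processor's actual implementation: the 10/10/12-bit split of a 32-bit virtual address, the `PTE_PRESENT` flag, and all type and function names are assumptions made for the example. The text itself states only that a page has a fixed size such as 4 kilobytes.

```c
#include <stdint.h>
#include <stddef.h>

/* Assumed layout (not stated in the text): 32-bit virtual address with a
   10-bit directory field, a 10-bit table field, and a 12-bit offset. */
#define OFFSET_BITS 12u
#define TABLE_BITS  10u
#define ENTRIES     (1u << TABLE_BITS)
#define OFFSET_MASK ((1u << OFFSET_BITS) - 1u)
#define PTE_PRESENT 0x1u   /* hypothetical "page is in system memory" flag */

/* Page table 220: each entry (PTE 221) holds a page base plus flag bits. */
typedef struct { uint32_t entry[ENTRIES]; } page_table_t;

/* Page table directory 210: each entry (211) points to one page table. */
typedef struct { page_table_t *table[ENTRIES]; } page_directory_t;

static uint32_t dir_field(uint32_t va)    { return va >> (OFFSET_BITS + TABLE_BITS); }
static uint32_t table_field(uint32_t va)  { return (va >> OFFSET_BITS) & (ENTRIES - 1u); }
static uint32_t offset_field(uint32_t va) { return va & OFFSET_MASK; }

/* Walks directory -> page table -> page. Returns 0 and stores the physical
   address in *pa on success, or -1 if the page table is missing or the page
   is not present (the case in which the page must be loaded from disk). */
static int translate(const page_directory_t *pd, uint32_t va, uint32_t *pa)
{
    const page_table_t *pt = pd->table[dir_field(va)];
    if (pt == NULL)
        return -1;

    uint32_t pte = pt->entry[table_field(va)];
    if (!(pte & PTE_PRESENT))
        return -1;

    /* Page base from the PTE, combined with the offset field. */
    *pa = (pte & ~OFFSET_MASK) | offset_field(va);
    return 0;
}
```

Each call to `translate` touches two tables in memory before the data itself can be accessed, which is the per-access cost that motivates caching translations.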
As can be appreciated from the above description, address translation may take a large number of bus cycles, degrading system performance. Thus, prior art computer systems improve performance by caching the most recently-accessed PTEs within a translation cache, or translation lookaside buffer (TLB).
FIG. 2 illustrates a block diagram of a virtual-to-physical address translation using a TLB 360. The directory field 302 of the virtual address 300 is used to look up a tag entry 311 within the TLB 360. The tag entry 311 is then compared with the table field 304 of the virtual address 300. If the tag entry 311 and the table field 304 match, the match signal 340 is asserted, indicating that the physical address translation may be performed using the TLB 360.
The physical address entry 321 and valid bit entry 331 are both associated with the tag entry 311 of the TLB 360. So long as the valid bit entry 331 indicates that the physical address 321 is valid, and there is a tag match, then the physical address 321 is used to point to a page of physical memory 350. Once the page 350 is loaded into system memory (if required), then the offset field 306 of the virtual address 300 is used to index the physical address 351 of the data within the page 350.
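The lookup of FIG. 2 can be modeled roughly as below. The sketch follows the description literally: the directory field selects a TLB slot, the stored tag is compared with the table field, and the physical address is used only when the tag matches and the valid bit is set. The slot count, field widths, and every identifier are assumptions for illustration, not details taken from the text.

```c
#include <stdint.h>
#include <stdbool.h>

#define OFFSET_BITS 12u     /* assumed 4 KB pages */
#define TABLE_BITS  10u     /* assumed 10-bit table field */
#define TLB_ENTRIES 1024u   /* assumed: one slot per directory-field value */

typedef struct {
    uint32_t tag;        /* tag entry 311 (holds a table-field value) */
    uint32_t page_base;  /* physical address entry 321 */
    bool     valid;      /* valid bit entry 331 */
} tlb_entry_t;

typedef struct { tlb_entry_t slot[TLB_ENTRIES]; } tlb_t;

static uint32_t dir_field(uint32_t va)   { return va >> (OFFSET_BITS + TABLE_BITS); }
static uint32_t table_field(uint32_t va) { return (va >> OFFSET_BITS) & ((1u << TABLE_BITS) - 1u); }

/* Caches a translation after a successful page table walk. */
static void tlb_fill(tlb_t *tlb, uint32_t va, uint32_t page_base)
{
    tlb_entry_t *e = &tlb->slot[dir_field(va)];
    e->tag = table_field(va);
    e->page_base = page_base;
    e->valid = true;
}

/* Returns true on a hit (tag match and valid bit set) and stores the
   physical address in *pa; returns false when a full walk is needed. */
static bool tlb_lookup(const tlb_t *tlb, uint32_t va, uint32_t *pa)
{
    const tlb_entry_t *e = &tlb->slot[dir_field(va)];
    bool match = (e->tag == table_field(va));   /* match signal 340 */
    if (!match || !e->valid)
        return false;
    *pa = e->page_base | (va & ((1u << OFFSET_BITS) - 1u));
    return true;
}
```

On a miss, a caller would fall back to the full page table walk and then call `tlb_fill` so that the next access to the same page hits.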
As was mentioned herein above, each entry of the TLB 360 includes a valid bit, e.g. valid bit 331. The valid bit 331 indicates whether or not the physical address 321 still points to the correct page of system memory 350. One situation in which the TLB entry would be invalid is where a PTE (e.g., entry 221 of FIG. 1) changes due to a modification by an operating system or software routine. In such a case, the physical address 321 within the TLB would no longer point to the correct page of memory.
One way in which an operating system or software routine may invalidate the TLB entry is by executing the invalidate page (INVLPG) instruction, coupled with an argument that indicates the virtual address of the PTE that was changed. The INVLPG instruction is executed by first checking to see if a physical address stored in the TLB corresponds to the INVLPG argument. If found, the valid bit associated with the TLB entry is deasserted. Typically, the INVLPG instruction is a privileged instruction, such that only the most privileged software routines may execute this instruction.
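The effect of the invalidate page instruction on a TLB can be modeled in software as shown below. This is only a sketch of the behavior described above (search for a matching entry, then clear its valid bit); the real instruction is executed by the processor in privileged mode. The small fully associative TLB, its size, and all names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

#define TLB_SIZE   8u    /* assumed number of entries */
#define PAGE_SHIFT 12u   /* assumed 4 KB pages */

typedef struct {
    uint32_t vpn;        /* virtual page number the entry translates */
    uint32_t page_base;  /* cached physical page base */
    bool     valid;      /* valid bit */
} tlb_entry_t;

typedef struct { tlb_entry_t e[TLB_SIZE]; } tlb_t;

/* Installs a translation in the first invalid slot, or slot 0 if full. */
static void tlb_insert(tlb_t *t, uint32_t va, uint32_t page_base)
{
    size_t victim = 0;
    for (size_t i = 0; i < TLB_SIZE; i++) {
        if (!t->e[i].valid) { victim = i; break; }
    }
    t->e[victim].vpn = va >> PAGE_SHIFT;
    t->e[victim].page_base = page_base;
    t->e[victim].valid = true;
}

/* Returns true and stores the page base if a valid entry covers va. */
static bool tlb_lookup(const tlb_t *t, uint32_t va, uint32_t *page_base)
{
    for (size_t i = 0; i < TLB_SIZE; i++) {
        if (t->e[i].valid && t->e[i].vpn == (va >> PAGE_SHIFT)) {
            *page_base = t->e[i].page_base;
            return true;
        }
    }
    return false;
}

/* Model of the invalidate page operation: if an entry for the page that
   contains va is cached, its valid bit is cleared. Returns true if an
   entry was found and invalidated. */
static bool tlb_invalidate_page(tlb_t *t, uint32_t va)
{
    bool found = false;
    for (size_t i = 0; i < TLB_SIZE; i++) {
        if (t->e[i].valid && t->e[i].vpn == (va >> PAGE_SHIFT)) {
            t->e[i].valid = false;
            found = true;
        }
    }
    return found;
}
```

After `tlb_invalidate_page` returns, a subsequent `tlb_lookup` for the same page misses, which forces a fresh walk of the (now modified) page tables.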
For computer systems including more than one microprocessor, called "multiprocessor" systems, each microprocessor may include its own TLB. All of the microprocessors, however, may share the same physical memory. As such, the TLBs located within each of the microprocessors must be coherent with each other.
One prior art method of maintaining coherency among several caches is referred to as "snooping." Snooping is typically used to maintain coherency for data caches. Each microprocessor monitors the memory transactions performed by all of the other microprocessors, that is, it "snoops" on the other data caches to see if the memory transaction affected its cache data. While snooping is commonly used to maintain coherency in data caches, it is typically not employed for maintaining TLB coherency.
A common method of maintaining coherency among the TLBs is by performing a TLB "shootdown" operation whenever a page table entry is changed. The shootdown operation ensures that changes to a page table entry get propagated to the other microprocessors' TLBs.
One prior art way of performing a TLB shootdown operation starts with halting all microprocessors in the multiprocessor system. This maintains architectural consistency between all of the microprocessors during the shootdown operation. Once the microprocessors have been halted, a first microprocessor invalidates its own TLB by executing the INVLPG instruction. The first microprocessor then sends an interrupt to the other microprocessors. Upon receiving the interrupt, the other microprocessors invalidate their TLB entries using the INVLPG instruction. The first microprocessor waits for all of the microprocessors to complete the TLB invalidation before bringing them out of the halt state, such that they may continue executing programming instructions.
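The ordering of the shootdown steps just described can be illustrated with a single-threaded simulation. This is a sketch of the sequence only, under simplifying assumptions: each simulated processor caches a single translation, the interrupt is modeled as a flag, and the processor count and all identifiers are hypothetical. It does not model real inter-processor signaling or concurrency.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_CPUS 4   /* assumed processor count */

typedef struct {
    bool     halted;       /* processor is in the halt state */
    bool     ipi_pending;  /* interrupt received from the initiator */
    bool     entry_valid;  /* valid bit of the one modeled TLB entry */
    uint32_t entry_vpn;    /* virtual page number cached in that entry */
} cpu_t;

/* Model of the invalidate page operation on one processor's TLB. */
static void invalidate_page(cpu_t *c, uint32_t vpn)
{
    if (c->entry_valid && c->entry_vpn == vpn)
        c->entry_valid = false;
}

/* Runs the shootdown sequence for the page numbered vpn. Returns the
   number of other processors that completed their invalidation. */
static int tlb_shootdown(cpu_t cpus[NUM_CPUS], int initiator, uint32_t vpn)
{
    int acks = 0;

    /* Step 1: halt all processors. */
    for (int i = 0; i < NUM_CPUS; i++)
        cpus[i].halted = true;

    /* Step 2: the first processor invalidates its own TLB entry. */
    invalidate_page(&cpus[initiator], vpn);

    /* Step 3: it sends an interrupt to every other processor. */
    for (int i = 0; i < NUM_CPUS; i++)
        if (i != initiator)
            cpus[i].ipi_pending = true;

    /* Step 4: each other processor services the interrupt by
       invalidating its entry, then acknowledges. */
    for (int i = 0; i < NUM_CPUS; i++) {
        if (i != initiator && cpus[i].ipi_pending) {
            invalidate_page(&cpus[i], vpn);
            cpus[i].ipi_pending = false;
            acks++;
        }
    }

    /* Step 5: only after every acknowledgement arrives are the
       processors released from the halt state. */
    if (acks == NUM_CPUS - 1)
        for (int i = 0; i < NUM_CPUS; i++)
            cpus[i].halted = false;

    return acks;
}
```

Every processor remains halted from step 1 until the last acknowledgement in step 5, so the total stall grows with the time each processor needs to receive and service its interrupt.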
This prior art method of performing a TLB shootdown operation is time consuming, causing the microprocessors to halt operation for a relatively long time. For example, the software interrupt instruction ("INT"), accompanied by an interrupt vector ("n"), is often used to communicate the shootdown to the other microprocessors. The INT instruction operates as a far call instruction. Upon receiving an interrupt instruction, the microprocessor uses the interrupt vector "n" to access a descriptor in an interrupt descriptor table (IDT). The descriptor is then used to access an interrupt gate. The interrupt gate then points to an interrupt handler routine that must be loaded into memory and executed by the microprocessor. The use of descriptors, gates, and interrupt handlers is time consuming, and therefore degrades performance of the multiprocessor system.
It is therefore desirable to provide for a TLB shootdown operation that reduces an amount of time required to invalidate multiple TLBs. It is further desirable to provide a method of performing a TLB shootdown operation that maintains the consistency of an architectural state of the multiprocessor system while performing the shootdown operation in a reduced amount of time. Moreover, it is desirable to provide a method of performing a TLB shootdown operation without invoking interrupt handler routines.