The development of computer systems has allowed software applications to grow in complexity and size, which in turn has placed higher demands on the performance of the computer systems. Consequently, techniques have been developed to expand the performance capacity of computer systems. For example, a computer system may include more than one central processing unit (“CPU”) to increase the computer system's processing power. FIG. 1 illustrates a single computer system 100. The computer system 100 includes a main memory 102, and one or more CPUs such as the representative CPU A 104A and CPU Z 104Z. The main memory 102 contains code and data necessary for the operation of the one or more CPUs in the computer system 100. The main memory 102 is addressable only by physical addresses.
The increasing demands placed on a computer system also have increased the amount of addressable memory available on a computer system. Increasing the amount of addressable memory enables a computer system to operate on more information and more complex software programs. One technique for increasing the addressable memory of a computer system is to provide a virtual memory system. A virtual memory system simulates an addressable memory that can be much larger than the main memory that a computer system actually has. A CPU produces virtual addresses that are translated, via hardware and software, to physical addresses, which can be used to access the main memory via a process of memory mapping.
The translation from a virtual address to a physical address is usually implemented by a memory management unit (“MMU”). As shown in FIG. 1, each CPU in the computer system 100 interfaces with the main memory 102 via a MMU. For example, the representative CPU A 104A communicates with the main memory 102 via a MMU A 106A. The MMU A 106A is capable of translating a virtual address issued by the CPU A 104A into a corresponding physical address, which is then used to access the original requested memory location in the main memory 102.
A MMU may translate a virtual address into a physical address by referencing data structures, such as translation tables, stored in main memory. FIG. 2 is a block diagram illustrating an exemplary conventional implementation of translating a virtual address 202 into its corresponding physical address by a MMU. FIG. 2 is discussed with reference to the computer system 100 illustrated in FIG. 1. Upon receiving the virtual address 202, a MMU in the computer system 100, such as the MMU A 106A, first references a page directory 204 in the main memory 102. A page directory 204 contains multiple entries, each of which maps to a page table. For example, an entry 208 in the page directory 204 maps to a page table 206. A page table contains the physical address information required to translate a virtual address to its corresponding physical address. Often, available physical memory is divided into a plurality of pages. Each entry of a page table is mapped with an individual physical page. For example, the page table entry (“PTE”) 210 in the page table 206 points to a physical page 212 in the main memory 102.
Consequently, to translate a virtual address 202 into its corresponding physical address, a MMU needs to read the page directory 204 at least once and the page table 206 at least once. In some implementations, more layers of page directories and page tables may be provided. As a result, translating a virtual address into its corresponding physical address may consume a significant amount of system clock time as well as bus bandwidth and, therefore, may cause undesirable delay in the operation of a CPU.
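The cost described above can be illustrated with a minimal sketch of a two-level table walk. The 4 KiB page size, the 10-bit index widths, and the names `Memory`, `split`, and `walk` are assumptions chosen for illustration only; they are not taken from FIG. 2 or from any particular architecture. The sketch counts the reads of translation tables that a single translation requires.

```python
PAGE_SHIFT = 12                      # assumed 4 KiB pages
INDEX_BITS = 10                      # assumed 10-bit directory and table indexes
INDEX_MASK = (1 << INDEX_BITS) - 1
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


class Memory:
    """Toy main memory that counts reads of the translation tables."""

    def __init__(self):
        self.page_directory = {}     # directory index -> page table (a dict)
        self.reads = 0

    def read_directory(self, index):
        self.reads += 1
        return self.page_directory.get(index)

    def read_table(self, table, index):
        self.reads += 1
        return table.get(index)


def split(vaddr):
    """Split a virtual address into directory index, table index, and offset."""
    dir_index = (vaddr >> (PAGE_SHIFT + INDEX_BITS)) & INDEX_MASK
    table_index = (vaddr >> PAGE_SHIFT) & INDEX_MASK
    offset = vaddr & OFFSET_MASK
    return dir_index, table_index, offset


def walk(memory, vaddr):
    """Translate vaddr by reading the page directory, then the page table."""
    dir_index, table_index, offset = split(vaddr)
    table = memory.read_directory(dir_index)        # first memory read
    if table is None:
        return None
    frame = memory.read_table(table, table_index)   # second memory read
    if frame is None:
        return None
    return (frame << PAGE_SHIFT) | offset


memory = Memory()
memory.page_directory[1] = {2: 0x345}    # directory entry 1 -> table; entry 2 -> frame 0x345
vaddr = (1 << 22) | (2 << 12) | 0x678
paddr = walk(memory, vaddr)
print(hex(paddr), memory.reads)
```

In this sketch every successful translation costs two table reads, and each additional layer of tables would add one more.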
To reduce this delay, a MMU may use cache-like memories such as translation buffers (“TB”) to speed up the process of translating a virtual address into a physical address. For example, as shown in FIG. 1, each MMU in the computer system 100 contains a TB. For instance, the MMU A 106A includes a TB A 108A.
A TB is a cache that keeps track of recently used mappings between virtual addresses and physical addresses. As shown in FIG. 3, a MMU can translate a virtual address 202 into its corresponding physical address 302 by first looking into a TB 304. If the virtual address 202 has been used recently, then the TB 304 probably holds information concerning the mapping between the virtual address 202 and its corresponding physical address 302. If the TB 304 does hold the needed information, the TB 304 can quickly provide the physical address 302 for the virtual address 202, thus eliminating the need for the MMU to spend several clock cycles accessing the page directory 204 and the page table 206 in the main memory 102. This occurrence is usually referred to as a “TB hit.” On the other hand, when the TB 304 cannot provide a physical address 302 for the virtual address 202, a “TB miss” occurs. The MMU then needs to access the page directory 204 and the page table 206 in the main memory 102, as illustrated in FIG. 2, for the purpose of updating the TB 304. Such a process is usually referred to as a “TB fetch.”
Referring back to FIG. 2, the PTE 210 also contains a V bit 214 and an A bit 216. The V bit (“Valid bit”) 214 indicates whether the mapping in the PTE 210 is valid. The A bit (“Access bit”) 216 indicates whether the PTE 210 has been accessed by, for example, a MMU in the computer system 100. At times, the mapping between a virtual address 202 and its corresponding physical page address 212 may change. For example, the PTE 210 may no longer be mapped to the physical page address 212. In such a case, the Valid bit 214 is cleared, signaling that the mapping between the virtual address 202 and the physical page address 212 is no longer valid.
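The TB hit, TB miss, and TB fetch behavior, together with the V bit and the A bit, can be sketched as follows. This is a simplified model under stated assumptions: the class names `PTE` and `MMU` are hypothetical, the page directory and page table are flattened into a single dictionary keyed by virtual page number, and setting the A bit during a TB fetch is an assumed policy rather than something the description above specifies.

```python
class PTE:
    """Toy page table entry with a frame number, a V bit, and an A bit."""

    def __init__(self, frame):
        self.frame = frame
        self.valid = True       # V bit
        self.accessed = False   # A bit


class MMU:
    """Toy MMU whose TB caches virtual page number -> frame number."""

    def __init__(self, page_table):
        self.page_table = page_table   # stand-in for directory plus page table
        self.tb = {}
        self.hits = 0
        self.misses = 0

    def translate(self, vpn):
        if vpn in self.tb:             # TB hit: no table access needed
            self.hits += 1
            return self.tb[vpn]
        self.misses += 1               # TB miss: consult the tables
        pte = self.page_table.get(vpn)
        if pte is None or not pte.valid:
            return None                # a real system would raise a fault here
        pte.accessed = True            # assumed: the fetch sets the A bit
        self.tb[vpn] = pte.frame       # TB fetch: cache the mapping
        return pte.frame


page_table = {7: PTE(0x42), 8: PTE(0x43)}
page_table[8].valid = False            # mapping for page 8 is not valid

mmu = MMU(page_table)
first = mmu.translate(7)               # miss, then fetch
second = mmu.translate(7)              # hit
invalid = mmu.translate(8)             # miss, V bit clear, nothing cached
print(first, second, invalid, mmu.hits, mmu.misses)
```

Note that the sketch never caches an entry whose V bit is clear, so only valid mappings can be served from the TB.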
At times, the access right on the physical page address 212 is changed, for example, from a read-and-write permission to a read-only permission; and/or the virtual-physical address mapping changes, i.e., the virtual address 202 may be mapped to another physical page address in the main memory 102. In these two situations, the Valid bit 214 of the PTE 210 remains set.
Conventionally, when the PTE 210 becomes invalid, which occurs when, for example, its Valid bit 214 is cleared or its page permission and/or virtual-physical mapping changes, the operating system in the computer system 100 instructs the CPU(s) to invalidate any TB entry that caches the virtual-physical address mapping in the PTE 210, so as to avoid page permission mismatches, stale virtual-physical address mappings, or the like. Such invalidation is usually referred to as a “TB invalidate” or “TB flush.” A TB flush is time-consuming and expensive, especially when multiple CPUs and, hence, multiple TBs exist in a computer system.
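A minimal sketch of why the conventional flush is needed, and why its cost grows with the number of CPUs, is given below. The names `PTE`, `CPU`, `tb_fetch`, `tb_invalidate`, and `tb_flush` are hypothetical, and the model assumes that each TB caches the write permission alongside the frame number. It shows a permission change that leaves the Valid bit set yet leaves a stale entry in every TB until a flush reaches each CPU.

```python
class PTE:
    """Toy page table entry: frame number, V bit, and a write permission."""

    def __init__(self, frame, writable):
        self.frame = frame
        self.valid = True
        self.writable = writable


class CPU:
    """Toy CPU whose TB caches vpn -> (frame, writable)."""

    def __init__(self, name):
        self.name = name
        self.tb = {}

    def tb_fetch(self, vpn, pte):
        self.tb[vpn] = (pte.frame, pte.writable)

    def tb_invalidate(self, vpn):
        self.tb.pop(vpn, None)


def tb_flush(cpus, vpn):
    """Ask every CPU to drop its cached entry; return the number of requests."""
    requests = 0
    for cpu in cpus:
        cpu.tb_invalidate(vpn)
        requests += 1
    return requests


page_table = {7: PTE(frame=0x42, writable=True)}
cpus = [CPU("A"), CPU("Z")]
for cpu in cpus:
    cpu.tb_fetch(7, page_table[7])     # both TBs cache the read-and-write mapping

page_table[7].writable = False         # permission becomes read-only; V bit stays set
stale = [cpu.tb[7][1] for cpu in cpus] # cached permission in each TB before the flush

requests = tb_flush(cpus, 7)           # one invalidation request per CPU
print(stale, requests)
```

In this model the number of invalidation requests equals the number of CPUs, which is the overhead that motivates issuing fewer flushes.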
Therefore, there exists a need for reducing the number of TB flushes that an operating system needs to issue in order to purge invalid virtual-physical address mapping information from TBs in a computer system.