Instructions and data used by a computing device are stored at physical addresses in one or more primary or secondary memory devices. Primary memory devices, such as system memory, graphics memory and the like, are characterized by quick access times but store a limited amount of data. Secondary memory devices, such as magnetic disk drives, optical disk drives and the like, can store large amounts of data, but have relatively longer access times as compared to the primary memory devices.
Generally, instructions and data are stored in pages in the one or more secondary memory devices. As pages are needed by a given application, they can be moved into one or more primary memory devices. Pages that are no longer needed by the application can be moved from the primary memory device back to the secondary memory device to make room for other pages that are needed by a given application. When pages are moved from secondary to primary memory or moved from primary memory back to secondary memory, their physical addresses change. However, it is undesirable and inefficient for applications running on a computing device to keep track of these changing physical addresses.
Accordingly, the applications utilize virtual addressing to access instructions and data. Virtual addressing provides a separation between the physical memory and the virtual addresses that an application utilizes to load or store data and instructions. Processes running inside a virtual memory space do not have to move data between physical memory devices, and do not have to allocate or reallocate portions of the fixed amount of system level memory between them. Instead, a memory management unit (MMU) and/or the operating system (OS) keeps track of the physical location of each piece of data, and moves data between physical locations to improve performance and/or ensure reliability.
Referring to FIG. 1, an exemplary address translation data structure utilized to translate virtual addresses 110 to physical addresses 120 is illustrated. The address translation data structure may include a page table data structure 130 and a translation lookaside buffer (TLB) 140. The page table data structure 130 may include a page directory 150 and one or more page tables 160-190. The page directory 150 includes a plurality of page directory entries (PDEs). Each PDE includes the address of a corresponding page table 160-190. Each PDE may also include one or more parameters. Each page table 160-190 includes one or more page table entries (PTEs). Each PTE includes a corresponding physical address of data and/or instructions in primary or secondary memory. Each PTE may also include one or more parameters.
Upon receiving a virtual address, the TLB 140 is accessed to determine if a mapping between the virtual address 110 and the physical address 120 has been cached. If a valid mapping has been cached (e.g., TLB hit), the physical address 120 is output from the TLB 140. If a valid mapping is not cached in the TLB 140, the page table data structure 130 is walked to translate the virtual address 110 to a physical address 120. More specifically, the virtual address 110 may include a page directory index, a page table index, and a byte index. The page directory index in the virtual address 110 is used to index the page directory 150 to obtain the address of an appropriate page table 170. The page table index in the virtual address 110 is used to index the appropriate page table specified in the given PDE to obtain the physical address 120 of the page containing the data. The byte index in the virtual address 110 is then used to index the physical page to access the actual data. The resulting mapping is then typically cached in the TLB 140 for use in translating subsequent memory access requests. Furthermore, as a page moves from secondary memory to primary memory or from primary memory back to secondary memory, the corresponding PTE in the page table data structure 130 and TLB 140 is updated.
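By way of illustration only, the translation sequence described above may be sketched in Python as follows. The sketch assumes a 32-bit virtual address split into a 10-bit page directory index, a 10-bit page table index and a 12-bit byte index. That split, the use of dictionaries for the page directory, page tables and TLB, and all function and variable names are assumptions made for the example and are not taken from FIG. 1.

```python
# Illustrative only: the 10/10/12 bit split and all names are assumptions.
PAGE_SHIFT = 12                      # 4 KiB pages, so a 12-bit byte index
INDEX_BITS = 10                      # 10-bit directory and table indexes
INDEX_MASK = (1 << INDEX_BITS) - 1
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


def split_virtual_address(vaddr):
    """Return (page directory index, page table index, byte index)."""
    pd_index = (vaddr >> (PAGE_SHIFT + INDEX_BITS)) & INDEX_MASK
    pt_index = (vaddr >> PAGE_SHIFT) & INDEX_MASK
    byte_index = vaddr & OFFSET_MASK
    return pd_index, pt_index, byte_index


def translate(vaddr, page_directory, tlb):
    """Translate vaddr, consulting the TLB before walking the page tables."""
    vpn = vaddr >> PAGE_SHIFT              # virtual page number, the TLB key
    byte_index = vaddr & OFFSET_MASK
    if vpn in tlb:                         # TLB hit: no walk is needed
        return tlb[vpn] | byte_index
    pd_index, pt_index, _ = split_virtual_address(vaddr)
    page_table = page_directory[pd_index]  # the PDE selects a page table
    frame_base = page_table[pt_index]      # the PTE holds the page address
    tlb[vpn] = frame_base                  # cache the mapping for later use
    return frame_base | byte_index


# Directory entry 1 points to a page table whose entry 3 maps to the
# physical page at 0x00ABC000 (values chosen arbitrarily for the example).
page_directory = {1: {3: 0x00ABC000}}
tlb = {}
vaddr = (1 << 22) | (3 << 12) | 0x2A
paddr = translate(vaddr, page_directory, tlb)
```

In this sketch a missing directory or table entry raises a KeyError, which stands in for the fault that would be raised when no valid mapping exists. A second call to translate for an address on the same page is served from the tlb dictionary without walking the page tables.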
Generally, the PTE can also store additional attributes associated with memory accesses. An exemplary page table 140 that stores a plurality of PTEs is shown in FIG. 2. Each PTE in the page table 140 includes a page frame address 120 and one or more attributes 220. The attributes 220 may include a dirty bit, an accessed bit, a page check disable bit, a page write transparent bit, a user accessible bit, a writeable bit, a present bit, a hash function identification bit, a valid bit, an address compare bit, a referenced bit, a changed bit, storage control bits, a no execute bit, page protection bits and/or the like. The attributes 220 can be used by the MMU and/or OS to manage the data in the primary and secondary memories and access thereto.
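As a minimal sketch of how a page frame address and attribute bits might share a single PTE word, the following Python example packs a few of the attributes named above into the low bits of a 32-bit entry. The bit positions, the 4 KiB frame alignment, and the helper names make_pte, pte_frame and record_write are hypothetical choices for illustration and do not reflect the layout of FIG. 2.

```python
# Hypothetical bit positions, chosen only for this example.
PTE_PRESENT = 1 << 0     # page is resident in primary memory
PTE_WRITEABLE = 1 << 1   # writes are permitted
PTE_USER = 1 << 2        # user-mode accesses are permitted
PTE_ACCESSED = 1 << 5    # page has been read or written
PTE_DIRTY = 1 << 6       # page has been written

FRAME_MASK = 0xFFFFF000  # upper 20 bits: page frame address
ATTR_MASK = 0x00000FFF   # lower 12 bits: attribute bits


def make_pte(frame_address, attrs):
    """Pack a 4 KiB-aligned frame address and attribute bits into one word."""
    return (frame_address & FRAME_MASK) | (attrs & ATTR_MASK)


def pte_frame(pte):
    """Extract the page frame address from a PTE."""
    return pte & FRAME_MASK


def record_write(pte):
    """Check the attributes for a write and return the updated PTE."""
    if not pte & PTE_PRESENT:
        raise LookupError("page not present")     # stands in for a page fault
    if not pte & PTE_WRITEABLE:
        raise PermissionError("page is read-only")
    return pte | PTE_ACCESSED | PTE_DIRTY


pte = make_pte(0x00ABC000, PTE_PRESENT | PTE_WRITEABLE)
pte_after_write = record_write(pte)
```

In this sketch the dirty bit set by record_write is what would later tell an OS that the page must be written back before its frame is reused, which is one way the attributes can be used to manage data in the primary and secondary memories.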
Referring now to FIG. 3, an exemplary memory subsystem according to the conventional art is shown. The memory subsystem includes a memory management unit 305 communicatively coupled to a computing device readable medium (e.g., primary memory), such as random access memory (RAM) 310. The memory 310 is adapted to store at least a portion of one or more address translation data structures 315, and data and instructions 320. In one implementation, the address translation data structure includes a page directory and one or more page tables 325. A given page table 325 may include a plurality (X) of PTEs.
The memory management unit 305 includes a paging module 330 and a cache 335. The paging module 330 is adapted to manage caching of page table entries 325′ and translation of virtual addresses to physical addresses. In particular, the paging module 330 caches one or more address translation mappings to service memory access requests. Each mapping includes a previously utilized page table entry and is stored as part of a translation lookaside buffer (TLB) 325′. Because the cache 335 in the memory management unit 305 is relatively small, the paging module 330 swaps mappings in and out of the cache 335 in accordance with any conventional replacement algorithm. Accordingly, there are various tradeoffs between the size of the cache 335, latency resulting from having to swap page table entries between the cache 335 and memory 310, and communication traffic generated between the cache 335 and the memory 310.
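The tradeoff between cache size and swap traffic can be illustrated with a small Python sketch. The text above does not specify a replacement algorithm, so the sketch assumes a least-recently-used (LRU) policy as one conventional choice. The class name SmallTlb, its capacity parameter, and the hit and miss counters are hypothetical and are introduced only for the example.

```python
from collections import OrderedDict


class SmallTlb:
    """A fixed-capacity mapping cache with LRU replacement (an assumed policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()   # virtual page number -> cached PTE
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn, page_table):
        if vpn in self.entries:
            self.entries.move_to_end(vpn)     # mark as most recently used
            self.hits += 1
            return self.entries[vpn]
        self.misses += 1
        pte = page_table[vpn]                 # fetch from memory: the slow path
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used
        self.entries[vpn] = pte
        return pte


# Four pages mapped in memory, but room for only two mappings in the cache.
page_table = {vpn: 0x1000 * (vpn + 16) for vpn in range(4)}
tlb = SmallTlb(capacity=2)
results = [tlb.lookup(vpn, page_table) for vpn in (0, 1, 0, 2, 1)]
```

With a capacity of two, the access sequence 0, 1, 0, 2, 1 produces one hit and four misses in this sketch, and each miss corresponds to a fetch from memory. Raising the capacity to three would turn the final access into a hit, at the cost of a larger cache.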