Technical Field
Embodiments described herein relate to processors and, more particularly, to managing a unified shared virtual address space between multiple processors.
Description of the Related Art
Modern central processing units (CPUs) have a memory management unit (MMU) that translates addresses using a page-based address translation table (i.e., a page table) held in memory, which typically has multiple levels and supports variable page sizes. The CPU maintains a cache of recent address translations (i.e., a translation lookaside buffer or “TLB”), which is used for instruction and data references. The MMU enables a process running on the CPU to have a view of memory that is linear and contiguous (i.e., the “virtual address space”) while the actual memory locations can be sparsely scattered in real memory (i.e., the “physical address space”).
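As an illustration of the translation described above, the following is a minimal sketch and not an actual MMU implementation. It assumes a 4 KiB page size and a single-level page table held in a dictionary. The names `page_table`, `tlb`, and `translate` are hypothetical. It shows contiguous virtual pages mapping to scattered physical frames, with the TLB filled on a miss.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages (illustrative choice)

# Hypothetical page table: virtual page number -> physical frame number.
# Contiguous virtual pages 0, 1, 2 map to scattered physical frames.
page_table = {0: 7, 1: 2, 2: 9}

# Hypothetical TLB: a cache of recently used translations.
tlb = {}


def translate(vaddr):
    """Return the physical address for vaddr, filling the TLB on a miss.

    Raises KeyError if the virtual page has no mapping; fault handling
    is deliberately left out of this sketch.
    """
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in tlb:
        # TLB miss: consult the page table (simplified single-level lookup).
        tlb[vpn] = page_table[vpn]
    return tlb[vpn] * PAGE_SIZE + offset
```

For example, `translate(4096 + 5)` falls in virtual page 1, which this sketch maps to physical frame 2, so the call returns `2 * 4096 + 5`.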
When a process running on the CPU references a virtual address that is not found in the TLB, the process stalls while the CPU looks for a valid translation in the page table. This is called a “table walk” and is usually done by hardware, though some architectures perform the table walk in software. If the referenced virtual address does not have a valid translation in the page table (e.g., the Present (P) bit is not set in the lowest-level page table entry on an x86 CPU), an exception is raised, which activates a software handler that has the option to correct the problem and retry the faulting instruction.
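The table walk and fault-and-retry sequence described above can be sketched as follows. This is a simplified model under stated assumptions: a two-level table with 9 index bits per level and a 12-bit page offset. Every name here (`PageFault`, `table_walk`, `access`, `demand_map`) and the frame number 42 are hypothetical. Real hardware formats differ.

```python
PAGE_SHIFT = 12                      # assumed 4 KiB pages
INDEX_BITS = 9                       # assumed 9 index bits per level
INDEX_MASK = (1 << INDEX_BITS) - 1
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


class PageFault(Exception):
    """Raised when the walk finds no valid (present) translation."""

    def __init__(self, vaddr):
        super().__init__(f"page fault at {vaddr:#x}")
        self.vaddr = vaddr


def split(vaddr):
    """Split a virtual address into (top index, leaf index, offset)."""
    top = (vaddr >> (PAGE_SHIFT + INDEX_BITS)) & INDEX_MASK
    leaf = (vaddr >> PAGE_SHIFT) & INDEX_MASK
    return top, leaf, vaddr & OFFSET_MASK


def table_walk(root, vaddr):
    """Walk a two-level table; raise PageFault if no present entry exists."""
    top, leaf, offset = split(vaddr)
    leaf_table = root.get(top)
    if leaf_table is None:
        raise PageFault(vaddr)
    entry = leaf_table.get(leaf)
    if entry is None or not entry["present"]:
        raise PageFault(vaddr)       # present bit clear in lowest-level entry
    return (entry["frame"] << PAGE_SHIFT) | offset


def access(root, vaddr, fault_handler):
    """Translate vaddr; on a fault, run the handler and retry once."""
    try:
        return table_walk(root, vaddr)
    except PageFault as fault:
        fault_handler(root, fault.vaddr)   # handler may install a mapping
        return table_walk(root, vaddr)     # retry the faulting access


def demand_map(root, vaddr):
    """Hypothetical handler: map the faulting page to frame 42."""
    top, leaf, _ = split(vaddr)
    root.setdefault(top, {})[leaf] = {"present": True, "frame": 42}
```

Calling `access({}, 0x1234, demand_map)` faults on the first walk, lets the handler install a mapping, and succeeds on the retry. If the handler cannot correct the problem, the second walk raises `PageFault` again and the error propagates to the caller.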
The input/output MMU (IOMMU) performs virtual address translation for direct memory access (DMA) by peripheral devices. When a translation is not found in the IOMMU TLB, the IOMMU performs a table walk. If a page fault is detected, the faulting request is aborted. The IOMMU has no mechanism to activate a software handler which could correct the problem, nor is there a mechanism to signal a peripheral to retry a faulting request.
In systems with multiple CPUs executing multiple operating system (OS) instances, typically each CPU manages its own view of the system's virtual address space. This results in redundant software and hardware being utilized in each of the multiple CPUs, taking up software resources and valuable space that could otherwise be utilized to perform other functions.
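The IOMMU fault behavior described above, in which a faulting DMA request is simply aborted, can be sketched as follows. This is an illustrative model only. The names `DmaResult` and `iommu_translate`, the dictionary-based tables, and the 4 KiB page size are all assumptions. Unlike the CPU path, it has no handler to invoke and no retry.

```python
from dataclasses import dataclass
from typing import Optional

PAGE_SIZE = 4096  # assumed 4 KiB pages


@dataclass
class DmaResult:
    """Outcome of a DMA translation request in this sketch."""
    ok: bool
    paddr: Optional[int] = None


def iommu_translate(io_page_table, iotlb, dma_addr):
    """Translate a device address; abort the request on a page fault."""
    vpn, offset = divmod(dma_addr, PAGE_SIZE)
    frame = iotlb.get(vpn)
    if frame is None:
        # IOMMU TLB miss: perform a (simplified) table walk.
        frame = io_page_table.get(vpn)
        if frame is None:
            # Page fault: the request is aborted. No software handler
            # runs and the peripheral is not told to retry.
            return DmaResult(ok=False)
        iotlb[vpn] = frame
    return DmaResult(ok=True, paddr=frame * PAGE_SIZE + offset)
```

With `io_page_table = {3: 11}`, a request for address `3 * 4096 + 8` succeeds, while a request for any address in an unmapped page returns `DmaResult(ok=False)`.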