The present application relates generally to an improved data processing apparatus and method and more specifically to mechanisms for providing a recursive logical partition real memory map.
Contemporary technology enables economical fabrication of computer systems with generous complements of resources, including multiple processors, large primary fast memory, very large secondary storage, and many I/O devices. The concepts of virtualization and logical partitioning have been developed to efficiently use these systems for workloads of widely varying capacity and schedule demands. The total system resources are divided among a number of logical partitions, sometimes called virtual machines because each can operate autonomously as if it were a separate hardware system of smaller capacity. The number of partitions and the amount of resources assigned to each may vary widely and may be changed dynamically to match the needs of different independent workloads and to accommodate their time-varying demands. Usually, processors may be dynamically switched between partitions at millisecond intervals. However, primary memory and secondary storage may require longer intervals for reallocation between partitions, yet these longer intervals are still quite adequate to respond to daily and time-zone scheduling variations. Each partition usually runs a full software stack, such as an operating system, middleware, and related applications, as would run on an independent system.
The system component that manages logical partitioning is a combination of hardware and software referred to as a “hypervisor.” The hypervisor creates logical partitions, assigns resources to them, enforces resource separation and authorized sharing between the logical partitions, and dynamically alters resource assignments to the logical partitions in response to demands of the independent partition workloads and overall system performance goals. This core resource allocation function is necessarily the most privileged function in the overall system and is therefore part of the Reference Monitor component in systems that implement the Multi-Level Security models established by government and industry standards.
The historical approach to separation of real memory into partitions is called full virtualization of address translation and partition memory. The hypervisor is given exclusive control of the virtual address translation features of the system hardware by running all logical partition (LPAR) software, including the OS, in a non-privileged state. Each LPAR OS is given an allocation of real memory which it may treat as a single block of apparent real addresses beginning at zero. The hypervisor keeps a real memory map that records which blocks of real memory are actually allocated to each LPAR. The LPAR OS controls the assignment of real memory pages to virtual addresses and stores these assignments in its page table, just as it would do if running on its own real hardware instead of in an LPAR (virtual machine) provided by the hypervisor.
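By way of illustration only, the real memory map described above may be sketched as a per-LPAR list of actual real block addresses indexed by apparent block number. The names `RealMemoryMap`, `allocate`, `to_actual`, and the 256 MiB `BLOCK_SIZE` are assumptions made for this sketch and are not taken from any particular hypervisor.

```python
# Illustrative sketch only: a hypervisor-side real memory map.
# BLOCK_SIZE and all names here are hypothetical.
BLOCK_SIZE = 0x1000_0000  # assumed 256 MiB allocation granule


class RealMemoryMap:
    """Records which actual real memory blocks belong to each LPAR."""

    def __init__(self):
        # lpar_id -> list of actual block base addresses; list position n
        # backs apparent addresses [n * BLOCK_SIZE, (n + 1) * BLOCK_SIZE).
        self._blocks = {}

    def allocate(self, lpar_id, actual_base):
        """Assign one actual real block to an LPAR, refusing double allocation."""
        if actual_base % BLOCK_SIZE != 0:
            raise ValueError("block base must be aligned to BLOCK_SIZE")
        for owned in self._blocks.values():
            if actual_base in owned:
                raise ValueError("block is already allocated to an LPAR")
        self._blocks.setdefault(lpar_id, []).append(actual_base)

    def to_actual(self, lpar_id, apparent_addr):
        """Translate an LPAR's apparent real address to an actual real address."""
        blocks = self._blocks.get(lpar_id, [])
        index, offset = divmod(apparent_addr, BLOCK_SIZE)
        if apparent_addr < 0 or index >= len(blocks):
            raise MemoryError("apparent address is outside the LPAR allocation")
        return blocks[index] + offset
```

In this sketch each LPAR sees addresses starting at zero even though its blocks may be scattered through actual real memory, and an address beyond the allocation is rejected rather than translated.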
However, the OS cannot install its page table for the hardware to use because the privileged operation to do this causes an interrupt to the hypervisor. When this occurs, the hypervisor remembers the address of the OS's page table and instead installs a hypervisor page table for the hardware. When a page fault occurs, the hypervisor receives the interrupt and looks in the OS's page table for a translation of the faulting virtual address. If one is found, the hypervisor uses its real memory map to translate the apparent real address from the OS's page table to an actual real address in a block of real memory allocated to the LPAR, stores this virtual-to-real translation in the hypervisor's page table used by the hardware, and resumes the page-faulting operation. If no translation is found in the OS's page table, the page fault interrupt is passed to the OS. After the OS assigns a real page to the virtual address in its page table, the above process is repeated to resolve the fault. If the OS needs to disable virtual address translation and directly address its apparent real memory, for example to receive an interrupt in some architectures, the hypervisor prevents this and instead installs another page table that translates LPAR apparent real addresses directly to actual real addresses allocated to the LPAR, thereby simulating real addressing mode.
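The page fault flow under full virtualization may be sketched as follows. This is a simplified model under stated assumptions, not a definitive implementation: page tables are plain dictionaries keyed by page number, the real memory map is reduced to a page-granular dictionary, and the names `FullVirtHypervisor`, `on_install_page_table`, and `on_page_fault` are hypothetical.

```python
# Illustrative sketch only: page fault handling under full virtualization.
# All names and the dictionary-based tables are hypothetical simplifications.
PAGE_SIZE = 4096  # assumed page size


class FullVirtHypervisor:
    def __init__(self, apparent_to_actual_page):
        # Stand-in for the real memory map: apparent page -> actual page.
        self.apparent_to_actual_page = apparent_to_actual_page
        self.os_page_table = None  # remembered reference to the OS's table
        self.hw_page_table = {}    # the table the hardware actually uses

    def on_install_page_table(self, os_page_table):
        """Trapped privileged operation: remember the OS table, install our own."""
        self.os_page_table = os_page_table
        self.hw_page_table = {}

    def on_page_fault(self, vaddr):
        """Resolve the fault from the OS table, or pass the fault to the OS."""
        vpage = vaddr // PAGE_SIZE
        apparent = self.os_page_table.get(vpage)
        if apparent is None:
            return "reflect_to_os"  # the OS must assign a page first
        actual = self.apparent_to_actual_page.get(apparent)
        if actual is None:
            raise PermissionError("OS named a page outside its allocation")
        self.hw_page_table[vpage] = actual  # virtual -> actual real
        return "resume"
```

For example, with `hv = FullVirtHypervisor({0: 100, 1: 205})` and an empty OS table installed, a fault at `0x5000` is first reflected to the OS. Once the OS records virtual page 5 as apparent page 1, the repeated fault is resolved by storing `5 -> 205` in the hardware table, which shows the two-fault cost the text describes.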
Paravirtualization is an alternative to full virtualization that was developed to avoid some of the latter's overheads, such as simulation of privileged operations and real addressing mode, passing interrupts, maintaining multiple page tables, and sometimes needing multiple page faults to resolve one virtual translation. With paravirtualization, the OS runs in a mostly-privileged state and receives page fault interrupts directly from the hardware. A hardware register is provided to hold the actual real address of the one real memory block allocated to the LPAR for apparent real address zero. This register is used when virtual address translation is disabled because the OS receives an interrupt, which avoids the need to simulate real addressing mode. The hypervisor runs in the most-privileged state and retains exclusive control of the page table used by the hardware, as with full virtualization. To resolve a page fault, the LPAR OS assigns an apparent real page to the faulting virtual address, just as it would for full virtualization or when running on its own real hardware. However, instead of storing this translation in its own page table, which the hardware cannot use, the LPAR OS calls the hypervisor, passing the virtual-to-real translation as a parameter. The hypervisor, using its real memory map, translates the apparent real address to an actual real address and stores the virtual-to-real translation in the hypervisor's page table used by the hardware, as with full virtualization. This avoids maintaining multiple page tables and taking multiple page faults to resolve one translation.
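The paravirtualized flow may be sketched in the same simplified style. The hypervisor call `hcall_map_page`, the guest routine `guest_handle_page_fault`, and the `real_mode_limit` bound are hypothetical names and assumptions introduced for this sketch; the real-mode register is modeled as a base address added to every untranslated access.

```python
# Illustrative sketch only: page fault handling under paravirtualization.
# All names, and the real_mode_limit bound, are assumptions for this sketch.
PAGE_SIZE = 4096  # assumed page size


class ParavirtHypervisor:
    def __init__(self, apparent_to_actual_page, real_mode_limit=PAGE_SIZE):
        # Stand-in for the real memory map: apparent page -> actual page.
        self.apparent_to_actual_page = apparent_to_actual_page
        self.hw_page_table = {}  # sole page table, owned by the hypervisor
        # Models the hardware register holding the actual real address
        # that backs apparent real address zero.
        self.real_mode_base = apparent_to_actual_page[0] * PAGE_SIZE
        self.real_mode_limit = real_mode_limit

    def real_mode_address(self, apparent_addr):
        """Address used when translation is disabled: base plus offset."""
        if not 0 <= apparent_addr < self.real_mode_limit:
            raise MemoryError("real-mode access outside the mapped block")
        return self.real_mode_base + apparent_addr

    def hcall_map_page(self, vpage, apparent_page):
        """Hypervisor call: validate, translate, and store one mapping."""
        actual = self.apparent_to_actual_page.get(apparent_page)
        if actual is None:
            raise PermissionError("page is not allocated to this LPAR")
        self.hw_page_table[vpage] = actual


def guest_handle_page_fault(hv, free_apparent_pages, vaddr):
    """Guest OS side: pick an apparent page, then ask the hypervisor to map it."""
    apparent = free_apparent_pages.pop()
    hv.hcall_map_page(vaddr // PAGE_SIZE, apparent)
    return apparent
```

In this sketch the guest never keeps a hardware-visible page table of its own; a single fault plus one hypervisor call installs the mapping, and the hypervisor still rejects any apparent page that its real memory map does not assign to the LPAR.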
Although paravirtualization offers better performance than full virtualization, it requires significant OS changes and calls to the hypervisor to resolve page faults. In addition, the security of real memory separation of LPARs depends on the correctness of the hypervisor not only in allocating real memory to LPARs but also, for both para- and full virtualization, in maintaining the page table used by the hardware and in translating apparent real addresses to actual real addresses using its real memory map, and, for full virtualization, in interpreting the OS's page table.