The present application relates generally to an improved data processing apparatus and method and more specifically to a logical partition memory.
Contemporary technology enables economical fabrication of computer systems with generous complements of resources, including multiple processors, large primary fast memory, very large secondary storage, and many I/O devices. The concepts of virtualization and logical partitioning have been developed to efficiently use these systems for workloads of widely varying capacity and schedule demands. The total system resources are divided among a number of logical partitions, sometimes called virtual machines because each can operate autonomously as if it were a separate hardware system of smaller capacity. The number of partitions and the amount of resources assigned to each may vary widely and be changed dynamically, to match needs of different, independent workloads and to accommodate their time-varying demands. Usually, processors may be dynamically switched between partitions at millisecond intervals while primary memory and secondary storage may require longer intervals for reallocation between partitions, still quite adequate to respond to daily and time-zone scheduling variations. Each partition usually runs a full software stack, such as operating system, middleware, and related applications, that would run on an independent system.
The system component that manages logical partitioning is a combination of hardware and software called the hypervisor. It creates logical partitions, assigns resources to them, enforces resource separation and authorized sharing between them, and dynamically alters resource assignments to them, in response to demands of the independent partition workloads and overall system performance goals. This core resource allocation function is necessarily the most privileged function in the overall system and is therefore part of the Reference Monitor component in systems that implement the Multi-Level Security models established by government and industry standards.
The historical approach to separation of real memory into partitions is called full virtualization of address translation and partition memory. The hypervisor is given exclusive control of the virtual address translation features of the system hardware, by running all partition software, including the Operating System (OS), in a non-privileged state. Each partition OS is given an allocation of real memory, which it may treat as a single block of apparent real addresses beginning at zero. The hypervisor keeps a real memory map that records which blocks of real memory are actually allocated to each partition. The partition OS controls the assignment of real memory pages to virtual addresses and stores these assignments in its page table, just as it would do if running on its own real hardware instead of in a logical partition (virtual machine) provided by the hypervisor. However, the OS cannot install its page table for the hardware to use, because the privileged operation that would do so causes an interrupt to the hypervisor.
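By way of illustration only, the real memory map described above may be sketched as follows. The class name `RealMemoryMap`, its methods, and the fixed block size `BLOCK_SIZE` are hypothetical and are chosen solely for this sketch; an actual hypervisor may record allocations in any suitable form.

```python
BLOCK_SIZE = 0x1000_0000  # hypothetical allocation granule (256 MiB), an assumption


class RealMemoryMap:
    """Hypothetical record of which actual real blocks each partition owns."""

    def __init__(self):
        # partition id -> list of actual real block base addresses,
        # ordered so that list index 0 backs apparent real address zero
        self._blocks = {}

    def allocate(self, partition, actual_block_base):
        """Append one actual real block to the partition's apparent address space."""
        self._blocks.setdefault(partition, []).append(actual_block_base)

    def to_actual(self, partition, apparent_addr):
        """Translate an apparent real address to an actual real address."""
        if apparent_addr < 0:
            raise PermissionError("negative apparent real address")
        index, offset = divmod(apparent_addr, BLOCK_SIZE)
        blocks = self._blocks.get(partition, [])
        if index >= len(blocks):
            raise PermissionError("address outside the partition's allocation")
        return blocks[index] + offset
```

In this sketch each partition sees a contiguous range of apparent real addresses beginning at zero, even though the actual real blocks backing that range need not be contiguous.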
When this occurs, the hypervisor remembers the address of the OS's page table and instead installs a hypervisor page table for the hardware. When a page fault occurs, the hypervisor receives the interrupt and looks in the OS's page table for a translation of the faulting virtual address. If one is found, the hypervisor uses its real memory map to translate the apparent real address from the OS's page table to an actual real address in a block of real memory allocated to the partition, stores this virtual-to-real translation in the hypervisor page table used by the hardware, and resumes the page-faulting operation. If no translation is found in the OS's page table, a page fault interrupt is passed to the OS. After the OS assigns an apparent real page to the virtual address in its page table, the above process is repeated to resolve the fault. If the OS needs to disable virtual address translation and directly address its apparent real memory, for example in some architectures to receive an interrupt, the hypervisor prevents this and instead installs another page table that translates partition apparent real addresses directly to actual real addresses allocated to the partition, thereby simulating real addressing mode.
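The page fault handling described above for full virtualization may be sketched, purely as an illustrative model, as follows. The function name `handle_page_fault`, the dictionary-based tables, the page size, and the return values "resumed" and "reflected" are assumptions of this sketch and do not describe any particular hypervisor.

```python
PAGE_SIZE = 4096  # hypothetical page size, an assumption


def handle_page_fault(vaddr, os_page_table, real_memory_map, hw_page_table):
    """Model the hypervisor's response to a page fault under full virtualization.

    os_page_table:   virtual page -> apparent real page (maintained by the OS)
    real_memory_map: apparent real page -> actual real page (hypervisor only)
    hw_page_table:   virtual page -> actual real page (used by the hardware)
    """
    vpn = vaddr // PAGE_SIZE
    apparent_page = os_page_table.get(vpn)
    if apparent_page is None:
        # The OS has no translation yet, so the fault is passed to the OS.
        return "reflected"
    if apparent_page not in real_memory_map:
        # The OS named a page that is not allocated to its partition.
        raise PermissionError("apparent real page not owned by the partition")
    # Store the virtual-to-actual translation where the hardware can use it.
    hw_page_table[vpn] = real_memory_map[apparent_page]
    return "resumed"


# Usage: one virtual translation may take two faults to resolve.
os_pt, hw_pt = {}, {}
mem_map = {0: 512, 1: 513}
first = handle_page_fault(0x5000, os_pt, mem_map, hw_pt)   # passed to the OS
os_pt[5] = 1                                               # OS assigns apparent page 1
second = handle_page_fault(0x5000, os_pt, mem_map, hw_pt)  # resolved by hypervisor
```

The usage lines show why this approach can require more than one page fault for a single translation: the first fault is passed to the OS, and only the repeated fault is resolved from the OS's updated page table.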
Paravirtualization is an alternative to full virtualization that was developed to avoid some of the latter's overhead costs due to, for example, simulation of privileged operations and real addressing mode, passing interrupts, maintaining multiple page tables, and sometimes needing multiple page faults to resolve one virtual translation. With paravirtualization, the OS runs in a mostly-privileged state and receives page fault interrupts directly from the hardware. A hardware register is provided to hold the actual real address of the one real memory block allocated to the partition for apparent real address zero. This register is used when virtual address translation is disabled, for example when the OS receives an interrupt, so that real addressing mode need not be simulated.
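As a minimal sketch of the register-based real addressing described above, the actual real address may be formed by adding the register contents to the apparent real address. The function name `real_mode_address` and the `block_size` limit check are assumptions of this sketch; the description above specifies only the register holding the block's actual real address.

```python
def real_mode_address(apparent_addr, base_register, block_size):
    """Form an actual real address while virtual address translation is disabled.

    base_register: actual real address of the block backing apparent address zero
    block_size:    hypothetical size limit of that block (an assumption)
    """
    if not 0 <= apparent_addr < block_size:
        raise PermissionError("real-mode access outside the partition's block")
    return base_register + apparent_addr
```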
The hypervisor runs in the most-privileged state and retains exclusive control of the page table used by the hardware, as with full virtualization. To resolve a page fault, the partition OS assigns an apparent real page to the faulting virtual address, just as it would for full virtualization or when running on its own real hardware. However, instead of storing this translation in its own page table, which the hardware cannot use, it calls the hypervisor, passing the virtual-to-real translation as a parameter. The hypervisor, using its real memory map, translates the apparent real address to an actual real address and stores the virtual-to-real translation in the hypervisor page table used by the hardware, as with full virtualization. This avoids both the duplicate page tables and the repeated page faults of full virtualization.
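The paravirtualized exchange described above may be sketched, by way of example only, as follows. The names `Hypervisor`, `enter_translation`, and `os_page_fault_handler`, and the dictionary-based tables, are hypothetical and do not correspond to the interface of any particular hypervisor.

```python
class Hypervisor:
    """Hypothetical model of a hypervisor that owns the hardware page table."""

    def __init__(self, real_memory_map):
        # partition id -> {apparent real page -> actual real page}
        self._real_memory_map = real_memory_map
        # (partition id, virtual page) -> actual real page, as used by hardware
        self.hw_page_table = {}

    def enter_translation(self, partition, vpn, apparent_page):
        """Hypervisor call: validate and install one virtual-to-real translation."""
        owned = self._real_memory_map.get(partition, {})
        if apparent_page not in owned:
            return False  # refuse pages not allocated to the calling partition
        self.hw_page_table[(partition, vpn)] = owned[apparent_page]
        return True


def os_page_fault_handler(hv, partition, vpn, free_apparent_pages):
    """OS side: choose an apparent real page, then call the hypervisor once."""
    apparent_page = free_apparent_pages.pop()
    return hv.enter_translation(partition, vpn, apparent_page)
```

In this sketch the OS keeps no hardware-visible page table of its own, and a single call to the hypervisor resolves the fault. The check inside `enter_translation` is where the separation of partitions depends on the hypervisor's correctness.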
Although paravirtualization offers better performance than full virtualization, it requires significant OS changes and calls to the hypervisor to resolve page faults. In addition, the security of real memory separation of partitions depends on the correctness of the hypervisor in allocating real memory to partitions. The security of real memory separation of partitions further depends on the hypervisor correctly translating partition apparent real addresses to actual real addresses and correctly maintaining the resulting virtual-to-real translations in the hypervisor page tables used by the hardware. Moreover, for full virtualization, the security of real memory separation of partitions depends on the hypervisor correctly interpreting the OS's page tables.