Continued advances in semiconductor fabrication technologies (Moore's Law) and the engineering of systems on a chip (SoC) have resulted in the widespread development of multicore processor chips, which are attractive for their theoretical performance/power metric and reduced system cost. The number of processor cores on the chip ranges from 2 to more than 100 depending on the intended application, the size of the chip, the size of the individual cores, and the amount of on-chip memory and integrated devices. The processor cores may be identical (homogeneous multicore) or different (heterogeneous multicore), and they may perform specialized data processing (data plane processing) or general purpose processing (control plane processing). Of particular interest here are multicore chips for embedded systems that establish periodic data flows for macro-pipelined data plane processing. Such data flows may be relatively static and may run between data plane processing nodes on either specialized signal processing cores or general purpose cores. The same chip typically also has more dynamic control plane processing that is performed on one or more general purpose cores.
In practice, a general problem with multicore processing chips is the difficulty of getting the aggregate multicore processing performance to scale with the number of cores, i.e., getting the chip performance to double when the number of cores on the chip is doubled. Even for processing tasks that are easily performed in parallel, as the number of cores is increased the incremental performance improvement may be disappointing due to processor cores competing for access to shared resources such as memory and input/output peripheral devices. Memory management units and the related peripheral memory management units address this resource sharing problem.
The software technology of embedded hypervisor virtualization is attractive for multicore processing chips in embedded systems as it provides a versatile hardware abstraction layer that supports isolated virtual computing environments and systematic resource sharing. Embedded hypervisor software executes directly on top of the hardware and virtualizes the chip's processor cores, the system memory and the peripheral devices. Hypervisors generally facilitate the creation of multiple isolated virtual machine (VM) environments or partitions, each of which may support: 1) an operating system (OS) executing one or more applications on one or more cores; or 2) applications that execute without an OS.
Hypervisors for desktop/server applications may be based on a full or nearly full-featured operating system and may contain more than 1000 times as much code as a hypervisor for an embedded system. Examples of the desktop/server hypervisors include: VMware®'s ESX having a 2 gigabyte code base and a full Linux® OS; VMware®'s smaller ESXi having a 150 megabyte code base, without full Linux®; Citrix Systems®' Xen; Microsoft®'s Hyper-V®; and Red Hat®'s Linux KVM. These desktop/server hypervisors are typically designed for full virtualization in which there is no modification to a guest OS.
The relatively large desktop/server hypervisors often have sophisticated, yet somewhat indirect, memory management. For example, the memory management methods of the VMware® ESX and ESXi hypervisors are described in the company publications “Hypervisor Memory Management Done Right” and “Understanding Memory Resource Management in VMware® ESX 4.1”. In one method, the ESX memory management performs background searching for identical memory pages that can be transparently shared between different VMs and application software elements. In another method, the ESX memory management involves installing a driver into the guest OS that implements a memory management trick called ‘ballooning’. The balloon driver lets the hypervisor find out which virtual memory pages an isolated guest OS has freed up so that the hypervisor can free up the corresponding physical memory pages. From these examples, it is clear that new direct or indirect memory management methods are of interest to virtualization software companies like VMware®.
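The background search for identical pages described above can be illustrated with a minimal sketch. It is not VMware®'s implementation: the function name `share_identical_pages`, the `(vm, page_no)` keys and the use of SHA-256 as the page fingerprint are all assumptions made for illustration, and the copy-on-write handling that a real hypervisor needs when a shared page is later written is omitted.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size in bytes, for illustration only


def share_identical_pages(pages):
    """Deduplicate guest pages by content.

    pages: dict mapping a hypothetical (vm, page_no) key to the page bytes.
    Returns (mapping, frames): mapping sends each key to an index into
    frames, the list of distinct backing pages that remain after sharing.
    """
    frames = []      # one entry per distinct page content
    by_digest = {}   # fingerprint -> indices of frames with that fingerprint
    mapping = {}
    for key, data in pages.items():
        digest = hashlib.sha256(data).digest()
        candidates = by_digest.setdefault(digest, [])
        for idx in candidates:
            # A full byte comparison guards against fingerprint collisions.
            if frames[idx] == data:
                mapping[key] = idx
                break
        else:
            # No identical page seen so far: allocate a new backing frame.
            frames.append(data)
            candidates.append(len(frames) - 1)
            mapping[key] = len(frames) - 1
    return mapping, frames
```

In this sketch, two VMs that each hold an all-zero page end up mapped to the same backing frame, so the number of frames is smaller than the number of guest pages.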
In comparison, for the more memory-constrained embedded system applications, the hypervisors are typically designed to be minimal in terms of lines of code and to have a small memory footprint of only several tens to several hundred kilobytes. Examples of embedded system hypervisors include Red Bend®'s VLX, Open Kernel Labs®' OKL4 Microvisor, and the hypervisor from the Codezero® community. These hypervisors for embedded systems are typically designed for para-virtualization in which the guest OS is modified to support OS-hypervisor application programming interfaces (APIs).
The relatively small hypervisors for memory-constrained embedded systems tend to have more basic memory management and may benefit the most from hardware virtualization support. Intel®, AMD®, Power Architecture® and ARM® either have introduced or are in the process of introducing hardware accelerators into the processor that trap and execute sensitive/privileged instructions that were previously handled by hypervisor software. For example, the ARM® 2011 white paper “Virtualization is coming to a Platform near You” describes the ARM® virtualization support to be available in 2012. As discussed in the Intel® 2011 white paper “The Benefits of Virtualization for Embedded Systems”, several hypervisors that take advantage of the Intel® virtualization technology (Intel VT) are currently available from Wind River®, Green Hills Software®, LynuxWorks®, Real Time Systems® and TenAsys®. For memory management, virtualization hardware support may be provided for the shadowed translation and paging tables as well as for the virtual address to intermediate physical address (VA to IPA) translation tables and the intermediate physical address to physical address (IPA to PA) translation tables that are the primary elements of memory management in hypervisor virtualized systems. Additional new methods of efficiently managing memory in these memory-constrained embedded systems are desired to work alongside existing memory management elements so that the virtualized multicore processing performance may be improved.
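The two translation stages named above (VA to IPA, then IPA to PA) can be sketched as follows. This is an illustration under stated assumptions only: the flat dictionaries standing in for translation tables, the 4 KB page size, and the names `translate` and `va_to_pa` are hypothetical, whereas real hardware walks multi-level tables and caches the results.

```python
PAGE_SHIFT = 12                      # assumed 4 KB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1    # low bits holding the in-page offset


def translate(table, addr):
    """Translate addr through one stage.

    table is a hypothetical flat map from input page number to output
    page number; the in-page offset is carried through unchanged.
    """
    page = addr >> PAGE_SHIFT
    if page not in table:
        raise LookupError(f"no mapping for page {page:#x}")
    return (table[page] << PAGE_SHIFT) | (addr & PAGE_MASK)


def va_to_pa(stage1, stage2, va):
    """Apply the guest-controlled stage (VA to IPA), then the
    hypervisor-controlled stage (IPA to PA)."""
    ipa = translate(stage1, va)
    return translate(stage2, ipa)
```

Keeping the stages separate reflects the division of control in such systems: the guest OS manages the first table, while the hypervisor alone manages the second and can therefore confine each guest to its own physical memory.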