Embodiments of the present invention relate generally to data processing, and more particularly to cache management operations, especially in a virtualized environment.
Conventional operating systems (OS) that operate in a system typically assume that the OS has complete and direct control of hardware and system resources. The OS implements the policies to manage these resources to allow multiple user-level applications to be run. Virtualization allows multiple instances of OSs to be run on a system. The OSs can be the same or different versions, and can come from different OS vendors. In a typical virtualized environment, certain system software is responsible for virtualizing the hardware and system resources to allow multiple instances of the OSs (referred to herein as “guest OSs”) to be run. The software component that provides such functionality is referred to herein as a virtual machine monitor (VMM). The VMM is typically host software that is aware of the hardware architecture of the system.
For each instance of a guest OS, the VMM creates and presents a virtual machine (VM) to the guest OS. From the perspective of a guest OS, the VM includes all the hardware and system resources (e.g., processors, memory, disks, network devices, etc.) expected by the guest OS. From the VMM perspective, these hardware and system resources are thus “virtualized”.
Virtualized environments include fully-virtualized environments and para-virtualized environments. In a fully-virtualized environment, each guest OS operates as if its underlying VM is an independent physical processing system that the guest OS supports. Accordingly, the guest OS expects the VM to behave according to the architecture specification of the supported physical processing system. In contrast, in a para-virtualized environment, the guest OS helps the VMM to provide a virtualized environment. Accordingly, the guest OS may be characterized as virtualization aware. For instance, a para-virtualized guest OS may be able to operate only in conjunction with a particular VMM, while a guest OS for a fully-virtualized environment may operate on different types of VMMs.
In a fully-virtualized environment, one or more device models may be present. A device model is software running in the VMM, in a service domain, or even in the guest itself that performs driver-type operations in the fully-virtualized environment. Such device models may be present in a user-level application, a guest OS or a hypervisor, such as a VMM. In a para-virtualized environment, such driver-type operations may be implemented using a virtual device driver service, such as a back-end driver, running in the service domain. This too may be software that can be located in a user application, guest OS or VMM. A service domain that performs driver-type operations is called a driver domain.
As an example of a virtualized environment, a VMM may create a first VM that presents two logical processors to one guest OS, and a second VM that presents one logical processor to another guest OS. The actual underlying hardware, however, may include fewer than, exactly, or more than three physical processors. The logical processors presented to a guest OS are called virtualized processors. Likewise, VMs may include virtualized storage, peripherals, and the like.
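As a rough illustration of the example above, the following Python sketch models the two VMs and maps their logical processors onto a configurable number of physical processors. The names (VirtualMachine, assign_round_robin) and the round-robin placement are hypothetical and chosen only for illustration; an actual VMM may place and schedule virtualized processors quite differently.

```python
from dataclasses import dataclass


@dataclass
class VirtualMachine:
    name: str
    logical_processors: int  # virtualized processors presented to the guest OS


def assign_round_robin(vms, physical_processors):
    """Map each logical processor to a physical one (assumed round-robin policy)."""
    mapping = {}
    next_cpu = 0
    for vm in vms:
        for lp in range(vm.logical_processors):
            # Wrap around when there are fewer physical than logical processors.
            mapping[(vm.name, lp)] = next_cpu % physical_processors
            next_cpu += 1
    return mapping


# First VM presents two logical processors, second VM presents one.
vms = [VirtualMachine("VM1", 2), VirtualMachine("VM2", 1)]

# Fewer physical processors than logical ones: two logical processors share one.
two_cpu_mapping = assign_round_robin(vms, physical_processors=2)

# More physical processors than logical ones: one physical processor stays idle.
four_cpu_mapping = assign_round_robin(vms, physical_processors=4)
```

In either case each guest OS still sees the number of processors its VM presents, regardless of the physical count.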
A VMM may use emulation to perform certain operations on behalf of guest software, which includes both guest OSs and applications running on top of those OSs. For instance, guest software may seek to perform a direct memory access (DMA) operation to access data from a memory directly without processor involvement. However, since a DMA controller is a virtualized resource, the VMM or driver software running on top of the VMM may emulate the DMA operation.
Because such DMA operations occur without the help of a processor, the system needs to maintain coherency between an instruction cache and a data cache. To do this, signals on a bus, such as so-called snoop cycles, are used to invalidate cacheable pages that a DMA access modifies. Thus in a VMM environment, a guest OS that issues DMA read operations expects the instruction and data caches to be synchronized when the operation is completed, as in native system operation. However, when performed by emulation, such DMA read operations cache data into a data cache but not into an instruction cache.
These DMA read operations cache data only in a data cache because a device model or back-end virtual device driver service that performs the DMA operation via emulation typically executes physical DMA operations with an internal buffer, and then copies the buffer to a location provided by the guest.
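The effect described above can be sketched with a toy Python model. This is only an illustration under stated assumptions: the class ToySystem, its method names, the addresses, and the byte values are all hypothetical, the caches are modeled per byte rather than per line, and the model assumes an architecture in which the instruction cache does not observe processor stores. It is not intended to represent any particular processor or VMM.

```python
class ToySystem:
    """Toy model of memory with separate instruction and data caches."""

    def __init__(self, size):
        self.memory = bytearray(size)
        self.dcache = {}  # address -> byte held in the data cache
        self.icache = {}  # address -> byte held in the instruction cache

    def cpu_store(self, addr, value):
        # Write-back behavior: the store lands in the data cache only.
        self.dcache[addr] = value

    def cpu_load(self, addr):
        if addr not in self.dcache:
            self.dcache[addr] = self.memory[addr]
        return self.dcache[addr]

    def cpu_fetch(self, addr):
        # Instruction fetch: served from the instruction cache or from memory.
        if addr not in self.icache:
            self.icache[addr] = self.memory[addr]
        return self.icache[addr]

    def physical_dma_into_memory(self, addr, data):
        # Hardware DMA: memory is written and snooping invalidates both caches.
        for i, byte in enumerate(data):
            self.memory[addr + i] = byte
            self.dcache.pop(addr + i, None)
            self.icache.pop(addr + i, None)

    def emulated_dma_read(self, guest_addr, data, bounce_addr):
        # Device model: physical DMA into an internal buffer, then a copy
        # performed by the processor to the location provided by the guest.
        self.physical_dma_into_memory(bounce_addr, data)
        for i in range(len(data)):
            self.cpu_store(guest_addr + i, self.cpu_load(bounce_addr + i))

    def sync_caches(self, addr, length):
        # Write back data cache contents and invalidate the instruction cache.
        for a in range(addr, addr + length):
            if a in self.dcache:
                self.memory[a] = self.dcache[a]
            self.icache.pop(a, None)


system = ToySystem(64)
GUEST, BOUNCE = 0, 32                      # hypothetical addresses
old_code, new_code = b"\x01\x01", b"\x02\x02"

system.physical_dma_into_memory(GUEST, old_code)
fetched_old = bytes(system.cpu_fetch(GUEST + i) for i in range(2))

system.emulated_dma_read(GUEST, new_code, BOUNCE)
as_data = bytes(system.cpu_load(GUEST + i) for i in range(2))    # new bytes
stale = bytes(system.cpu_fetch(GUEST + i) for i in range(2))     # old bytes

system.sync_caches(GUEST, 2)
fresh = bytes(system.cpu_fetch(GUEST + i) for i in range(2))     # new bytes
```

In this model the copy step is what bypasses the snoop-based invalidation: the physical DMA targets only the internal buffer, so the guest's destination is updated by ordinary processor stores.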
Problems result if the guest OS uses a DMA operation to load an executable image and then executes the code after the DMA operation, because the instruction cache may not reflect the newly loaded code. To avoid such problems, instruction and data caches are typically synchronized each time a guest-initiated DMA operation is completed. However, this results in redundant operations and bus cycles. As a result, performance is degraded, since many such DMA operations are for data only and thus coherency between instruction and data caches is not needed.
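The overhead of synchronizing after every guest-initiated DMA operation can be illustrated with a small Python sketch. The workload below (ten DMA operations, of which one loads code that is later executed) is entirely hypothetical, and the executed_afterwards flag is an illustrative oracle rather than information a device model is assumed to have.

```python
def count_syncs(dma_ops):
    """Return (syncs performed, syncs needed) when every DMA completion syncs."""
    performed = len(dma_ops)  # one synchronization per completed guest DMA
    needed = sum(1 for op in dma_ops if op["executed_afterwards"])
    return performed, needed


# Hypothetical workload: nine data-only DMA reads and one executable image load.
workload = [{"executed_afterwards": False}] * 9 + [{"executed_afterwards": True}]

performed, needed = count_syncs(workload)
redundant = performed - needed  # synchronizations that served no purpose
```

Under these assumed numbers, most of the synchronizations are redundant, which is the source of the performance degradation noted above.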