An increasing number of data-intensive distributed applications are being developed to serve various needs, such as processing very large data sets that generally cannot be handled by a single computer. Instead, clusters of computers are employed to distribute various tasks, such as organizing and accessing the data and performing related operations with respect to the data. Various applications and frameworks have been developed to interact with such large data sets, including Hive, HBase, Hadoop, Amazon S3, and CloudStore, among others.
At the same time, virtualization techniques have gained popularity and are now commonplace in data centers and other environments in which it is useful to increase the efficiency with which computing resources are used. In a virtualized environment, one or more virtual machines are instantiated on an underlying computer (or another virtual machine) and share the resources of the underlying computer. However, deploying data-intensive distributed applications across clusters of virtual machines has generally proven impractical due to the latency associated with feeding large data sets to the applications.
Overview
Provided herein are systems, methods, and software for implementing accelerated data input and output with respect to virtualized environments. Data requested by a guest element running in a virtual machine is delivered to the guest element by way of a region in host memory that is mapped to a region in guest memory associated with the guest element. In this manner, the delivery of data from a source to a guest element, such as a virtualized data-intensive distributed application, is accelerated.
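The idea of delivering data through a region that is visible from two sides can be illustrated with an ordinary user-space analogy. The sketch below is not the disclosed implementation: it assumes a temporary file stands in for the shared region, and the names `REGION_SIZE`, `host_view`, `guest_view`, and `deliver_via_shared_region` are hypothetical. It only shows that bytes written through one mapping are readable through a second mapping of the same region without an intermediate copy through a socket or pipe.

```python
import mmap
import os
import tempfile

REGION_SIZE = 4096  # hypothetical size of the shared region (one page)


def deliver_via_shared_region(payload: bytes) -> bytes:
    """Write payload through a 'host' mapping and read it back through a
    separate 'guest' mapping of the same backing region."""
    if len(payload) > REGION_SIZE:
        raise ValueError("payload does not fit in the shared region")

    # A temporary file is an assumed stand-in for the mapped memory region.
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, REGION_SIZE)
        host_view = mmap.mmap(fd, REGION_SIZE)   # analogy for the host-side region
        guest_view = mmap.mmap(fd, REGION_SIZE)  # analogy for the guest-side region
        try:
            # The "host" places the requested data in its view of the region.
            host_view[:len(payload)] = payload
            # The "guest" sees the same bytes through its own mapping.
            return bytes(guest_view[:len(payload)])
        finally:
            guest_view.close()
            host_view.close()
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(deliver_via_shared_region(b"block-0"))
```

In an actual virtualized environment the two views would be host memory and guest memory rather than two mappings in one process; the analogy is limited to the shared-visibility property.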
In at least one implementation, a computing system passes a process identifier to a kernel driver for a host environment, wherein the process identifier identifies a guest process spawned in a virtual machine. The kernel driver uses the process identifier to determine an allocation of host memory corresponding to guest memory for the guest process and returns the allocation of host memory. Additionally, the computing system performs a mapping of the allocation of host memory to an allocation of guest memory for the guest element.
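The sequence just described (pass a process identifier, receive a host memory allocation, map it to a guest memory allocation) can be sketched as a small simulation. Everything here is assumed for illustration: `FakeKernelDriver`, `Allocation`, `map_host_to_guest`, the lookup table, and the addresses are hypothetical, and a simple base-plus-offset translation stands in for whatever mapping a real system would perform.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Allocation:
    """A contiguous memory range (hypothetical simplification)."""
    base: int
    length: int


class FakeKernelDriver:
    """Simulated stand-in for the host kernel driver; not a real driver API."""

    def __init__(self, table: Dict[int, Allocation]):
        self._table = dict(table)

    def lookup(self, pid: int) -> Allocation:
        # Use the process identifier to find the host memory allocation
        # that corresponds to the guest process's memory.
        try:
            return self._table[pid]
        except KeyError:
            raise LookupError(f"no host allocation known for pid {pid}") from None


def map_host_to_guest(driver: FakeKernelDriver, pid: int,
                      guest: Allocation) -> Callable[[int], int]:
    """Return a function translating guest addresses to host addresses."""
    host = driver.lookup(pid)  # step 1: pass the pid, get the host allocation
    if host.length < guest.length:
        raise ValueError("host allocation is smaller than guest allocation")
    offset = host.base - guest.base  # step 2: assumed linear mapping

    def translate(guest_addr: int) -> int:
        if not guest.base <= guest_addr < guest.base + guest.length:
            raise ValueError("address outside the guest allocation")
        return guest_addr + offset

    return translate


if __name__ == "__main__":
    driver = FakeKernelDriver({4242: Allocation(0x7F0000000000, 8192)})
    translate = map_host_to_guest(driver, 4242, Allocation(0x1000, 4096))
    print(hex(translate(0x1010)))
```

A linear offset is the simplest possible mapping and is chosen only to keep the example short; the text above does not specify how the mapping is represented.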
This Overview is provided to introduce a selection of concepts in a simplified form that are further described below in the Technical Disclosure. It should be understood that this Overview is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.