An increasing number of data-intensive distributed applications are being developed to serve various needs, such as processing very large data sets that generally cannot be handled by a single computer. Instead, clusters of computers are employed to distribute various tasks or jobs, such as organizing and accessing the data and performing related operations with respect to the data. Various applications and frameworks have been developed to interact with such large data sets, including Hive, HBase, Hadoop, Amazon S3, and CloudStore, among others.
At the same time, virtualization techniques have gained popularity and are now commonplace in data centers and other environments in which it is useful to increase the efficiency with which computing resources are used. In a virtualized environment, one or more virtual machines are instantiated on an underlying computer (or another virtual machine) and share the resources of the underlying computer. These virtual machines may employ the applications and frameworks that typically reside on real machines to more efficiently process large data sets.
Overview
Provided herein are systems, methods, and software for associating cache memory with a work process. In one example, a method of operating a support process within a computing system for providing accelerated input and output for a work process includes monitoring for a file mapping attempt initiated by the work process. The method further includes, in response to the file mapping attempt, identifying a first region in memory already allocated to a cache service, and associating the first region in memory with the work process.
In another instance, a computer apparatus to provide accelerated input and output with respect to a work process includes processing instructions that direct a support process of a computing system to monitor for a file mapping attempt initiated by the work process. The processing instructions further direct the support process to, in response to the file mapping attempt, identify a first region in memory already allocated to a cache service, and associate the first region in memory with the work process. The computer apparatus further includes one or more non-transitory computer readable media that store the processing instructions.
In a further example, a node computing system for providing accelerated input and output with respect to a work process includes the work process, which is configured to initiate a file mapping attempt. The node computing system further includes a support process configured to identify the file mapping attempt. The support process is further configured to, in response to the file mapping attempt, identify a first region in memory already allocated to a cache service, and associate that first region in memory with the work process.
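As a purely illustrative aid, the behavior summarized in the examples above might be sketched as follows. This is a minimal in-process simulation, not the claimed implementation: the class names (CacheService, SupportProcess, WorkProcess), the method on_file_mapping_attempt, and the path "/data/block-0" are all hypothetical, and a Python bytearray stands in for a region of memory already allocated to the cache service.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CacheService:
    """Hypothetical cache service that owns pre-allocated memory regions."""
    regions: Dict[str, bytearray] = field(default_factory=dict)

    def allocate(self, key: str, size: int) -> bytearray:
        # The region is allocated to the cache service before any
        # work process attempts a file mapping.
        region = bytearray(size)
        self.regions[key] = region
        return region

    def find_region(self, key: str) -> Optional[bytearray]:
        return self.regions.get(key)


@dataclass
class WorkProcess:
    """Hypothetical work process; `mapped` is the memory associated with it."""
    name: str
    mapped: Optional[memoryview] = None


class SupportProcess:
    """Hypothetical support process that handles file mapping attempts."""

    def __init__(self, cache: CacheService) -> None:
        self.cache = cache

    def on_file_mapping_attempt(self, work: WorkProcess, path: str) -> bool:
        # Identify a region already allocated to the cache service.
        region = self.cache.find_region(path)
        if region is None:
            return False  # nothing cached; an ordinary mapping would proceed
        # Associate the existing region with the work process. The
        # memoryview shares the underlying buffer, so no copy is made.
        work.mapped = memoryview(region)
        return True


cache = CacheService()
cache.allocate("/data/block-0", 16)  # hypothetical key for the cached file
support = SupportProcess(cache)
worker = WorkProcess("map-task")

handled = support.on_file_mapping_attempt(worker, "/data/block-0")
worker.mapped[:5] = b"hello"  # the work process writes through its mapping
print(handled, bytes(cache.regions["/data/block-0"][:5]))
```

Because the work process receives a view of the cache service's existing buffer rather than a new allocation, data written by either side is visible to the other without an intermediate copy, which is the source of the accelerated input and output described above.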