High-performance, cloud, and enterprise computing environments increasingly use virtualization to improve the utilization of their computing, storage, and network resources. Applications that execute within virtual machines (VMs) are often data-intensive and latency-sensitive. The I/O and memory requirements of such VMs can easily exceed the limited resources allocated to them by the virtualization layer. Common examples of resource-intensive VM workloads include large database processing, data mining, scientific applications, virtual private servers, and backend support for websites. I/O operations often become a bottleneck due to frequent access to massive disk-resident datasets, paging activity, flash crowds, or competing VMs on the same node. Even though demanding VM workloads are here to stay as integral parts of cluster infrastructures, state-of-the-art virtualization technology is inadequately prepared to handle their requirements.
Data-intensive and large-memory workloads can particularly suffer in virtualized environments because multiple software layers of indirection must be traversed before the physical disk is accessed. To handle large-memory workloads, developers often implement domain-specific out-of-core computation techniques [1] that juggle I/O and computation. But these techniques do not overcome one fundamental limitation: the VM's working set cannot exceed the memory within a single physical machine. Further, many recent large-scale web applications, such as social networks [2], online shopping, and search engines, exhibit little spatial locality, because servicing each user request requires access to disparate parts of massive datasets. To compound the problem, each user query could be processed by multiple tiers of software, each tier adding to the cumulative I/O latency. For instance, Amazon [3] processes hundreds of internal requests to produce a single HTML page. Simply buying specialized large-memory machines [4] is also not viable in the long term because the cost per gigabyte of DRAM tends to increase non-linearly, making these machines prohibitively expensive to both acquire and maintain. Thus low-latency, and possibly locality-independent, I/O to massive datasets is proving to be a critical requirement for a new class of cluster-based applications.
Distributed shared memory (DSM) systems [17], [18] allow distributed parallel applications running on a set of independent nodes to share common data across a cluster. DSM systems often employ heavyweight consistency, coherence, and synchronization mechanisms and may require distributed applications to be written against customized APIs and libraries. In a similar spirit, memcached [7] provides an API to access a large distributed in-memory key-value store. In non-virtualized settings, the use of memory from other machines to support large-memory workloads has been explored [19], [20], [21], [22], [23], [8], [24]. However, these systems did not address the comprehensive design and performance considerations in using cluster-wide memory for virtual machine workloads. Additionally, early systems were not widely adopted, presumably due to the lower network bandwidths and higher latencies of that time. A recent position paper [25] also advocates treating cluster memory as massive low-latency storage, but with a focus on developing new APIs for applications.
See, e.g., the following references, each of which is expressly incorporated herein by reference in its entirety: U.S. Pat. Nos. 8,131,814; 2008/0082696; 2009/0157995; U.S. Pat. Nos. 8,095,771; 8,046,425; 7,925,711; 7,917,599; 6,167,490; 6,298,419; 6,766,313; 6,886,080; 7,188,145; 7,320,035; 7,386,673; 7,536,462; 2002/0016840; 2005/0039180; 2006/0184673; 2006/0190243; 2007/0288530; 2009/0012932; 2009/0070337; 2009/0144388; 2009/0150511; 7,631,016; 7,617,218; 7,437,426; 6,640,285; 6,516,342; 4,240,143; 4,253,146; 4,843,541; 5,095,420; 5,159,667; 5,218,677; 5,237,668; 5,442,802; 5,592,625; 5,918,249; 6,026,474; 6,044,438; 6,148,377; 6,185,655; 8,127,225; 8,127,086; 8,122,348; 8,112,715; 8,108,768; 8,108,373; 8,103,125; 8,095,556; 8,090,688; 8,088,011; 8,082,400; 8,073,994; 8,069,317; 8,069,154; 8,046,557; 8,046,425; 8,037,038; 8,032,610; 8,027,960; 8,015,281; 8,010,896; 7,970,761; 7,958,440; 7,953,687; 7,937,447; 7,930,371; 7,925,711; 7,917,599; 7,849,164; 7,805,706; 7,783,932; 7,765,176; 7,761,737; 7,761,625; 7,761,624; 7,734,998; 7,734,890; 7,721,292; 7,702,875; 7,668,923; 7,610,523; 7,610,345; 7,430,639; 7,424,577; 7,369,912; 7,356,679; 7,321,992; 7,318,076; 7,260,831; 7,254,837; 7,249,208; 7,237,268; 7,194,589; 7,149,855; 7,149,730; 7,145,837; 7,133,941; 7,076,632; 7,051,158; 7,043,623; 6,999,956; 6,986,076; 6,985,912; 6,976,073; 6,901,512; 6,829,761; 6,829,637; 6,826,123; 6,772,218; 6,766,420; 6,694,511; 6,671,789; 6,640,245; 6,609,159; 6,604,123; 6,601,083; 6,594,671; 6,567,818; 6,560,609; 6,553,384; 6,519,601; 6,516,342; 6,505,210; 6,502,103; 6,477,527; 6,442,564; 6,430,640; 6,430,600; 6,418,447; 6,360,248; 6,356,932; 6,353,860; 6,343,312; 6,327,587; 6,321,266; 6,292,880; 6,226,637; 6,212,573; 6,185,655; 6,167,490; 6,163,801; 6,148,377; 6,138,140; 6,134,540; 6,122,627; 6,105,074; 6,081,833; 6,065,046; 6,009,266; 5,987,506; 5,987,496; 5,978,813; 5,961,606; 5,893,097; 5,765,036; and 5,729,710.