Virtualization technology has been widely adopted as the base infrastructure for cloud computing. Major cloud providers, such as Amazon (EC2) [1] and Microsoft (Azure) [2], sell their computing resources in the form of virtual machines (VMs). Load balancing has become essential for effectively managing large numbers of VMs in cloud computing environments. The cornerstone for moving virtual machines on the fly is VM live migration, which transfers only the CPU and memory state of a VM from one host to another. To allow persistent storage to move with VMs, several live storage migration techniques have been proposed, including Dirty Block Tracking (DBT) and IO Mirroring [10][11][12].
The two mainstream techniques for live storage migration are dirty block tracking (DBT) and input/output (IO) Mirroring. The DBT technique, which is widely adopted by virtualization platforms (e.g., Xen and VMware ESX), is a well-known mechanism that uses a bitmap to track write requests while the VM image is being copied. Once the entire image has been copied to the destination, a merge process is initiated to patch all the dirty blocks (i.e., data blocks that are recorded in the bitmap) from the original image to the new image. To prevent further write requests, the VM is paused until all the dirty blocks have been patched to the new disk image. To mitigate the downtime introduced by the merge process, incremental DBT (also referred to as DBT), which keeps the VM running while iteratively patching dirty blocks to the new image, has been proposed and used in several projects [11][12][13][14]. If the number of dirty blocks is stable for several iterations, the VM is suspended and the remaining dirty blocks are copied to the destination. Nevertheless, incremental DBT also has a disadvantage: if the number of dirty blocks does not converge because of intensive write requests, the migration time, and even the downtime, can be significantly long.
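The incremental DBT loop described above can be illustrated with a minimal sketch. It is a toy in-memory model, not the implementation used by any of the cited systems: the names `TrackedDisk`, `migrate_incremental_dbt`, `workload`, `stable_rounds`, and `max_rounds` are hypothetical, a Python set stands in for the dirty-block bitmap, and the guest is modeled as a callback that issues writes once per copy round.

```python
from typing import Callable, List, Set, Tuple


class TrackedDisk:
    """Toy source disk; a set of block indices stands in for the dirty bitmap."""

    def __init__(self, blocks) -> None:
        self.blocks: List[int] = list(blocks)
        self.dirty: Set[int] = set()

    def write(self, index: int, value: int) -> None:
        # Every guest write is applied to the source and recorded as dirty.
        self.blocks[index] = value
        self.dirty.add(index)


def migrate_incremental_dbt(
    source: TrackedDisk,
    workload: Callable[[TrackedDisk, int], None],
    stable_rounds: int = 2,   # assumed convergence criterion
    max_rounds: int = 16,     # assumed safety cap for non-converging workloads
) -> Tuple[List[int], int]:
    """Return the destination image and the number of blocks copied while paused."""
    source.dirty.clear()
    dest = list(source.blocks)   # bulk copy of the whole image
    workload(source, 0)          # writes that arrive while the bulk copy runs

    previous = None
    stable = 0
    for round_no in range(1, max_rounds + 1):
        count = len(source.dirty)
        if count == 0:
            break
        stable = stable + 1 if count == previous else 0
        if stable >= stable_rounds:
            break                # dirty count has stopped shrinking
        previous = count
        to_patch, source.dirty = source.dirty, set()
        for i in to_patch:       # patch dirty blocks while the VM keeps running
            dest[i] = source.blocks[i]
        workload(source, round_no)

    # VM is paused here: no further writes arrive, so the final patch is consistent.
    paused_blocks = len(source.dirty)
    for i in source.dirty:
        dest[i] = source.blocks[i]
    source.dirty.clear()
    return dest, paused_blocks
```

In this model, the second return value approximates downtime: a workload that keeps dirtying many blocks every round leaves a large final set to copy while the VM is paused, which is the non-convergence problem noted above.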
To address the issue of long migration time and downtime, VMware proposed the IO Mirroring technique [10], which eliminates the iterative merge process. With IO Mirroring, every write request to a data block that has already been copied is duplicated and issued to both the source and destination disks. The write completion acknowledgement is asserted only after both writes have completed (synchronous write). Write requests to data blocks that have not yet been copied are issued only to the source disk, while writes to data blocks that are currently being copied are buffered and later issued to both source and destination once the copy of those blocks completes. As a result, the copied data blocks remain synchronized throughout the migration process. Note that once the VM disk image has been copied, no merging is needed, which leads to shorter migration time and lower downtime. However, IO Mirroring also raises some concerns: 1) workload IO performance is limited by the slower disk because of the synchronized write requests; 2) since disk bandwidth is consumed by the duplicated IO requests, the copying of the VM image is slowed down.
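The three write-routing cases of IO Mirroring described above can be sketched as follows. This is a simplified single-threaded model under stated assumptions, not VMware's implementation: the class `MirroredMigration`, its methods, and the `chunk` parameter are hypothetical, the image is assumed to be copied sequentially in fixed-size chunks, and a synchronous write is modeled by returning only after both lists have been updated.

```python
from typing import List, Optional, Tuple


class MirroredMigration:
    """Toy model of write routing during an IO Mirroring migration."""

    def __init__(self, source: List[int], chunk: int = 2) -> None:
        self.source = list(source)
        self.dest: List[Optional[int]] = [None] * len(self.source)
        self.copied_upto = 0             # blocks [0, copied_upto) are already copied
        self.chunk = chunk               # assumed copy granularity
        self.in_flight = range(0, 0)     # blocks currently being copied
        self.deferred: List[Tuple[int, int]] = []

    def begin_chunk(self) -> None:
        end = min(self.copied_upto + self.chunk, len(self.source))
        self.in_flight = range(self.copied_upto, end)

    def finish_chunk(self) -> None:
        for i in self.in_flight:
            self.dest[i] = self.source[i]
        self.copied_upto = self.in_flight.stop
        self.in_flight = range(self.copied_upto, self.copied_upto)
        # Replay buffered writes; their blocks are now copied, so they are mirrored.
        pending, self.deferred = self.deferred, []
        for index, value in pending:
            self.write(index, value)

    def write(self, index: int, value: int) -> str:
        if index in self.in_flight:
            self.deferred.append((index, value))   # buffer until the chunk is done
            return "deferred"
        self.source[index] = value
        if index < self.copied_upto:
            self.dest[index] = value               # duplicate to the destination
            return "mirrored"
        return "source-only"                       # block will be copied later


m = MirroredMigration([0, 1, 2, 3], chunk=2)
m.begin_chunk()                 # blocks 0 and 1 are being copied
print(m.write(3, 30))           # not yet copied
print(m.write(1, 10))           # currently being copied
m.finish_chunk()
print(m.write(0, 5))            # already copied
```

Because every write to an already-copied block is applied to both lists before `write` returns, the destination matches the source as soon as the last chunk finishes, and no merge pass is required. The same property is what ties guest write latency to the slower of the two disks.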
Virtual machine (VM) live storage migration techniques significantly increase the mobility and manageability of virtual machines during, for example, disaster recovery, storage maintenance, and storage upgrades, in the context of cloud and big data computing. Meanwhile, solid-state drives (SSDs), such as flash-based SSDs, have become increasingly popular in data centers because of, for example, their high performance, silent operation, and shock resistance [20, 21]. Historically, mechanical hard disk drives (HDDs) have been used as the primary storage media because of their large capacity and high long-term stability. Recently, SSDs, which have high IO performance [20], have been emerging as promising storage media. In specific experiments, the IOPS (IO operations per second) for a VM running on SSDs is 3.3× higher than that on HDDs. However, SSDs also have limitations, such as low capacity, high price, and limited lifetime. The more writes and erases are performed, the shorter the remaining SSD lifetime will be [7, 22]. In the commercial market, cloud storage providers, such as Morphlabs, Storm on Demand, CloudSigma and CleverKite, are selling SSD-powered cloud services [4]. On the other hand, device manufacturers, such as Intel and Samsung, are researching reliable SSDs for data centers [6, 27]. Thus, from the perspective of both sellers and manufacturers, SSDs have been accepted as an indispensable component of cloud and data center storage. A data center is typically equipped with several disk arrays, some of which are SSDs while the others are HDDs. These disk arrays are connected to servers via Fibre Channel [8, 9, 26]. As prices have decreased, SSDs have become more affordable for use in data centers. Currently, many leading Internet service providers, such as Facebook, Amazon and Dropbox, are starting to integrate SSDs into their cloud storage systems [3][4][5].
The storage media in data centers are becoming more diverse as both SSDs and HDDs are used to support cloud storage. Consequently, storage management, especially VM live storage migration, becomes more complex and challenging, and VM live storage migration is likely to encounter heterogeneous storage environments. Although SSDs deliver higher IO performance, their limited lifetime is an unavoidable issue. In fact, existing VM live storage migration schemes do not fully exploit the high-performance characteristics of SSDs, and they also aggravate the wear-out problem. Even worse, during massive storage migrations, SSDs will be worn out significantly by the large volume of write operations.