The present disclosure generally relates to techniques for enabling coarse-grained volume snapshots for virtual machine backup and restore and, more specifically, to techniques for enabling coarse-grained volume snapshots for virtual machine backup and restore while minimizing performance impact and virtual disk footprint.
In general, cloud computing refers to Internet-based computing where shared resources, software, and information are provided to users of computer systems and other electronic devices (e.g., mobile phones) on demand, similar to the electricity grid. Adoption of cloud computing has been aided by the widespread utilization of virtualization, which is the creation of a virtual (rather than actual) version of something, e.g., an operating system, a server, a storage device, network resources, etc. A virtual machine (VM) is a software implementation of a physical machine (PM), e.g., a computer system, that executes instructions as a PM does. VMs are usually categorized as system VMs or process VMs. A system VM provides a complete system platform that supports the execution of a complete operating system (OS). In contrast, a process VM is usually designed to run a single program and support a single process. A characteristic of a VM is that application software running on the VM is limited to the resources and abstractions provided by the VM. System VMs (also referred to as hardware VMs) allow the sharing of the underlying PM resources between different VMs, each of which executes its own OS. The software that provides the virtualization and controls the VMs is typically referred to as a VM monitor (VMM) or hypervisor. A hypervisor may run on bare hardware (Type 1 or native VMM) or on top of an operating system (Type 2 or hosted VMM).
Cloud computing provides a consumption and delivery model for information technology (IT) services based on the Internet and involves over-the-Internet provisioning of dynamically scalable and usually virtualized resources. Cloud computing is facilitated by ease-of-access to remote computing websites (e.g., via the Internet or a private corporate network) and frequently takes the form of web-based tools or applications that a cloud consumer can access and use through a web browser, as if the tools or applications were a local program installed on a computer system of the cloud consumer. Commercial cloud implementations are generally expected to meet quality of service (QoS) requirements of consumers and typically include service level agreements (SLAs). Cloud consumers avoid capital expenditures by renting usage from a cloud vendor (i.e., a third-party provider). In a typical cloud implementation, cloud consumers consume resources as a service and pay only for resources used.
The IBM® Storwize® V7000 system provides copy services features that facilitate copying volumes or logical unit numbers (LUNs). The Storwize® V7000 FlashCopy function transfers a point-in-time copy of a source volume to a designated target volume. In its basic mode, the FlashCopy function copies the contents of a source volume to a target volume. In this case, any data that existed on the target volume is lost and is replaced by the copied data. The FlashCopy function is sometimes described as an instance of a time-zero (T0) copy technology. It is difficult to make a consistent copy of a dataset that is constantly updated, and point-in-time copy techniques address this difficulty by capturing the dataset as it existed at a single instant.
If a copy of a dataset is created using a technology that does not provide point-in-time techniques and the dataset changes during the copy operation, the resulting copy may contain data that is not consistent. For example, if a reference to an object is copied earlier than the object itself and the object is moved before it is copied, the copy contains the referenced object at its new location, but the copied reference still points to the previous location. More advanced FlashCopy functions allow operations to occur on multiple source and target volumes. FlashCopy management operations are coordinated to provide a single common point-in-time for copying source volumes to their respective target volumes. In this manner, the FlashCopy function may be used to create a consistent copy of data that spans multiple volumes. The FlashCopy function also allows multiple target volumes to be copied from each source volume to facilitate creating images from different points-in-time for each source volume.
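The reference-and-object example above can be made concrete with a small sketch. The following Python code is illustrative only and does not represent the FlashCopy implementation; it assumes a toy in-memory volume and uses copy-on-write, one common way to realize a point-in-time copy. The names `Volume`, `Snapshot`, `naive_copy`, and `move_object` are hypothetical and are introduced solely for this illustration.

```python
class Volume:
    """Toy block volume: a list of block contents (hypothetical model)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []  # snapshots that must be told about overwrites

    def write(self, index, data):
        # Copy-on-write: hand the old contents to every snapshot before
        # the block is overwritten on the source volume.
        for snap in self.snapshots:
            snap.preserve(index, self.blocks[index])
        self.blocks[index] = data


class Snapshot:
    """Point-in-time (T0) view of a Volume, taken at construction time."""

    def __init__(self, source):
        self.source = source
        self.saved = {}  # block index -> contents as of T0
        source.snapshots.append(self)

    def preserve(self, index, old_data):
        # Only the first overwrite after T0 matters; later ones are ignored.
        self.saved.setdefault(index, old_data)

    def read(self, index):
        # Blocks changed since T0 come from the saved copy; unchanged
        # blocks are read straight from the source volume.
        if index in self.saved:
            return self.saved[index]
        return self.source.blocks[index]


def move_object(volume):
    """Move the object from block 2 to block 3 and update the reference."""
    volume.write(3, "OBJ")
    volume.write(2, "")
    volume.write(0, "ref->3")


def naive_copy(volume, concurrent_update):
    """Copy block by block while an update lands after block 0 is copied."""
    copy = []
    for i in range(len(volume.blocks)):
        copy.append(volume.blocks[i])
        if i == 0:
            concurrent_update(volume)
    return copy
```

With a volume initialized as `["ref->2", "", "OBJ", ""]`, `naive_copy` yields a copy whose reference still names block 2 while the object sits in block 3, which is the inconsistency described above. Creating a `Snapshot` before the same update and reading every block through it returns the volume exactly as it was at the time the snapshot was taken. A common point-in-time across multiple volumes could be modeled, under the same assumptions, by creating one `Snapshot` per volume before any further write is admitted.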
When a volume is created, the volume can be designated as a thin-provisioned volume with a virtual capacity and a real capacity. The virtual capacity is the volume storage capacity that is available to a host. The real capacity is the storage capacity that is allocated to a volume from a storage pool. In a fully allocated volume, the virtual capacity and real capacity are the same. In a thin-provisioned volume, however, the virtual capacity can be much larger than the real capacity. Each system uses the real capacity to store data that is written to the volume and metadata that describes the thin-provisioned configuration of the volume. As more information is written to the volume, more of the real capacity is used.
Thin-provisioned volumes can help simplify server administration. For example, instead of assigning a volume with some capacity to an application and increasing that capacity as the needs of the application change, a volume can be configured with a large virtual capacity for the application, and the real capacity can be increased or decreased, as storage requirements of the application change, without disrupting the application or server. However, input/output (I/O) rates that are obtained from thin-provisioned volumes can be slower than those obtained from fully allocated volumes that are allocated on the same managed disk due to the need to access and process the extra metadata describing the contents of thin-provisioned volumes.
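The relationship between virtual capacity and real capacity described above may be sketched as follows. This Python model is a simplified illustration under stated assumptions, not the Storwize® V7000 implementation: capacity is counted in whole blocks, the space consumed by metadata is ignored, and the names `StoragePool`, `ThinVolume`, and `grow` are hypothetical.

```python
class StoragePool:
    """Hypothetical storage pool that hands out real capacity in blocks."""

    def __init__(self, capacity_blocks):
        self.free_blocks = capacity_blocks

    def allocate(self, count):
        if count > self.free_blocks:
            raise RuntimeError("storage pool exhausted")
        self.free_blocks -= count


class ThinVolume:
    """Volume whose virtual capacity may exceed its real capacity."""

    def __init__(self, pool, virtual_blocks, real_blocks):
        pool.allocate(real_blocks)
        self.pool = pool
        self.virtual_blocks = virtual_blocks  # capacity visible to the host
        self.real_blocks = real_blocks        # capacity taken from the pool
        # Metadata: maps each written virtual block to its stored data.
        # Every read and write consults this map, which is the extra
        # lookup that a fully allocated volume does not need.
        self.block_map = {}

    @property
    def used_blocks(self):
        return len(self.block_map)

    def _check(self, vblock):
        if not 0 <= vblock < self.virtual_blocks:
            raise IndexError("block outside virtual capacity")

    def write(self, vblock, data):
        self._check(vblock)
        is_new = vblock not in self.block_map
        if is_new and self.used_blocks >= self.real_blocks:
            raise RuntimeError("real capacity exhausted; grow the volume")
        self.block_map[vblock] = data

    def read(self, vblock):
        self._check(vblock)
        # Assumption: blocks never written read back as a zero byte.
        return self.block_map.get(vblock, b"\x00")

    def grow(self, extra_blocks):
        # Increase real capacity without changing what the host sees.
        self.pool.allocate(extra_blocks)
        self.real_blocks += extra_blocks
```

For example, a `ThinVolume` created with `virtual_blocks=1000` and `real_blocks=2` accepts writes to any two of its 1000 addressable blocks; a write to a third distinct block fails until `grow` assigns more real capacity from the pool, while the host-visible virtual capacity stays unchanged throughout.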