Business entities and consumers are storing an ever-increasing amount of digitized data. For example, many commercial entities are in the process of digitizing their business records and/or other data. Similarly, web-based service providers generally engage in transactions that are primarily digital in nature. Thus, techniques and mechanisms that facilitate efficient and cost-effective storage of vast amounts of digital data are being implemented.
To link remote (or even locally dispersed) locations that require access to stored data, and/or to promote the continued availability of such data in the event of hardware, software, or even site failures (e.g., power outages, sabotage, natural disasters), entities have developed clustered networks that link disparate storage media to a plurality of clients. Typically, to access data, one or more clients can connect to respective nodes of a clustered storage environment, where the nodes are linked by a cluster fabric that provides communication between the disparate nodes. Nodes can be dispersed locally, such as in a same geographical location, and/or dispersed over great distances, such as around the country.
A virtual server environment can comprise multiple physical controllers, such as servers, that access a distributed data storage and management system. Respective controllers may comprise a plurality of virtual machines (VMs) that reside and execute on the controller. A VM (also known as a virtual server or virtual desktop) may comprise its own operating system and one or more applications that execute on the controller. As such, a VM can function as a self-contained desktop environment on the controller that is emulated on a client attached to the controller, for example, and multiple operating systems may execute concurrently on the controller.
VMs on a controller can be configured to share hardware resources of the controller and, if connected to a distributed data storage and management system (cluster), to share hardware resources of the cluster. A VM monitor module/engine (hypervisor) may be used to manage the VMs on respective controllers, and also to virtualize hardware and/or software resources of the controllers in the cluster for use by the VMs. Clients can be connected to the cluster and used to interact with a particular VM, and to emulate a desktop environment, such as a virtual desktop environment, on the client machine. From the viewpoint of a client, the VM may comprise a virtual desktop or server that appears as an actual desktop machine environment or physical server.
Multiple executing VMs may be logically separated and isolated within a cluster to avoid conflicts or interference between applications of the different VMs. In this way, for example, a security issue or application crash in one VM may not affect the other VMs on the same controller, or in the cluster. Further, a preferred version of a VM may be cloned and deployed throughout a cluster, and transferred between controllers in the virtual server environment.
Often, a preferred version of a VM (baseline VM) is cloned a plurality of times and deployed, such as in a same controller or over a cluster, for access by attached clients. For example, virtual desktop infrastructures (VDIs) utilize cloned VMs to emulate desktop environments on clients, such as in secure working environments, and/or where retaining data for a cloned baseline VM may not be necessary. In this example, important information may be maintained on the controller or cluster, while transient data is destroyed when the baseline VM is redeployed to the clones. As an example, a baseline VM may be known as a parent VM and the clones of the baseline VM may be known as child VMs, such that the parent VM can be redeployed to the child VMs, thus “re-baselining” the child VMs to a desired (baseline) state. Redeploying a baseline VM to the child VMs can also be used when software or configuration updates have been performed on the baseline VM, so that these updates can be easily rolled out to the child VMs.
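As an illustration only, the parent/child relationship and the effect of redeploying a baseline can be sketched as follows. This is a minimal model under stated assumptions, not an implementation from any particular product: the `VM` class, its `software` and `transient` fields, and the `clone` and `redeploy` helpers are hypothetical names, and a real redeployment operates on disk images rather than in-memory dictionaries.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class VM:
    """Hypothetical stand-in for a virtual machine's state."""
    name: str
    software: dict = field(default_factory=dict)   # package name -> version
    transient: list = field(default_factory=list)  # per-session data


def clone(baseline: VM, name: str) -> VM:
    """Create a child VM whose software state copies the baseline's."""
    return VM(name=name, software=copy.deepcopy(baseline.software))


def redeploy(baseline: VM, children: list) -> None:
    """Re-baseline every child: adopt the baseline state, drop transient data."""
    for child in children:
        child.software = copy.deepcopy(baseline.software)
        child.transient.clear()


if __name__ == "__main__":
    parent = VM("baseline", {"os": "1.0"})
    kids = [clone(parent, f"child-{i}") for i in range(3)]
    kids[0].transient.append("scratch file")  # session data on one child
    parent.software["os"] = "1.1"             # update applied to the baseline
    redeploy(parent, kids)                    # roll the update out
    print([k.software for k in kids])
```

The sketch copies the full baseline state into every child, which is simple to reason about but says nothing about storage efficiency.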
Currently, child VMs can be refreshed back to the baseline VM state, such as after changes have been made to the child VM. The refresh utility uses a copy-on-write delta file that logs any changes made to a particular child. These copy-on-write files can become quite large over time, if not refreshed, as a plurality of changes are made to the child. Further, child VMs can be recomposed, which allows patches and software updates to be pushed out to the child VMs from the baseline VM. A snapshot file is created of the baseline VM and rolled out to the children using a form of replication of the baseline VM.
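The copy-on-write behavior described above can be illustrated with a minimal sketch. It assumes a block-addressed disk; `BaselineDisk`, `ChildDisk`, and their methods are hypothetical names chosen for illustration, and an in-memory dictionary stands in for the delta file.

```python
class BaselineDisk:
    """Read-only block store standing in for the baseline VM's disk."""

    def __init__(self, blocks):
        self._blocks = tuple(blocks)

    def read(self, index):
        return self._blocks[index]

    def __len__(self):
        return len(self._blocks)


class ChildDisk:
    """Child view of a baseline disk with a copy-on-write delta."""

    def __init__(self, baseline):
        self.baseline = baseline
        self.delta = {}  # block index -> changed data (the "delta file")

    def write(self, index, data):
        if not 0 <= index < len(self.baseline):
            raise IndexError("block index out of range")
        # Changes are logged in the delta; the baseline is never modified.
        self.delta[index] = data

    def read(self, index):
        # Prefer the child's own change, else fall through to the baseline.
        if index in self.delta:
            return self.delta[index]
        return self.baseline.read(index)

    def delta_size(self):
        """Number of changed blocks; grows as more blocks are written."""
        return len(self.delta)

    def refresh(self):
        """Return the child to the baseline state by discarding the delta."""
        self.delta.clear()
```

In this model, a refresh simply discards the delta, and the delta grows with every distinct block the child changes, which mirrors why such files become large when they are not refreshed.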
Presently, redeploying a baseline VM to the child VMs is inefficient, and is limited to virtual desktops and development labs due to performance problems. For example, the present use of copy-on-write files to maintain differences between a master copy and a linked clone (baseline/child relationship) creates storage and access problems. The copy-on-write files can become large quickly, and are cumbersome to manage because they have to be refreshed periodically. Further, the process used to link the master to the clones is very slow and creates storage efficiency problems.