Virtualization is a technology that provides a software-based abstraction of a physical, hardware-based computer. In conventional solutions, an abstraction layer decouples physical hardware components (e.g., central processing unit (“CPU”), memory, disk drives, storage) from an operating system and allows numerous instances to run side-by-side as virtual machines (“VMs”) in isolation from each other. In conventional solutions, an operating system within a virtual machine has visibility into, and can perform data transactions with, a complete, consistent, and normalized set of hardware regardless of the actual individual physical hardware components underneath the software-based abstraction.
Virtual machines, in conventional solutions, are encapsulated as files (also referred to as images), making it possible to save, replay, edit, and copy a virtual machine in a manner similar to that of handling a file on a file system. This capability is fundamental to improving manageability, increasing flexibility, and enabling rapid administration as compared to using physical machines in place of virtualized ones.
However, virtual machines suffer from significant shortcomings, as VM files tend to be large and consume large amounts of disk space. Additionally, in conventional desktop usage solutions, each VM image is largely identical to other VM images, as most VMs tend to have the same version and an identical copy of an operating system and applications, which are typically used in similar ways. Unfortunately, conventional solutions create unnecessary redundancy and overhead as VM files are continuously read from and written to with duplicate (identical and redundant) information. Overhead, in conventional solutions, includes storage capacity overhead (i.e., each virtual machine takes several tens of gigabytes on average to store), storage access overhead (i.e., VMs are typically stored on storage systems that are shared, and every time a VM file needs to be read or written to, network and storage resources are required to perform the operation), and network overhead (i.e., transferring data to and from storage systems utilizes network bandwidth and is affected by latency). In some conventional solutions, out-of-band processing (i.e., using a connection for control data that is different from a connection used for main (e.g., payload) data) or post-processing of data in write operations is limited in that data is first written in its full form and then reduced and re-written to storage, which is resource-intensive for storage and processing resources. Another problem with conventional solutions is that block-level range locking (i.e., preventing two write operations from being performed on a given block of data at the same time) may be required because a VM may have write operations that are in-flight and not yet written to storage (i.e., on disk), thus degrading performance and introducing opportunities for conflict or latency. Similarly, some conventional solutions also require block locking because individual blocks could be changed without the change yet being reflected on disk, which also degrades overall system performance.
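The redundancy described above can be illustrated with a minimal sketch that counts how many fixed-size blocks are duplicated across a set of VM images. This is an illustration only, not a method taken from any conventional solution: the 4096-byte block size, the use of SHA-256 as a block fingerprint, and the names `split_blocks` and `redundancy_report` are all assumptions introduced for the example, and the images are toy in-memory byte strings rather than real VM files.

```python
import hashlib

# Assumed block size; the text does not specify one.
BLOCK_SIZE = 4096


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Split an image into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def redundancy_report(images: dict[str, bytes],
                      block_size: int = BLOCK_SIZE) -> dict[str, int]:
    """Count total, unique, and duplicate blocks across all images.

    Hypothetical helper: blocks are compared by their SHA-256 digest.
    """
    total = 0
    unique = set()
    for data in images.values():
        for block in split_blocks(data, block_size):
            total += 1
            unique.add(hashlib.sha256(block).digest())
    return {
        "total_blocks": total,
        "unique_blocks": len(unique),
        "duplicate_blocks": total - len(unique),
    }


if __name__ == "__main__":
    # Two toy "images" that share two blocks (standing in for a common
    # operating system and applications) and differ in one block.
    shared = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE
    images = {
        "vm1": shared + b"1" * BLOCK_SIZE,
        "vm2": shared + b"2" * BLOCK_SIZE,
    }
    print(redundancy_report(images))
    # {'total_blocks': 6, 'unique_blocks': 4, 'duplicate_blocks': 2}
```

In this toy case, two of the six stored blocks carry no new information, which is the storage capacity overhead the paragraph refers to; with many desktops sharing one operating system image, the duplicated fraction would be expected to be far larger.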
Similarly, read operations in conventional solutions are inefficient because in-band recomposition or rehydration must occur. A read operation results in reassembly of blocks of data based on indexes, which is both computationally and mechanically intensive (i.e., disk-seek activities are increased). In conventional shared storage systems, additional queuing ensues as multiple applications compete to have their requests serviced by a storage system. Because desktops are interactive workloads, they are sensitive to latency and timeouts and, consequently, network overhead results in poor performance.
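The index-based reassembly described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of any particular storage system: the block store is an in-memory dictionary, blocks are keyed by SHA-256 digest, and the names `store_deduplicated` and `rehydrate` are hypothetical.

```python
import hashlib


def store_deduplicated(data: bytes, block_size: int,
                       block_store: dict[str, bytes]) -> list[str]:
    """Write data as deduplicated blocks and return its index.

    The index is the ordered list of block keys needed to rebuild the data.
    Each distinct block is kept only once in block_store.
    """
    index = []
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        key = hashlib.sha256(block).hexdigest()
        block_store.setdefault(key, block)  # keep the first copy only
        index.append(key)
    return index


def rehydrate(index: list[str], block_store: dict[str, bytes]) -> bytes:
    """Rebuild the original data by looking up every block named in the index.

    Each lookup stands in for a separate read (and possibly a disk seek)
    on a real storage system.
    """
    return b"".join(block_store[key] for key in index)


if __name__ == "__main__":
    store: dict[str, bytes] = {}
    original = b"abcd" * 3 + b"xy"
    index = store_deduplicated(original, 4, store)
    print(len(index), len(store))  # 4 index entries, 2 stored blocks
    assert rehydrate(index, store) == original
```

Even in this small case, a single read requires one lookup per index entry. On a storage system, where logically adjacent blocks may be physically scattered, those lookups correspond to the increased disk-seek activity noted above.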
Thus, what is needed is a solution for improving data handling in a virtualized desktop environment without the limitations of conventional techniques.