Conventional solutions for virtualization technology provide numerous capabilities to efficiently deliver applications and desktops by packaging them as virtual machines. Virtualization is a technology that provides a software based abstraction to a physical hardware based computer. The abstraction layer decouples the physical hardware components—CPU, memory, and disk from the Operating System (OS) and thus allows many instances of an OS to be run side-by-side as virtual machines (VMs) in complete isolation to one another. The OS within each virtual machine sees a complete, consistent and normalized set of hardware regardless of the actual physical hardware underneath the software based abstraction. Virtual machines are encapsulated as files (also referred to as images) making it possible to save, replay, edit, copy, cut, and paste the virtual machine like any other file on a file-system. This ability is fundamental to enabling better manageability and more flexible and quick administration compared to physical virtual machines.
These benefits notwithstanding, conventional VMs suffer from several performance related weaknesses that arise out of the way the VM interfaces with the storage sub-system(s) that stores the VM images or files. Those performance weaknesses include but are not limited to the following examples.
First, every read operation or write operation performed by every single VM (and there can be hundreds if not thousands of VMs performing such operations concurrently) is serviced in a queue by the storage system. This creates a single point of contention that results in below-par performance.
Second, the storage system usually blocks all write operations until a read request is fulfilled. Therefore, the preference given to read IO's results in data that flows in fits and bursts as the storage system comes under load. In more advanced storage architectures, storage pools are created to isolate applications from being blocked by each other but the effect is still experienced within the pool.
Third, there are numerous latencies that develop as input/output (IO) is queued at various points in an IO stack from a VM hypervisor to the storage system. Examples of latencies include but are not limited to: (a) when an application residing inside a Guest OS issues an IO, that IO gets queued to the Guest OS's Virtual Adapter driver; (b) the Virtual Adapter driver then passes the IO to a LSI Logic/BusLogic emulator; (c) the LSI Logic/BusLogic emulator queues the IO to a VMkernel's Virtual SCSI layer, and depending on the configuration, IOs are passed directly to the SCSI layer or are passed thru a Virtual Machine File System (VMFS) file system before the IO gets to the SCSI layer; (d) regardless of the path followed in (c), ultimately all IOs will end up at the SCSI layer; and (e) IOs are then sent to a Host Bus Adapter driver queue. From then on, IOs hit a disk array write cache and finally a back-end disk. Each example in (a)-(e) above introduces various degrees of latency.
Fourth, Least Recently Used (LRU)/Least Frequently Used (LFU)/Adaptive Replacement (ARC) cache replacement techniques all ultimately rely on building a frequency histogram of block storage access to determine a value for keeping or replacing a block from cache memory. Therefore, storage systems that rely on these cache management techniques will not be effective when servicing virtualization workloads especially Desktop VMs as the working set is too diverse for these techniques to manage cache consolidation and not cause cache fragmentation.
Fifth, in a virtualization environment, there typically exist multiple hierarchical caches in different subsystems—i.e. the Guest OS, the VM Hypervisor and a Storage Area Network (SAN)/Network Attached Storage (NAS) storage layer. As all the caches are independent of each other and unaware of each other, each cache implements the same cache replacement policies (e.g., algorithms) and thus end up all caching the same data within each independent cache. This results in an inefficient usage of the cache as cache capacity is lost to storing the same block multiple times. This is referred to as the cache inclusiveness problem and cannot be overcome without the use of external mechanisms to co-ordinate the contents of the multiple hierarchical caches in different subsystems.
Finally, SAN/NAS based storage systems that are under load ultimately will always be at a disadvantage to service virtualization workloads as they will need to service every IO operation from disk as cache will be overwhelmed and fragment in the face of a large working set and because of diminished capacity within the caches due to the aforementioned cache inclusiveness problem.
The above performance weakness examples are a non-exhaustive list and there are other performance weaknesses in conventional virtualization technology.
There are continuing efforts to improve processes, cache techniques, software, data structures, hardware, and systems for virtualization technology.
Although the above-described drawings depict various examples of the invention, the invention is not limited by the depicted examples. It is to be understood that, in the drawings, like reference numerals designate like structural elements. Also, it is understood that the drawings are not necessarily to scale.