The present disclosure relates generally to information handling systems, and more particularly to a system to refactor virtual data storage hierarchies using an information handling system.
As the value and use of information continues to increase, individuals and businesses seek additional ways to process and store information. One option is an information handling system (IHS). An IHS generally processes, compiles, stores, and/or communicates information or data for business, personal, or other purposes. Because technology and information handling needs and requirements may vary between different applications, IHSs may also vary regarding what information is handled, how the information is handled, how much information is processed, stored, or communicated, and how quickly and efficiently the information may be processed, stored, or communicated. The variations in IHSs allow for IHSs to be general or configured for a specific user or specific use such as financial transaction processing, airline reservations, enterprise data storage, or global communications. In addition, IHSs may include a variety of hardware and software components that may be configured to process, store, and communicate information and may include one or more computer systems, data storage systems, and networking systems.
A virtual disk drive is generally known in the art as a data storage drive, such as a hard disk drive, a floppy drive, a CD/DVD drive, a solid state drive, main memory, network sharing, or others, where the data storage drive is emulated in some fashion by an IHS. It should be understood that a virtual disk drive may be any type of data storage device and does not necessarily require a disk drive. Some virtual data storage formats, such as virtual hard disk drive file formats, provide a feature called “differencing disks” that can be used to save physical storage space and improve the manageability of a similar operating system image across multiple virtual machines. A differencing disk/tree generally allows one to create a new data storage drive from a parent drive, such that all changes from that point forward go to the new drive. Thus, the data on the parent drive will not be further modified. As such, the original data may be maintained on the parent drive and the changed data may be saved to the new drive.
FIG. 1 illustrates a block diagram of a prior art differencing drive system in which a base virtual data storage drive is created by installing a common operating system onto it. This drive is then “locked” and becomes the root of a differencing tree (or hierarchy). For each virtual machine that will use this operating system, a second, subordinate differencing virtual drive is created. All writes the virtual machine makes are captured in the differencing drive. Reads for a block of data pull from this drive first, and fall through to the base drive if the virtual machine has never written that block of data. Data storage space savings arise from the common, unchanged blocks of data being represented only once on the physical storage device, especially when combined with the use of dynamic (sparse) drive representations. Improved manageability is a result of having to perform an installation of the base operating system only once, and then “forking” it as many times as needed for virtual machines that will be based upon it. Note that the differencing hierarchy can be an arbitrary tree as shown in FIG. 2, where each leaf node is assigned to a virtual machine, and all interior nodes are “locked”.
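By way of illustration only, the read and write behavior of a differencing hierarchy described above may be sketched as follows. The class name `VirtualDrive`, its methods, and the tiny `BLOCK_SIZE` are hypothetical and are not taken from any particular virtual hard disk format; the sketch assumes an in-memory sparse block map and that a block never written anywhere in the chain reads back as zeros.

```python
from typing import Dict, Optional

BLOCK_SIZE = 4  # hypothetical, deliberately tiny block size for illustration


class VirtualDrive:
    """A sparse virtual drive that may sit anywhere in a differencing tree."""

    def __init__(self, parent: Optional["VirtualDrive"] = None) -> None:
        self.parent = parent
        self.blocks: Dict[int, bytes] = {}  # only blocks written to this drive
        self.locked = False

    def fork(self) -> "VirtualDrive":
        """Lock this drive and return a new differencing child of it."""
        self.locked = True
        return VirtualDrive(parent=self)

    def write(self, block_no: int, data: bytes) -> None:
        """Capture a write in this drive; locked (interior) drives refuse it."""
        if self.locked:
            raise PermissionError("drive is locked and cannot be re-written")
        self.blocks[block_no] = data

    def read(self, block_no: int) -> bytes:
        """Read from this drive first, then fall through toward the root."""
        drive: Optional[VirtualDrive] = self
        while drive is not None:
            if block_no in drive.blocks:
                return drive.blocks[block_no]
            drive = drive.parent
        return b"\x00" * BLOCK_SIZE  # assumption: unwritten blocks read as zeros


# Install the common image once, then fork one differencing drive per virtual machine.
base = VirtualDrive()
base.write(0, b"OS00")
vm_a = base.fork()
vm_b = base.fork()
vm_a.write(0, b"A000")  # captured in vm_a only; the base drive is unchanged
```

In this sketch, `vm_a.read(0)` is served from the differencing drive of `vm_a`, while `vm_b.read(0)` falls through to the locked base drive, so the unchanged block is stored only once.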
A problem with this type of virtual drive system is that, over time, the differencing drives begin to fill up with blocks of data that have the same content across different virtual machines. Consider, for example, applying an operating system patch to such a virtual drive system. Ideally, the patch would be applied to the root node, but that node is “locked” and cannot be re-written. Therefore, the same data contents are written to each differencing drive. Furthermore, the common data content will not likely be written to the same block locations on each drive. Other systems perform block de-duplication using signatures to identify similar blocks of data for consolidation. Thus, differencing disks generally avoid duplication in a “forward” direction, meaning that the single instances of blocks are planned up-front.
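By way of illustration only, the duplication problem and the signature-based identification of common blocks mentioned above may be sketched as follows. The function names, the use of SHA-256 as the signature, and the representation of each differencing drive as a mapping from block number to block contents are assumptions made for this sketch; the sketch shows the general signature technique and not any particular de-duplication product.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Tuple

Location = Tuple[str, int]  # (drive name, block number)


def block_signature(data: bytes) -> str:
    """Return a content signature for a block (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def find_shared_blocks(
    drives: Dict[str, Dict[int, bytes]]
) -> Dict[str, List[Location]]:
    """Map each signature seen on more than one drive to where it occurs."""
    locations: Dict[str, List[Location]] = defaultdict(list)
    for name, blocks in drives.items():
        for block_no, data in blocks.items():
            locations[block_signature(data)].append((name, block_no))
    # Keep only content appearing on at least two distinct drives. A real
    # system would also compare the bytes to rule out a signature collision.
    return {
        sig: locs
        for sig, locs in locations.items()
        if len({name for name, _ in locs}) > 1
    }


# The same patch content lands at different block numbers on each drive.
drives = {
    "vm_a": {7: b"PATCH", 9: b"only-a"},
    "vm_b": {12: b"PATCH"},
}
shared = find_shared_blocks(drives)
```

Here `shared` holds a single entry whose locations are block 7 of `vm_a` and block 12 of `vm_b`, showing that matching by content signature rather than by block number is what allows the common patch data to be recognized.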
Accordingly, it would be desirable to provide improved refactoring for virtual data storage hierarchies absent the disadvantages discussed above.