In the field of computer hardware and software technology, a virtual machine is a software implementation of a machine (computer) that executes program instructions like a real machine. Virtual machine technology allows multiple virtual machines to share the underlying physical resources.
In virtual machine environments, a storage volume may be presented as containing a greater amount of data than is actually held by the underlying storage volume that stores the data. For example, a virtual disk drive in a virtual machine environment may be presented to a user as containing 20 GB of data. However, the virtual disk file underlying the virtual disk drive may contain only 5 GB of actual data. Such an underlying storage volume may be considered a sparse storage volume.
To create sparseness, a primary storage volume is examined for regions that contain strings of zeroes. A region consisting of a string of zeroes can potentially be made sparse. To do so, rather than writing the entire empty block to the underlying storage volume, metadata is written to the underlying storage volume that describes the empty block that has been allocated in the primary storage volume. Over time, the underlying storage volume may become less sparse. However, the metadata that describes the storage of the data volume within the underlying storage volume can be analyzed to increase the sparseness of the underlying storage volume.
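The zero-detection approach described above can be illustrated with a minimal sketch. It assumes a fixed block size and models the underlying storage volume as a list of stored blocks plus a block map. The names `BLOCK_SIZE` and `build_sparse_image` are hypothetical and are not taken from any particular hypervisor.

```python
from typing import List, Optional, Tuple

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


def build_sparse_image(
    primary: bytes, block_size: int = BLOCK_SIZE
) -> Tuple[List[Optional[int]], List[bytes]]:
    """Scan a primary volume image block by block.

    Returns (block_map, stored_blocks). block_map[i] is the index of block i
    within stored_blocks, or None when block i contains only zeroes and is
    therefore recorded in metadata alone.
    """
    block_map: List[Optional[int]] = []
    stored_blocks: List[bytes] = []
    for offset in range(0, len(primary), block_size):
        block = primary[offset:offset + block_size]
        if block.count(0) == len(block):
            # All-zero block: describe it in metadata, do not store it.
            block_map.append(None)
        else:
            block_map.append(len(stored_blocks))
            stored_blocks.append(block)
    return block_map, stored_blocks


# Example: four blocks, of which the middle two are empty.
image = b"\x01" * BLOCK_SIZE + b"\x00" * (2 * BLOCK_SIZE) + b"\x02" * BLOCK_SIZE
block_map, stored_blocks = build_sparse_image(image)
```

In this sketch only two of the four blocks are written to the underlying store. Every byte of the primary volume must still be read to find the zero regions, which is consistent with the resource cost noted in the text.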
Unfortunately, this process can be very resource intensive, reducing the performance of a virtual machine and other operations within a virtual machine environment.
Overview
Disclosed are data storage systems and methods of operating data storage systems. In an embodiment, a method comprises generating first metadata describing storage of a volume of data in a first storage volume, storing the volume of data within a second storage volume, generating second metadata describing storage of the volume of data in the second storage volume, and processing the first metadata and the second metadata to increase sparseness of the volume of data stored in the second storage volume.
In an embodiment, the first storage volume comprises a virtual storage device, wherein the second storage volume comprises a virtual disk file corresponding to the virtual storage device, wherein the first metadata comprises a block bitmap for the virtual storage device, and wherein the second metadata comprises a block mapping table for the virtual disk file. The method may further comprise storing the virtual disk file on a physical storage device and storing the block bitmap in the virtual disk file.
In an embodiment, the data storage system comprises a processing system coupled to the physical storage device, a host operating system stored on the physical storage device and executable by the processing system, and a hypervisor executed by the processing system and configured to provide an interface between the host operating system and a virtual machine, wherein the virtual machine comprises virtual hardware, a guest operating system, and a guest application. Generating the volume of data may comprise executing the guest application to generate the volume of data. Generating the first metadata may comprise executing the guest operating system to generate the block bitmap. Generating the second metadata may comprise executing the hypervisor to generate the block mapping table.
In an embodiment, increasing the sparseness of the volume of data stored in the second storage volume using the first metadata and the second metadata comprises, in the hypervisor, creating a copy of the block mapping table, resulting in a new block mapping table, and creating a copy of the volume of data from the virtual disk file, resulting in a new virtual disk file. The new virtual disk file is created by, for each block identified in the block mapping table: if a corresponding block in the block bitmap is allocated, copying the data in the block to the new virtual disk file and identifying the block as allocated in the new block mapping table; and, if the corresponding block in the block bitmap is not allocated, identifying the block as unallocated in the new block mapping table. Increasing the sparseness may also include, in the physical storage device, replacing the virtual disk file with the new virtual disk file, and, in the hypervisor, replacing the block mapping table with the new block mapping table.
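The per-block procedure of this embodiment can be sketched as follows. This is an illustrative model only, under several assumptions: the block bitmap is a list of booleans, the block mapping table is a list whose entries are either a slot index in the virtual disk file or `None` for an unallocated block, and the virtual disk file is a list of byte strings. The function name `compact_virtual_disk` is hypothetical.

```python
from typing import List, Optional, Tuple


def compact_virtual_disk(
    block_bitmap: List[bool],
    block_mapping_table: List[Optional[int]],
    virtual_disk_file: List[bytes],
) -> Tuple[List[Optional[int]], List[bytes]]:
    """Build a new block mapping table and a new virtual disk file.

    A block is carried over only if the hypervisor's table maps it AND the
    guest's block bitmap still marks it as allocated.
    """
    new_table: List[Optional[int]] = list(block_mapping_table)  # copy of the table
    new_file: List[bytes] = []
    for block, slot in enumerate(block_mapping_table):
        if slot is None:
            continue  # not identified in the block mapping table
        if block_bitmap[block]:
            # Guest still uses the block: copy data, mark allocated.
            new_table[block] = len(new_file)
            new_file.append(virtual_disk_file[slot])
        else:
            # Guest has freed the block: mark unallocated, copy nothing.
            new_table[block] = None
    return new_table, new_file


# Example: the hypervisor maps blocks 0-2, but the guest has freed block 1.
bitmap = [True, False, True, False]
table = [0, 1, 2, None]
disk = [b"A", b"stale", b"C"]
new_table, new_disk = compact_virtual_disk(bitmap, table, disk)
```

The originals are left untouched until the new table and file are complete, which matches the final replacement step in the text: the caller would swap in `new_disk` and `new_table` only after the copy finishes. Note that this sketch inspects only metadata and never scans block contents for zeroes.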
In an embodiment, the first storage volume comprises a partitioned portion of a physical storage device, wherein the second storage volume comprises a virtual disk file corresponding to the partitioned portion, wherein the first metadata comprises a file access table for at least the partitioned portion, and wherein the second metadata comprises a block mapping table for the virtual disk file. The method further comprises storing the virtual disk file on the physical storage device.
In an embodiment, processing the first metadata and the second metadata to increase the sparseness of the volume of data stored in the second storage volume comprises transforming the second storage volume from a non-sparse state to a sparse state.
In an embodiment, processing the first metadata and the second metadata to increase the sparseness of the volume of data stored in the second storage volume comprises transforming the second storage volume from a sparse state to a more sparse state.
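The two transformations described in these embodiments can be distinguished with a simple measure. The sketch below assumes, purely for illustration, that sparseness is the fraction of entries in a block mapping table that are unallocated (`None`). The text does not define a metric, and the names `sparseness` and `classify_transformation` are hypothetical.

```python
from typing import List, Optional


def sparseness(block_mapping_table: List[Optional[int]]) -> float:
    """Fraction of blocks with no backing data (assumed definition)."""
    if not block_mapping_table:
        return 0.0
    unallocated = sum(1 for slot in block_mapping_table if slot is None)
    return unallocated / len(block_mapping_table)


def classify_transformation(
    before: List[Optional[int]], after: List[Optional[int]]
) -> str:
    """Label the change in sparseness between two block mapping tables."""
    s_before, s_after = sparseness(before), sparseness(after)
    if s_after <= s_before:
        return "no increase"
    if s_before == 0.0:
        return "non-sparse to sparse"
    return "sparse to more sparse"


# A fully allocated table becoming half unallocated.
first_case = classify_transformation([0, 1, 2, 3], [0, None, 1, None])
# A table with one unallocated block gaining a second.
second_case = classify_transformation([0, 1, 2, None], [0, None, 1, None])
```

Under the same assumed definition, the earlier example of a 20 GB virtual disk drive backed by 5 GB of actual data would have a sparseness of 0.75.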
In an embodiment, a data storage system comprises a processing system configured to generate first metadata describing storage of a volume of data in a first storage volume, generate second metadata describing storage of the volume of data in a second storage volume, and process the first metadata and the second metadata to increase sparseness of the volume of data stored in the second storage volume. The data storage system further comprises a physical storage device coupled to the processing system and configured to store the second storage volume, wherein the second storage volume stores the volume of data.