A “virtual machine” or a “VM” refers to a specific software-based implementation of a machine in a virtualization environment, in which the hardware resources of a real computer (e.g., CPU, memory, etc.) are virtualized or transformed into the underlying support for the fully functional virtual machine that can run its own operating system and applications on the underlying physical resources just like a real computer.
Virtualization works by inserting a thin layer of software directly on the computer hardware or on a host operating system. This layer of software contains a virtual machine monitor or “hypervisor” that allocates hardware resources dynamically and transparently. Multiple operating systems run concurrently on a single physical computer and share hardware resources with each other. By encapsulating an entire machine, including CPU, memory, operating system, and network devices, a virtual machine is completely compatible with most standard operating systems, applications, and device drivers. Most modern implementations allow several operating systems and applications to safely run at the same time on a single computer, with each having access to the resources it needs when it needs them.
Virtualization allows one to run multiple virtual machines on a single physical machine, with each virtual machine sharing the resources of that one physical computer across multiple environments. Different virtual machines can run different operating systems and multiple applications on the same physical computer.
One reason for the broad adoption of virtualization in modern business and computing environments is because of the resource utilization advantages provided by virtual machines. Without virtualization, if a physical machine is limited to a single dedicated operating system, then during periods of inactivity by the dedicated operating system the physical machine is not utilized to perform useful work. This is wasteful and inefficient if there are users on other physical machines which are currently waiting for computing resources. To address this problem, virtualization allows multiple VMs to share the underlying physical resources so that during periods of inactivity by one VM, other VMs can take advantage of the resource availability to process workloads. This can produce great efficiencies for the utilization of physical devices, and can result in reduced redundancies and better resource cost management.
Data Centers are typically architected as diskless computers (“application servers”) talking to a set of networked storage appliances (“storage servers”) via a Fiber Channel or Ethernet network. A storage server exposes volumes that are mounted by the application servers for their storage needs. If the storage server is a block-based server, it exposes a set of volumes that are also called Logical Unit Numbers (LUNs). If, on the other hand, a storage server is file-based, it exposes a set of volumes that are also called file systems. Either way, a volume is the smallest unit of administration for a storage device, e.g., a storage administrator can set policies to backup, snapshot, RAID-protect, or WAN-replicate a volume, but cannot do the same operations on a region of the LUN, or on a specific file in a file system.
Storage devices comprise one type of physical resources that can be managed and utilized in a virtualization environment. For example, VMWare is a company that provides products to implement virtualization, in which networked storage devices are managed by the VMWare virtualization software to provide the underlying storage infrastructure for the VMs in the computing environment. The VMWare approach implements a file system (VMFS) that exposes emulated storage hardware to the VMs. The VMWare approach uses VMDK “files” to represent virtual disks that can be accessed by the VMs in the system Effectively, a single volume can be accessed and shared among multiple VMs.
While this known approach does allow multiple VMs to perform I/O activities upon shared networked storage, there are also numerous drawbacks and inefficiencies with this approach. For example, the VMWare approach only allows access to networked storage, and does not have the ability to use direct attached storage as part of a shared storage architecture. While the virtualization administrator needs to manage VMs, the storage administrator is forced to manage coarse-grained volumes that are shared by multiple VMs. Configurations such as backup and snapshot frequencies, RAID properties, replication policies, performance and reliability guarantees etc. continue to be at a volume level, and that is problematic. Moreover, this conventional approach does not allow for certain storage-related optimizations to occur in the primary storage path.
Related application Ser. No. 13/207,345 describes an improved architecture for managing I/O and storage devices in a virtualization environment. The approach of Nutanix-001 provides for specially configured virtual machines (referred to as “Service VMs”) to control and manage any type of storage device, including directly attached storage in addition to networked and cloud storage. The Service VM implements a storage controller in the user space, and can virtualizes I/O access to storage hardware. IP-based requests are used to send I/O request to the Service VMs. The Service VM can directly implement storage and I/O optimizations within the direct data access path, without the need for add-on products. Related application Ser. No. 13/207,357 describes an approach for using advanced metadata to implement the architecture for managing I/O operations and storage devices for a virtualization environment. The advanced metadata is used to track data across the storage devices. A lock-free approach is implemented in some embodiments to access and modify the metadata.
The problem addressed by the present application is that certain maintenance and optimization tasks should be performed to manage the operations of a virtual storage system. For example, garbage collection tasks should be performed when using advanced metadata techniques as described in the related application Ser. No. 13/207,357. However, conventional technologies do not provide efficient or scalable solutions to these maintenance and optimization requirements.