A “virtual machine” or “VM” refers to a specific software-based implementation of a machine in a virtualization environment, in which the hardware resources of a real computer (e.g., CPU, memory) are virtualized or transformed into the underlying support for a fully functional virtual machine that can run its own operating system and applications on the underlying physical resources, just like a real computer.
Virtualization works by inserting a thin layer of software directly on the computer hardware or on a host operating system. This layer of software contains a virtual machine monitor or “hypervisor” that allocates hardware resources dynamically and transparently. Multiple operating systems run concurrently on a single physical computer and share hardware resources with each other. By encapsulating an entire machine, including CPU, memory, operating system, and network devices, a virtual machine is completely compatible with most standard operating systems, applications, and device drivers. Most modern implementations allow several operating systems and applications to safely run at the same time on a single computer, with each having access to the resources it needs when it needs them.
Virtualization allows one to run multiple virtual machines on a single physical machine, with the virtual machines sharing the resources of that one physical computer across multiple environments. Different virtual machines can run different operating systems and multiple applications on the same physical computer.
One reason for the broad adoption of virtualization in modern business and computing environments is the resource utilization advantage provided by virtual machines. Without virtualization, if a physical machine is limited to a single dedicated operating system, then during periods of inactivity by the dedicated operating system the physical machine is not utilized to perform useful work. This is wasteful and inefficient if there are users on other physical machines that are currently waiting for computing resources. To address this problem, virtualization allows multiple VMs to share the underlying physical resources so that during periods of inactivity by one VM, other VMs can take advantage of the resource availability to process workloads. This can produce great efficiencies in the utilization of physical devices, and can result in reduced redundancies and better resource cost management.
The underlying data in a virtualization environment may be in the form of a distributed file system (such as Network File System or NFS). In order to access particular data content, a client will often, through an NFS server, first consult the metadata for location information of the desired data content, and then access the data content at the identified location. While clients are typically able to read from the metadata through a plurality of different data servers, a dedicated metadata server is typically used in order to write or update the metadata of the distributed file system (e.g., rewrite metadata, change ownership of metadata, etc.). The dedicated metadata server serializes requests to modify or update the metadata, in order to preserve data integrity. Because every metadata update and modification must pass through this one server, however, it functions as a chokepoint, and the scalability of the system is limited.
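The access pattern described above can be illustrated with a minimal sketch. It is not drawn from NFS or from any particular product; the names MetadataServer, DataServer, write_file, read_file, and demo are hypothetical and are introduced here only for illustration. The sketch assumes that the metadata maps a file path to the name of the data server holding the content, and that a single lock stands in for the serialization performed by the dedicated metadata server.

```python
import threading


class MetadataServer:
    """Hypothetical dedicated metadata server (illustrative only)."""

    def __init__(self):
        self._lock = threading.Lock()   # serializes every metadata update
        self._locations = {}            # path -> name of the data server
        self.updates_applied = 0

    def lookup(self, path):
        # Reads consult the location map without taking the update lock.
        return self._locations.get(path)

    def update(self, path, server_name):
        # All writers queue on the same lock: this is the chokepoint.
        with self._lock:
            self._locations[path] = server_name
            self.updates_applied += 1


class DataServer:
    """Hypothetical data server that stores file content by path."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}


def write_file(meta, server, path, content):
    # Store the content, then record its location in the metadata.
    server.blocks[path] = content
    meta.update(path, server.name)


def read_file(meta, servers, path):
    # Step 1: consult the metadata for the location of the content.
    location = meta.lookup(path)
    if location is None:
        raise FileNotFoundError(path)
    # Step 2: access the content at the identified location.
    return servers[location].blocks[path]


def demo():
    meta = MetadataServer()
    servers = {name: DataServer(name) for name in ("ds1", "ds2")}
    threads = []
    for i in range(8):
        target = servers["ds1" if i % 2 else "ds2"]
        threads.append(threading.Thread(
            target=write_file,
            args=(meta, target, f"/vm/disk{i}", f"data{i}"),
        ))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return meta, servers
```

In this sketch, the eight concurrent writers target two different data servers, yet every one of them must pass through the single lock in MetadataServer.update. Adding data servers therefore does not add capacity for metadata updates.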
Some approaches use a distributed metadata server by partitioning the metadata, and using different data servers to act as masters for different partitions. However, this presents difficulties for operations across different partitions of the metadata. In addition, partitioning the metadata may be inadequate from a load balancing standpoint, as it may not always be possible to determine how the requests to update the metadata will be distributed between different partitions of the metadata.
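The two difficulties noted above can be illustrated with a short sketch under stated assumptions. The partitioning scheme shown (one partition per top-level directory, each statically assigned to a master) and the names PARTITION_MASTERS, master_for, masters_for_rename, and load_per_master are hypothetical; actual partitioned systems may divide the metadata in other ways.

```python
from collections import Counter

# Hypothetical static assignment of metadata partitions to master servers.
PARTITION_MASTERS = {"vm-a": "ds1", "vm-b": "ds2", "vm-c": "ds3"}


def master_for(path):
    # The partition is assumed to be the top-level directory of the path.
    top_level = path.strip("/").split("/")[0]
    return PARTITION_MASTERS[top_level]


def masters_for_rename(src, dst):
    # A rename must update the metadata at both the source and destination.
    # If they fall in different partitions, two masters must coordinate.
    return {master_for(src), master_for(dst)}


def load_per_master(update_paths):
    # Count how many metadata updates each master would have to process.
    return Counter(master_for(p) for p in update_paths)
```

For example, masters_for_rename("/vm-a/disk0", "/vm-b/disk0") involves both ds1 and ds2, so the operation can no longer be serialized by a single master. Likewise, if nine of ten updates target paths under /vm-a, load_per_master assigns nine of them to ds1 while ds3 receives none, and the static assignment offers no way to rebalance that load.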
Therefore, there is a need for an improved approach to implementing distributed metadata in a virtualization environment.