Current enterprise-level virtual machine file systems, such as VMware Inc.'s VMFS, are typically shared disk file systems that utilize an external storage device, such as a storage area network (SAN), to provide storage resources to virtual machines. These virtual machines are instantiated and run on one or more servers (sometimes referred to as a server cluster) that store their virtual machines' disk images as separate files in the SAN. Each server in the cluster runs a virtualization layer (sometimes referred to as a hypervisor) that includes an implementation of a virtual machine file system that coordinates the interaction of the server with the SAN. For example, each virtual machine file system on each server in a cluster implements and follows a common per-file locking protocol that enables virtual machines running on multiple servers to simultaneously access (e.g., read and write) their respective disk images in the SAN without fear that another server may access the same disk image at the same time.
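By way of illustration only, the per-file locking behavior described above can be sketched as follows. This is a minimal in-memory model and not the VMFS protocol itself: the class name `SharedDiskLockTable`, its methods, and the image and server names are hypothetical, and a `threading.Lock` stands in for whatever atomic update a shared disk would actually provide.

```python
import threading


class SharedDiskLockTable:
    """Toy stand-in for per-file lock records kept on shared storage (hypothetical)."""

    def __init__(self):
        self._owners = {}                # disk image name -> id of the owning server
        self._guard = threading.Lock()   # models an atomic update on the shared disk

    def acquire(self, image, server_id):
        """Return True if server_id holds the lock on image after this call."""
        with self._guard:
            owner = self._owners.get(image)
            if owner is None or owner == server_id:
                self._owners[image] = server_id
                return True
            return False                 # another server already owns this image

    def release(self, image, server_id):
        """Release the lock only if server_id is the current owner."""
        with self._guard:
            if self._owners.get(image) == server_id:
                del self._owners[image]
                return True
            return False

    def owner(self, image):
        """Return the id of the server holding the lock, or None."""
        with self._guard:
            return self._owners.get(image)
```

In this sketch, two servers may each hold locks on different disk images at the same time, while a second server's attempt to acquire an image already owned by the first is refused until the owner releases it.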
FIG. 1 depicts one example of a network architecture for a cluster of virtualization servers utilizing a SAN. Each virtualization server 100A to 100J is networked to SAN 105 and communicates with SAN 105 using SCSI-based protocols. As previously discussed, each virtualization server 100A to 100J includes a hypervisor, such as 110A, that includes a virtual machine file system, such as 115A. Hypervisor 110A provides virtualization support to enable its server 100A to instantiate a number of virtual machines, such as 120A through 125A. The disk images for each of virtual machines 120A through 125A are stored in SAN 105.
The network architecture of FIG. 1 provides protection against server failures because SAN 105 serves as a central storage resource that stores disk images for virtual machines of all the servers in the cluster. For example, if server 100A experiences a hardware failure, any of the other servers in the cluster can “failover” any of virtual machines 120A through 125A by instantiating a new virtual machine and associating the newly created virtual machine with the failed virtual machine's disk image stored in SAN 105, provided that such server has sufficient computing resources to support the virtual machine.
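The failover behavior described above can be sketched, for illustration only, as a simple placement routine. The `Server` type, the abstract `capacity` and per-machine cost units, the first-fit placement policy, and the disk image paths are all assumptions introduced for this sketch and do not appear in the architecture of FIG. 1.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    capacity: int                             # hypothetical abstract resource units
    vms: dict = field(default_factory=dict)   # vm name -> (disk image on SAN, cost)
    alive: bool = True

    def free(self):
        """Resource units not yet consumed by running virtual machines."""
        return self.capacity - sum(cost for _, cost in self.vms.values())


def fail_over(failed, cluster):
    """Restart each VM of `failed` on a surviving server with enough capacity.

    Returns (placements, unplaced): a dict of vm name -> new server name,
    and a list of VMs that no surviving server could support.
    """
    failed.alive = False
    placements, unplaced = {}, []
    for vm, (image, cost) in list(failed.vms.items()):
        # First-fit choice among surviving servers (an assumed policy).
        target = next(
            (s for s in cluster if s.alive and s is not failed and s.free() >= cost),
            None,
        )
        if target is None:
            unplaced.append(vm)
            continue
        # The new VM instance is associated with the same disk image in the SAN;
        # no disk data is copied between servers.
        target.vms[vm] = (image, cost)
        del failed.vms[vm]
        placements[vm] = target.name
    return placements, unplaced
```

Because the disk images reside in the shared SAN, the routine only re-associates an image with a new virtual machine; a virtual machine remains unplaced when no surviving server has sufficient free resources.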
However, SAN 105 itself becomes a potential bottleneck and a single point of failure. Furthermore, by its nature, the use of a central SAN limits the capability to scale the number of servers in a cluster and/or distribute the servers in the cluster over a wide-area network (WAN). Additionally, SANs have traditionally been one of the most expensive components of a data center, often costing more than the aggregate cost of the virtualization software and server cluster.