1. Field of the Invention
The present invention relates generally to virtual machine clusters, and in particular to a method and system for sharing virtual storage objects among applications in a virtual machine cluster.
2. Description of the Related Art
Virtual machines allow organizations to make efficient use of their available computing resources. Virtual machines are often grouped into distributed clusters to maintain high availability (HA) and flexibility. One of the goals of an HA system is to minimize the impact of the failure of individual components on system availability. An example of such a failure is a loss of communications between some of the virtual machine nodes of a distributed cluster. One way to prevent data corruption following a failure of one or more nodes is to implement Input/Output (I/O) fencing.
I/O fencing is the process of isolating shared storage devices from nodes that are no longer operating as a part of the cluster to protect the data on the shared storage devices from becoming corrupted. The cluster isolates a node when it is malfunctioning to ensure that I/O operations can no longer be performed by that isolated node on the shared storage devices. When multiple nodes have access to data on shared storage devices, the integrity of the data depends on the nodes communicating with each other such that each is aware when the other accesses data on the shared storage devices. This communication occurs through connections between the nodes. If the connections between nodes are lost, or if one of the nodes hangs, malfunctions, or fails, each node could be unaware of the other's activities with respect to the data on the shared storage device. This condition is known as split-brain and can lead to data corruption. To prevent the split-brain condition, I/O fencing can be utilized to isolate the non-cooperating node and control its ability to access the shared storage device. I/O fencing allows the integrity of the data to be maintained.
One method used for implementing I/O fencing of physical storage devices is based on the small computer system interface version three persistent group reservation (SCSI-3 PGR) standard. The SCSI-3 PGR standard is described in further detail in “SCSI-3 Primary Commands”, published by the American National Standards Institute, Inc., the contents of which are hereby incorporated by reference. SCSI-3 PGR based mechanisms can be used to provide I/O fencing capabilities for shared storage devices. In SCSI-3 PGR based fencing, a persistent reservation is placed on a shared storage device. This reservation grants access to a specified set of nodes while at the same time denying access to other nodes.
SCSI-3 PGR allows a node to make a physical storage device registration that is persistent across power failures and bus resets. Also, group reservations are permitted, allowing all nodes within a single group to have concurrent access to the physical storage device while restricting access to nodes not in the group. The SCSI-3 PGR standard is based on the storage, reading, and preemption of reservation keys on a reserved area (or private region) of a physical storage device. To comply with the standard, each node stores certain node-specific information on a portion of the physical storage device. Group reservation information from a group of nodes may also be stored in a portion of the physical storage device. This information may then be used to determine which nodes may access the storage device.
For a node to be registered, the node's registration key may be written in the node's area on the reserved portion of the shared physical device. A group reservation for all registered nodes may also be placed in a separate reserved portion of the shared physical device. In some cases, the reservation key of one node may be preempted by other nodes. The SCSI-3 PGR standard allows for preemption to ensure that only one group of nodes has access to a shared storage device in the case of a split-brain scenario.
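By way of illustration only, the registration, group reservation, and preemption behavior described above can be modeled with a short Python sketch. The class and method names (SharedDevice, register, reserve_for_registrants, preempt, write) are hypothetical and are not part of the SCSI-3 PGR standard or of any real library. The sketch keeps keys in memory instead of issuing SCSI commands, so it shows only the access-control logic, not an actual fencing implementation.

```python
class SharedDevice:
    """Toy in-memory model of a shared device with a reserved key area.

    All names here are hypothetical; no SCSI commands are issued.
    """

    def __init__(self):
        self.registered_keys = {}       # node name -> registration key
        self.group_reservation = False  # True once a group reservation is placed
        self.blocks = []                # data accepted by the device

    def register(self, node, key):
        """Write the node's registration key into the reserved area."""
        self.registered_keys[node] = key

    def reserve_for_registrants(self):
        """Place a group reservation covering all registered nodes."""
        if not self.registered_keys:
            raise ValueError("no registered nodes to reserve for")
        self.group_reservation = True

    def preempt(self, preempting_node, victim_key):
        """Remove every other registration that uses victim_key.

        Only a node that is itself registered may preempt.
        Returns the list of nodes that were ejected.
        """
        if preempting_node not in self.registered_keys:
            raise PermissionError(f"{preempting_node} is not registered")
        victims = [
            node
            for node, key in self.registered_keys.items()
            if key == victim_key and node != preempting_node
        ]
        for node in victims:
            del self.registered_keys[node]
        return victims

    def write(self, node, data):
        """Accept a write only from a node allowed by the reservation."""
        if self.group_reservation and node not in self.registered_keys:
            raise PermissionError(f"{node} is fenced off from the device")
        self.blocks.append((node, data))


if __name__ == "__main__":
    device = SharedDevice()
    device.register("node_a", "KEY-A")
    device.register("node_b", "KEY-B")
    device.reserve_for_registrants()

    device.write("node_a", "a1")  # both registered nodes may write
    device.write("node_b", "b1")

    # Split-brain: node_a wins the race and preempts node_b's key.
    device.preempt("node_a", "KEY-B")
    try:
        device.write("node_b", "b2")
    except PermissionError as exc:
        print("write rejected:", exc)
```

In this model, a node whose key has been preempted can no longer write, which corresponds to the fenced-off state described above. A node that loses its registration also loses the ability to preempt, so only one side of a split cluster retains access.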
The SCSI-3 PGR standard is based on a physical hardware implementation and does not apply to virtual storage devices. Other I/O fencing standards are also limited to physical storage devices. Organizations use virtual storage devices to make their storage infrastructure more manageable and flexible. Relationships are established between physical storage devices (e.g., disk drives, tape drives) and virtual storage devices (e.g., volumes, virtual disks, virtual logical units). Using virtual storage devices provides system-wide features (e.g., naming, sizing, and management) better suited to the entire virtual machine network than those features dictated by the physical characteristics of the actual storage devices.
Therefore, what is needed in the art is a method and system for implementing I/O fencing for virtual storage objects. It would be advantageous to implement a method that would allow virtual machines to use industry-standard I/O fencing methodology to access virtual storage devices. This would allow virtual machines to use standard I/O fencing application programming interface (API) calls and would not require making any significant changes to the software implemented on the virtual machines or to the underlying storage device hardware.
In view of the above, improved methods and mechanisms for implementing I/O fencing for virtual machine clusters and virtual storage objects are desired.