A cluster is a plurality of nodes physically connected to an inter-node communication network. Each of the cluster nodes is a computer system. The computer system may include a Central Processing Unit (“CPU”), memory, an inter-node communications interface and IO subsystem.
A storage device may be connected to the IO subsystem in a node. The storage device may be shared by a plurality of nodes by connecting the device to the IO subsystem in each node. By sharing the storage device amongst a plurality of nodes, multiple paths are provided for accessing the storage device. The multiple paths to the storage device provide redundancy in the case of a failure in one of the nodes by sending an IO request to the storage device through a non-failed node.
A well-known standard interface for connecting storage devices to an IO subsystem is the American National Standards Institute (“ANSI”). Small Computer System Interface (“SCSI”). ANSI SCSI defines a protocol for accessing storage devices connected to a storage network. The SCSI protocol permits a storage device connected to a storage network to be shared by a plurality of nodes. The IO subsystem includes in each node a storage network controller. The storage network controller includes logic for issuing IO commands over the storage network storage device. The IO commands include a command to read data from the storage device and a command to write data to the storage device.
ANSI SCSI includes a Persistent Reserve command. The Persistent Reserve command allows a storage device to be shared by more than one cluster node. Each storage network controller issues a Persistent Reserve command to the storage device to register with the storage device. A second Persistent Reserve command is issued to reserve the device by specifying the access type. The storage device stores a list of registered storage network controllers with a corresponding registration key and the type of access permitted.
The Persistent Reserve command provides security by requiring registered storage network controllers to provide their registration key before allowing the storage network controller to perform commands restricted to members of the group of registered storage network controllers. For example, if each storage network controller registers with registration type “write exclusive registrants only”, only registered storage network controllers have permission to write to the storage device but all other storage network controllers have permission to read from the storage device.
In a cluster, a node failure is communicated to survivor nodes on the inter-node communication network. Upon detecting the node failure, access to the storage device may be provided on an alternative path through survivor node in the cluster connected to the storage device. However, before access can be provided on the alternative path, all the pending IO commands issued by the failed node must be completed or aborted in the storage device in order to guarantee that these IO commands do not interfere with future IO commands from surviving cluster members. A survivor node in the cluster issues a Persistent Reserve command to the shared storage device to request the completion or abortion of all IO commands issued by the failed node in the cluster.
There are two types of SCSI physical connections. A parallel SCSI physical connection provides for the connection of a maximum of sixteen devices including storage devices and storage network controllers. A serial SCSI physical connection provides for the connection of 264 devices including storage devices and storage network controllers, switches and routers. Through the SCSI physical connection, a cluster storage device may be accessed by several cluster nodes; that is, nodes connected to a cluster and non-cluster nodes. Through the use of the Persistent Reservation command write access to a cluster storage device can be limited to registered cluster nodes by registering each cluster node with “write exclusive registrants only” registration type.
The “write exclusive registrants only” state remains in effect as long as one of the cluster nodes is registered with the storage device. However, if the persistent reservation from the last cluster node is removed, a non-cluster node or a cluster node from another cluster may write to the storage device and corrupt data stored in the storage device.