1. Field of the Invention
This invention relates to data storage systems, and more particularly, to failure fencing in networked data storage systems.
2. Background Information
A storage system is a computer that provides storage service relating to the organization of information on writable persistent storage devices, such as memories, tapes or disks. The storage system is commonly deployed within a storage area network (SAN) or a network attached storage (NAS) environment. When used within a NAS environment, the storage system may include an operating system that implements a file system to logically organize the information as a hierarchical structure of directories and files on, e.g., the disks. Each “on-disk” file may be implemented as a set of data structures, e.g., disk blocks, configured to store information, such as the actual data for the file. A directory, on the other hand, may be implemented as a specially formatted file in which information about other files and directories is stored.
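By way of illustration only, the file and directory organization described above can be sketched as a toy in-memory model. The names `File`, `Directory` and `BLOCK_SIZE` are hypothetical and are not drawn from any particular storage operating system; the sketch assumes a fixed block size and, as a simplification, keeps directory entries in a mapping rather than serializing them into the directory's own blocks.

```python
from dataclasses import dataclass, field
from typing import Dict, List

BLOCK_SIZE = 4  # hypothetical, deliberately tiny block size for illustration


@dataclass
class File:
    """A file modeled as an ordered set of fixed-size data blocks."""
    blocks: List[bytes] = field(default_factory=list)

    def write(self, data: bytes) -> None:
        # Split the data into BLOCK_SIZE chunks; the last chunk may be shorter.
        self.blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]

    def read(self) -> bytes:
        # Reassemble the file contents from its blocks, in order.
        return b"".join(self.blocks)


@dataclass
class Directory(File):
    """A directory modeled as a special kind of file that records other files."""
    entries: Dict[str, File] = field(default_factory=dict)


root = Directory()
greeting = File()
greeting.write(b"hello world")
root.entries["greeting.txt"] = greeting
```

In this sketch, an 11-byte file occupies three blocks, and the directory locates the file by name through its entries.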
In the client/server model, the client may comprise an application executing on a computer that “connects” to a storage system over a computer network, such as a point-to-point link, shared local area network, wide area network or virtual private network implemented over a public network, such as the Internet. NAS systems generally utilize file-based access protocols; therefore, each client may request the services of the storage system by issuing file system protocol messages (in the form of packets) to the file system over the network. By supporting a plurality of file system protocols, such as the conventional Common Internet File System (CIFS), the Network File System (NFS) and the Direct Access File System (DAFS) protocols, the utility of the storage system may be enhanced for networking clients.
A SAN is a high-speed network that enables establishment of direct connections between a storage system and its storage devices. The SAN may thus be viewed as an extension to a storage bus and, as such, an operating system of the storage system (a storage operating system, as hereinafter defined) enables access to stored information using block-based access protocols over the “extended bus.” In this context, the extended bus is typically embodied as Fibre Channel (FC) or Ethernet media (i.e., network) adapted to operate with block access protocols, such as Small Computer Systems Interface (SCSI) protocol encapsulation over FC or TCP/IP/Ethernet.
A SAN arrangement or deployment allows decoupling of storage from the storage system, such as an application server, and placing of that storage on a network. However, the SAN storage system typically manages specifically assigned storage resources. Although storage can be grouped (or pooled) into zones (e.g., through conventional logical unit number or “lun” zoning, masking and management techniques), the storage devices are still pre-assigned to the storage system by a user that has administrative privileges (e.g., a storage system administrator, as defined hereinafter).
Thus, the storage system, as used herein, may operate in any type of configuration including a NAS arrangement, a SAN arrangement, or a hybrid storage system that incorporates both NAS and SAN aspects of storage.
Access to disks by the storage system is governed by an associated “storage operating system,” which generally refers to the computer-executable code operable on a storage system that manages data access, and may implement file system semantics. In this sense, the NetApp® Data ONTAP™ operating system available from Network Appliance, Inc., of Sunnyvale, Calif. that implements the Write Anywhere File Layout (WAFL™) file system is an example of such a storage operating system implemented as a microkernel. The storage operating system can also be implemented as an application program operating over a general-purpose operating system, such as UNIX® or Windows NT®, or as a general-purpose operating system with configurable functionality, which is configured for storage applications as described herein.
In many high availability server environments, clients requesting services from applications whose data is stored on a storage system are typically served by coupled server nodes that are clustered into one or more groups. Examples of these node groups are UNIX®-based host-clustering products. The nodes typically share access to the data stored on the storage system from a direct access storage/storage area network (DAS/SAN). Typically, a communication link is configured to transport signals, such as a heartbeat, between nodes such that, during normal operations, each node has notice that the other nodes are in operation.
In the case of a two-node cluster, for example, the absence of a heartbeat signal indicates to a node that there has been a failure of some kind. However, if both nodes are still in normal operating condition, the absent heartbeat signal may be the result of an interconnect failure. In that case, the nodes are not in communication with one another and, typically, only one node should be allowed access to the shared storage system. In addition, a node that is not properly functioning may need to have its access to the data of the storage system restricted.
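By way of illustration only, the heartbeat monitoring described above can be sketched as follows. The class name `HeartbeatMonitor`, the `timeout_s` threshold and the node name are hypothetical; timestamps are passed in explicitly so that the example is deterministic. Note that the sketch can report only that a peer has gone silent; it cannot tell a failed node from a failed interconnect, which is the ambiguity noted above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HeartbeatMonitor:
    """Tracks the time at which a heartbeat was last received from each peer."""
    timeout_s: float  # hypothetical threshold after which a peer is deemed silent
    last_seen: Dict[str, float] = field(default_factory=dict)

    def record(self, node: str, now: float) -> None:
        # Note the arrival time of a heartbeat from the given peer.
        self.last_seen[node] = now

    def silent_peers(self, now: float) -> List[str]:
        # A peer is silent when its last heartbeat is older than the timeout.
        # Silence alone does not reveal whether the peer or the link has failed.
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


monitor = HeartbeatMonitor(timeout_s=3.0)
monitor.record("node-b", now=0.0)
monitor.record("node-b", now=1.0)

still_healthy = monitor.silent_peers(now=2.0)   # within the timeout
gone_silent = monitor.silent_peers(now=5.0)     # 4 s since the last heartbeat
```

Because a silent peer may still be running and writing to the shared storage, a further mechanism is needed to restrict its access before the surviving node proceeds.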
However, in a networked storage environment, access to a storage system is typically through a conventional file system protocol, such as the Network File System (NFS) protocol. Thus, any techniques that are used to restrict access to data with respect to a NAS device would need to incorporate the NFS protocol. Moreover, the NFS protocol does not support SCSI reservations, and thus prior techniques that relied on SCSI reservations would not be suitable for an environment in which access to the storage system is through NFS. Thus, a network accessed storage system does not fit into the traditional disk-based host cluster model.
There remains a need, therefore, for a host cluster environment that includes failure fencing but that can support NFS data access from a networked clustered environment that is interfaced with the storage system.
There remains a further need for performing fencing operations without requiring a traditional SCSI-based reservation mechanism when a cluster does not predominantly share data from a directly attached disk, but instead functions in a networked storage environment.
In addition, there remains a need for a simple user interface adapted to perform fencing operations for the cluster, which can be easily downloaded into the host clustering framework.