1. Field of the Invention
The present invention relates to networked storage systems and, more particularly, to takeover procedures in clustered storage systems.
2. Background Information
A storage system is a computer that provides storage service relating to the organization of information on writeable persistent storage devices, such as memories, tapes or disks. The storage system is commonly deployed within a storage area network (SAN) or a network attached storage (NAS) environment. When used within a NAS environment, the storage system may be embodied as a file server including an operating system that implements a file system to logically organize the information as a hierarchical structure of directories and files on, e.g., the disks. Each “on-disk” file may be implemented as a set of data structures, e.g., disk blocks, configured to store information, such as the actual data for the file. A directory, on the other hand, may be implemented as a specially formatted file in which information about other files and directories is stored.
The file server, or filer, may be further configured to operate according to a client/server model of information delivery to thereby allow many client systems (clients) to access shared resources, such as files, stored on the filer. Sharing of files is a hallmark of a NAS system, which is enabled by the semantic level of access to files and file systems. Storage of information on a NAS system is typically deployed over a computer network comprising a geographically distributed collection of interconnected communication links, such as Ethernet, that allow clients to remotely access the information (files) on the file server. The clients typically communicate with the filer by exchanging discrete frames or packets of data according to pre-defined protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP).
In the client/server model, the client may comprise an application executing on a computer that “connects” to the filer over a computer network, such as a point-to-point link, shared local area network, wide area network or virtual private network implemented over a public network, such as the Internet. NAS systems generally utilize file-based access protocols; therefore, each client may request the services of the filer by issuing file system protocol messages (in the form of packets) to the file system over the network. By supporting a plurality of file system protocols, such as the conventional Common Internet File System (CIFS), the Network File System (NFS) and the Direct Access File System (DAFS) protocols, the utility of the filer may be enhanced for networking clients.
A SAN is a high-speed network that enables establishment of direct connections between a storage system and its storage devices. The SAN may thus be viewed as an extension to a storage bus and, as such, an operating system of the storage system enables access to stored information using block-based access protocols over the “extended bus”. In this context, the extended bus is typically embodied as Fibre Channel (FC) or Ethernet media adapted to operate with block access protocols, such as Small Computer Systems Interface (SCSI) protocol encapsulation over FC (FCP) or TCP/IP/Ethernet (iSCSI). A SAN arrangement or deployment allows decoupling of storage from the storage system, such as an application server, and some level of storage sharing at the application server level. There are, however, environments wherein a SAN is dedicated to a single server. When used within a SAN environment, the storage system may be embodied as a storage appliance that manages access to information in terms of block addressing on disks using, e.g., a logical unit number (LUN) in accordance with one or more block-based protocols, such as FCP.
One example of a SAN arrangement, including a multi-protocol storage appliance suitable for use in the SAN, is described in United States Patent Application Publication No. US2004/0030668 A1, filed on Feb. 14, 2004, entitled MULTI-PROTOCOL STORAGE APPLIANCE THAT PROVIDES INTEGRATED SUPPORT FOR FILE AND BLOCK ACCESS PROTOCOLS by Brian Pawlowski et al., which is incorporated herein by reference in its entirety.
It is advantageous for the services and data provided by a storage system, such as a storage node, to be available for access to the greatest degree possible. Accordingly, some storage systems provide a plurality of storage system nodes organized as a cluster, with a first storage system node being clustered with a second storage system node. Each storage system node is configured to take over serving data access requests for the other storage system node if the other storage system node fails. The storage nodes in the cluster notify one another of continued operation using a heartbeat signal which is passed back and forth over a cluster interconnect, and over a cluster switching fabric. If one of the storage system nodes detects the absence of a heartbeat from the other storage node over both the cluster interconnect and the cluster switching fabric, a failure is detected and a takeover procedure is initiated. It is noted that the failure is also usually confirmed by the surviving storage node by checking a master mailbox disk of the other storage node to confirm that it is in fact a failure of the other storage node itself and not simply a failure of the cluster interconnect coupling.
More specifically, a mailbox mechanism includes a set of procedures for determining the most up-to-date coordinating information through the use of one or more mailbox disks. Such disks receive messages from the node with which they are associated in order to confirm that the node continues to be in communication with the mailbox disk, which indicates that the node continues to be capable of writing to the disks assigned to that node. Further details on the configuration and operation of the master mailbox disk are provided in commonly-owned U.S. Pat. No. 7,231,489, of Larson et al., for a SYSTEM AND METHOD FOR COORDINATING CLUSTER STATE INFORMATION, issued on Jun. 12, 2007, which is presently incorporated by reference herein in its entirety.
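By way of illustration only, the failure detection described above may be sketched as follows. This is a minimal sketch and not the procedure of any referenced patent: the names PartnerStatus, heartbeat_missing and should_take_over, the use of timestamps, and the single shared timeout are all assumptions introduced here. It further assumes that a stale mailbox write is what distinguishes a failed node from a failed interconnect.

```python
from dataclasses import dataclass


@dataclass
class PartnerStatus:
    """Hypothetical record of when the partner node was last observed (seconds)."""
    last_interconnect_heartbeat: float  # last heartbeat seen on the cluster interconnect
    last_fabric_heartbeat: float        # last heartbeat seen on the cluster switching fabric
    last_mailbox_write: float           # last message the partner wrote to its mailbox disk


def heartbeat_missing(status: PartnerStatus, now: float, timeout: float) -> bool:
    """True only if the heartbeat is absent on BOTH communication paths."""
    return (now - status.last_interconnect_heartbeat > timeout
            and now - status.last_fabric_heartbeat > timeout)


def should_take_over(status: PartnerStatus, now: float, timeout: float) -> bool:
    """Decide whether the surviving node should initiate a takeover.

    Assumption: a partner that is still writing to its mailbox disk is alive,
    so missing heartbeats alone indicate a link failure, not a node failure.
    """
    if not heartbeat_missing(status, now, timeout):
        return False
    return now - status.last_mailbox_write > timeout
```

In this sketch, a partner whose heartbeats have stopped on both paths but whose mailbox disk shows a recent write is treated as alive, so no takeover is initiated.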
Many cluster configurations include the concept of partnering. Specifically, each storage system node in the cluster is partnered with a second storage system node in such a manner that the partner storage system node is available to take over and provide the services and the data otherwise provided by the second storage system node. The partner assumes the tasks of processing and handling any data access requests normally processed by the second storage system node. One such example of a partnered storage system cluster configuration is described in U.S. Pat. No. 7,260,737, entitled SYSTEM AND METHOD FOR TRANSPORT-LEVEL FAILOVER OF FCP DEVICES IN A CLUSTER, by Arthur F. Lent, et al., the contents of which are hereby incorporated by reference. It is further noted that in such storage system node clusters, an administrator may desire to take one of the storage system nodes offline for a variety of reasons including, for example, to upgrade hardware. In such situations, it may be advantageous to perform a “voluntary” user-initiated takeover operation, as opposed to a failover operation. After the takeover operation is complete, the storage system node's data is serviced by its partner until a giveback operation is performed.
Another example of a storage system node cluster configuration takeover technique is described in U.S. patent application Ser. No. 11/411,502, entitled SINGLE NODE NAME CLUSTER SYSTEM FOR FIBER CHANNEL, by Britt Bolen et al., the contents of which are hereby incorporated by reference. In this configuration, the cluster has a single world wide node name so that the cluster as a whole appears to the client as a single device. In such clusters, two storage system nodes are partnered such that a first storage system node serves its own “locally owned” data from the disks to which it is directly connected, and proxies requests for its partner disks to a partner storage system node. During takeover operations, the locally owned data of the failed storage system node is serviced by its partner until a give back operation is performed.
In such cases employing a partner mode, additional infrastructure is often required. For example, requests are tracked to determine whether they are partner requests. Data structures are also duplicated. Separate tables describing the data, such as, for example, a volume location database (VLDB), must be maintained for the local disks and for the partner disks. In addition, registry files which store options and configuration parameters are also maintained separately in a local registry file and a partner registry file. As will be apparent to those skilled in the art, this results in additional code complexity in many systems.
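The duplication described above may be pictured with the following minimal sketch, which is illustrative only. The class PartnerModeNode, its dictionary-based tables, and the is_partner_request flag are hypothetical names introduced here and do not reflect the data layout of any particular system. The sketch shows how every lookup must first decide which copy of a table to consult.

```python
from dataclasses import dataclass, field


@dataclass
class PartnerModeNode:
    """Hypothetical node that keeps separate local and partner copies of its tables."""
    local_vldb: dict = field(default_factory=dict)        # volume name -> location, local disks
    partner_vldb: dict = field(default_factory=dict)      # volume name -> location, partner disks
    local_registry: dict = field(default_factory=dict)    # option name -> value, local node
    partner_registry: dict = field(default_factory=dict)  # option name -> value, partner node

    def locate_volume(self, volume: str, is_partner_request: bool):
        """Look up a volume in the table that matches the tracked request type."""
        table = self.partner_vldb if is_partner_request else self.local_vldb
        return table.get(volume)

    def get_option(self, name: str, is_partner_request: bool):
        """Read a configuration option from the matching registry copy."""
        registry = self.partner_registry if is_partner_request else self.local_registry
        return registry.get(name)
```

Because each accessor carries the partner flag, every code path that touches these tables must track the request type, which is one source of the additional complexity noted above.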
It is also noted that, in some storage system architectures, the nodes in each cluster are generally organized as a network element (N-module) and a disk element (D-module). The N-module includes functionality that enables the node to connect to clients over a computer network, while each D-module connects to one or more storage devices such as the disks of a disk array. A file system architecture of this type is generally described in U.S. Pat. No. 6,671,773, issued on Dec. 30, 2003, entitled METHOD AND SYSTEM FOR RESPONDING TO FILE SYSTEM REQUESTS, by M. Kazar et al. (the contents of which are incorporated herein by reference in their entirety).
In some recent architectures, however, additional functionality that may previously have been performed by the D-module has been moved to the N-module. For example, the N-module handles aspects such as network connectivity. In such configurations, it may be desirable to deliver to upper layers of the N-module a single view of all aggregates that a particular D-module is serving, rather than exposing two sets of aggregates to the N-module (i.e., a local image of the disks being served by the surviving D-module, and a set of partner disks). In previous designs, in a failover, the surviving N-module and D-module took over network addresses and performed other administrative tasks which consumed operational bandwidth in the storage architecture system.
There remains a need, therefore, for a system which eliminates partner mode failover, but allows for a takeover that results in one or more newly assimilated aggregates being available for access by the N-modules in a multiple node cluster.