A storage system is a computer that provides storage service relating to the organization of information on writeable persistent storage devices, such as memories, tapes or disks. The storage system is commonly deployed within a storage area network (SAN) or a network attached storage (NAS) environment. When used within a NAS environment, the storage system may be embodied as a file server including an operating system that implements a file system to logically organize the information as a hierarchical structure of directories and files on, e.g., the disks. Each “on-disk” file may be implemented as a set of data structures, e.g., disk blocks, configured to store information, such as the actual data for the file. A directory, on the other hand, may be implemented as a specially formatted file in which information about other files and directories is stored.
The file server, or filer, may be further configured to operate according to a client/server model of information delivery to thereby allow many client systems (clients) to access shared resources, such as files, stored on the filer. Sharing of files is a hallmark of a NAS system and is enabled by its semantic level of access to files and file systems. Storage of information on a NAS system is typically deployed over a computer network comprising a geographically distributed collection of interconnected communication links, such as Ethernet, that allow clients to remotely access the information (files) on the file server. The clients typically communicate with the filer by exchanging discrete frames or packets of data according to pre-defined protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP).
In the client/server model, the client may comprise an application executing on a computer that “connects” to the filer over a computer network, such as a point-to-point link, shared local area network, wide area network or virtual private network implemented over a public network, such as the Internet. NAS systems generally utilize file-based access protocols; therefore, each client may request the services of the filer by issuing file system protocol messages (in the form of packets) to the file system over the network. By supporting a plurality of file system protocols, such as the conventional Common Internet File System (CIFS), the Network File System (NFS) and the Direct Access File System (DAFS) protocols, the utility of the filer may be enhanced for networking clients.
A SAN is a high-speed network that enables establishment of direct connections between a storage system and its storage devices. The SAN may thus be viewed as an extension to a storage bus and, as such, an operating system of the storage system enables access to stored information using block-based access protocols over the “extended bus”. In this context, the extended bus is typically embodied as Fibre Channel (FC) or Ethernet media adapted to operate with block access protocols, such as Small Computer Systems Interface (SCSI) protocol encapsulation over FC (FCP) or TCP/IP/Ethernet (iSCSI). A SAN arrangement or deployment allows decoupling of storage from the storage system, such as an application server, and some level of storage sharing at the application server level. There are, however, environments wherein a SAN is dedicated to a single server. When used within a SAN environment, the storage system may be embodied as a storage appliance that manages data access to a set of disks using one or more block-based protocols, such as SCSI embedded in Fibre Channel (FCP). One example of a SAN arrangement, including a multi-protocol storage appliance suitable for use in the SAN, is described in U.S. patent application Ser. No. 10/215,917, entitled MULTI-PROTOCOL STORAGE APPLIANCE THAT PROVIDES INTEGRATED SUPPORT FOR FILE AND BLOCK ACCESS PROTOCOLS, by Brian Pawlowski, et al.
It is advantageous for the services and data provided by a storage system, such as a storage appliance, to be available for access to the greatest degree possible. Accordingly, some storage systems provide a plurality of storage appliances in a cluster, with a property that when a first storage appliance fails, a second storage appliance (“partner”) is available to take over and provide the services and the data otherwise provided by the first storage appliance. When the first storage appliance fails, the partner storage appliance in the cluster assumes the tasks of processing and handling any data access requests normally processed by the first storage appliance. One such example of a storage appliance cluster configuration is described in U.S. patent application Ser. No. 10/421,297, entitled SYSTEM AND METHOD FOR TRANSPORT-LEVEL FAILOVER OF FCP DEVICES IN A CLUSTER, by Arthur F. Lent, et al. An administrator may desire to take a storage appliance offline for a variety of reasons including, for example, to upgrade hardware. In such situations, it may be advantageous to perform a user-initiated takeover operation, as opposed to a failover operation. After the takeover operation is complete, the storage appliance's data will be serviced by its partner until a giveback operation is performed.
In certain known storage appliance cluster configurations, the transport medium used for communication between clients and the cluster is Fibre Channel (FC) cabling utilizing the FCP protocol (SCSI embedded in FC) for transporting data. In SCSI terminology, clients operating in a SAN environment are initiators that initiate requests and commands for data. The multi-protocol storage appliance is thus a target configured to respond to the requests issued by the initiators in accordance with a request/response protocol. According to the FC protocol, initiators and targets have three unique identifiers: a Node Name, a Port Name and a Device Identifier. The Node Name and Port Name are worldwide unique, e.g., World Wide Node Name (WWNN) and World Wide Port Name (WWPN). A Device Identifier is unique within a given FC switching fabric and is assigned dynamically to an FC port by, e.g., an FC switch coupled thereto.
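The relationship among the three identifiers can be pictured with a minimal sketch. The class names `FcPort` and `Fabric`, the `login` method and all sample values below are illustrative assumptions rather than part of the FC protocol or of any product interface; the sketch models only that the WWNN and WWPN are fixed properties of a port, while the Device Identifier is handed out by the fabric when the port joins it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FcPort:
    """Hypothetical model of the identifiers carried by one FC port."""
    wwnn: str                  # World Wide Node Name, worldwide unique
    wwpn: str                  # World Wide Port Name, worldwide unique
    did: Optional[int] = None  # Device Identifier, unique only within one fabric


class Fabric:
    """Hypothetical switching fabric that assigns Device Identifiers."""

    def __init__(self, first_did: int = 0x010100) -> None:
        self._next_did = first_did

    def login(self, port: FcPort) -> int:
        # The DID is assigned dynamically when the port joins the fabric.
        port.did = self._next_did
        self._next_did += 1
        return port.did


# Sample values are made up for illustration.
fabric = Fabric()
initiator = FcPort(wwnn="wwnn-client", wwpn="wwpn-client")
target = FcPort(wwnn="wwnn-appliance", wwpn="wwpn-appliance-a")
fabric.login(initiator)
fabric.login(target)
```

Because the fabric assigns the Device Identifier, the same WWNN and WWPN presented on a different physical port would generally be expected to receive a different Device Identifier.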
In conventional failover techniques involving clusters of storage appliances, each storage appliance in the cluster maintains two physical FC ports, namely an A port and a B port. The A port is utilized for processing and handling data access requests directed to the storage appliance. The B port typically is in a standby mode; when a failover situation occurs, the B port is activated and “assumes the identity” of its failed partner storage appliance. At that point, the B port functions as an FC target to receive and handle data access requests directed to the failed storage appliance. In this way, the surviving storage appliance may process requests directed to both the storage appliance and its failed partner storage appliance. Such a conventional FC failover is further described in the above-referenced patent application entitled SYSTEM AND METHOD FOR TRANSPORT-LEVEL FAILOVER OF FCP DEVICES IN A CLUSTER.
Typically, a port of a “surviving” storage appliance assumes the identity of its failed partner storage appliance by servicing data access requests directed to a WWNN and a WWPN of the partner. For many client operating systems, this is sufficient to permit clients to transparently access the surviving storage appliance as if it were the failed storage appliance. After the surviving storage appliance assumes the identity of the failed storage appliance, data access requests directed to the network address of the failed storage appliance are received and processed by the surviving storage appliance. Although it may appear to the clients as if the failed storage appliance were momentarily disconnected and reconnected to the network, data operations or data access requests continue to be processed.
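The conventional A port/B port arrangement may be sketched as follows. This is a simplified model under stated assumptions, not the implementation of the referenced application: the `Appliance` class and its `take_over`, `give_back` and `accepts` methods are hypothetical names, and an identity is reduced to a (WWNN, WWPN) pair.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Identity = Tuple[str, str]  # (WWNN, WWPN); a simplification for illustration


@dataclass
class Appliance:
    """Hypothetical storage appliance with an A port and a standby B port."""
    name: str
    identity: Identity
    served: List[Identity] = field(default_factory=list)

    def __post_init__(self) -> None:
        # The A port always serves the appliance's own identity.
        self.served.append(self.identity)

    def take_over(self, partner: "Appliance") -> None:
        # Activate the B port under the failed partner's WWNN and WWPN.
        if partner.identity not in self.served:
            self.served.append(partner.identity)

    def give_back(self, partner: "Appliance") -> None:
        # Return the B port to standby once the partner is back.
        if partner.identity in self.served:
            self.served.remove(partner.identity)

    def accepts(self, wwnn: str, wwpn: str) -> bool:
        # True if a request addressed to this WWNN/WWPN is handled here.
        return (wwnn, wwpn) in self.served


first = Appliance("appliance-1", ("wwnn-1", "wwpn-1"))
second = Appliance("appliance-2", ("wwnn-2", "wwpn-2"))
second.take_over(first)  # appliance-1 has failed; its partner steps in
```

After the takeover, `second.accepts("wwnn-1", "wwpn-1")` and `second.accepts("wwnn-2", "wwpn-2")` both hold, which corresponds to the surviving storage appliance processing requests directed to both itself and its failed partner.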
However, other client operating systems, including, for example, the well-known HP/UX and AIX operating systems, utilize an FC device ID (DID) in addition to the WWPN and WWNN to identify an FC target. Clients utilizing such operating systems are thus unable to access a surviving storage appliance that assumes the identity of its failed partner, as described above. Additionally, these operating systems require that all network “paths” to the target, including the WWNN, WWPN and DID, are known during the original configuration of the client. This is typically accomplished by the client performing an input/output (I/O) scan of all connected device targets during system initialization. Accordingly, where clients utilize operating systems that require the use of a DID or that require prior knowledge of all available paths to a target, conventional failover techniques do not ensure continued connectivity.
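The difference between the two kinds of clients can be illustrated by comparing how each might match a target seen on the fabric against a path recorded at configuration time. The sketch below is hypothetical: the `TargetPath` type, the two matching functions and the DID values are assumptions made for illustration, and it assumes that the partner's identity, when presented on the surviving appliance's port, carries a different DID than the one the client originally recorded.

```python
from typing import NamedTuple


class TargetPath(NamedTuple):
    """Hypothetical record of one path to an FC target."""
    wwnn: str
    wwpn: str
    did: int


def matches_by_name(known: TargetPath, seen: TargetPath) -> bool:
    """Client that identifies a target by WWNN and WWPN only."""
    return (known.wwnn, known.wwpn) == (seen.wwnn, seen.wwpn)


def matches_with_did(known: TargetPath, seen: TargetPath) -> bool:
    """Client that also requires the DID recorded during its initial I/O scan."""
    return known == seen


# Path recorded by the client during system initialization.
configured = TargetPath("wwnn-1", "wwpn-1", 0x010200)
# Same names after failover, now presented with a different DID (assumed).
after_failover = TargetPath("wwnn-1", "wwpn-1", 0x010300)
```

Under these assumptions, `matches_by_name(configured, after_failover)` is true while `matches_with_did(configured, after_failover)` is false, which mirrors why conventional failover is transparent to the first kind of client but not to the second.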
Another noted problem with certain storage appliance cluster configurations occurs when a network path from a client to a storage appliance of a cluster fails. In such a situation, the storage appliance remains operational, but has lost network connectivity with the client. This may occur as a result of, for example, a failure of a switch in the network, improper cabling or failure of the physical transport medium. Often, the client may retain a network path to the other storage appliance in the cluster by, for example, a redundant data path via a second switch, etc. However, since both storage appliances are functioning correctly, the cluster will typically not perform a failover operation. Yet, clients are unable to access data stored within the storage appliance cluster because of the loss of connectivity.
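The path-failure case can be summarized as a small decision sketch. The function names and boolean inputs below are hypothetical simplifications: the sketch assumes a two-appliance cluster in which failover is triggered only by failure of an appliance itself, and in which data is reachable only through whichever appliance currently services it.

```python
def failover_triggered(owner_healthy: bool) -> bool:
    """Assumed conventional trigger: fail over only when the appliance fails."""
    return not owner_healthy


def client_can_reach_data(owner_healthy: bool,
                          path_to_owner_up: bool,
                          path_to_partner_up: bool) -> bool:
    """Whether a client can reach data owned by one appliance of the cluster."""
    if failover_triggered(owner_healthy):
        # The partner has taken over, so the path to the partner matters.
        return path_to_partner_up
    # No failover: the data is served only through the owning appliance.
    return path_to_owner_up


# Appliance healthy, path to it lost, redundant path to the partner intact.
stranded = not client_can_reach_data(True, False, True)
```

In this model `stranded` is true: the client still has a working path into the cluster, but because no failover occurs that path does not lead to the data.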
In a SCSI proxying environment, such as that described in U.S. patent application Ser. No. 10/811,095, entitled SYSTEM AND METHOD FOR PROXYING DATA ACCESS COMMANDS IN A CLUSTERED STORAGE SYSTEM, by Herman Lee, et al., a number of operations are sent over a cluster interconnect coupling the storage appliances of the cluster. The protocol utilized across the cluster interconnect is a block-based protocol similar to the SCSI protocol, which requires a number of messages to be transmitted across the cluster interconnect for any data access operation. For example, to perform a read operation three messages are required, namely, (i) a block-based read request sent by the storage appliance receiving the request (“the local storage appliance”) to the partner storage appliance, (ii) a response issued by the partner storage appliance, the response including the requested data and a status indicator, and (iii) a completion message issued by the local storage appliance in response to the partner's message, the completion message instructing the partner to “clean up” allocated buffers and signifying that the operation is complete.
To perform a write operation, additional messages are required, for a total of five cross-interconnect messages. In the write situation, the local storage appliance sends the write request to the partner, which then responds with a request to transfer (R2T) message signifying that the partner is requesting that the write data be transmitted. In response to the R2T message, the local storage appliance sends the write data. The partner storage appliance then sends a status message once the data has been received, and finally the local storage appliance sends a completion/cleanup message. As can be appreciated, a number of messages are passed across the cluster interconnect in order to perform data access (read/write) operations in a SCSI proxying environment. Passing of such messages involves a substantial time delay (latency) in processing a data access operation.
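The read and write exchanges described above can be written out as ordered message lists, which makes the three-versus-five message count explicit. The sketch below is illustrative only: the message labels paraphrase the text, and the latency function assumes, hypothetically, that messages are strictly serialized and that each costs the same fixed interconnect delay.

```python
LOCAL, PARTNER = "local", "partner"

# (sender, receiver, message) for a proxied read operation.
READ_SEQUENCE = [
    (LOCAL, PARTNER, "read request"),
    (PARTNER, LOCAL, "data + status"),
    (LOCAL, PARTNER, "completion/cleanup"),
]

# (sender, receiver, message) for a proxied write operation.
WRITE_SEQUENCE = [
    (LOCAL, PARTNER, "write request"),
    (PARTNER, LOCAL, "R2T"),
    (LOCAL, PARTNER, "write data"),
    (PARTNER, LOCAL, "status"),
    (LOCAL, PARTNER, "completion/cleanup"),
]


def interconnect_time_us(sequence, per_message_us: int) -> int:
    """Total interconnect time if each message costs a fixed delay (assumed)."""
    return len(sequence) * per_message_us


read_cost = interconnect_time_us(READ_SEQUENCE, 100)    # 100 us is a made-up figure
write_cost = interconnect_time_us(WRITE_SEQUENCE, 100)
```

With the made-up figure of 100 microseconds per message, the model gives 300 microseconds of interconnect time for a read and 500 for a write, independent of the time spent actually accessing the disks.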