A storage system typically comprises one or more storage devices into which information may be entered, and from which information may be obtained, as desired. The storage system includes a storage operating system that functionally organizes the system by, inter alia, invoking storage operations in support of a storage service implemented by the system. The storage system may be implemented in accordance with a variety of storage architectures including, but not limited to, a network-attached storage environment, a storage area network and a disk assembly directly attached to a client or host computer. The storage devices are typically disk drives organized as a disk array, wherein the term “disk” commonly describes a self-contained rotating magnetic media storage device. The term disk in this context is synonymous with hard disk drive (HDD) or direct access storage device (DASD).
Storage of information on the disk array is preferably implemented as one or more storage “volumes” of physical disks, defining an overall logical arrangement of disk space. The disks within a volume are typically organized as one or more groups, wherein each group may be operated as a Redundant Array of Independent (or Inexpensive) Disks (RAID). RAID implementations enhance the reliability/integrity of data storage through the redundant writing of data “stripes” across a given number of physical disks in the RAID group, and the appropriate storing of redundant information (parity) with respect to the striped data. As described herein, a volume typically comprises at least one data disk and one associated parity disk (or possibly data/parity partitions in a single disk) arranged according to a RAID 4 or equivalent high-reliability implementation. The term “RAID” and its various implementations are well-known and disclosed in A Case for Redundant Arrays of Inexpensive Disks (RAID), by D. A. Patterson, G. A. Gibson and R. H. Katz, Proceedings of the International Conference on Management of Data (SIGMOD), June 1988.
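By way of illustration only, the parity relationship in a RAID 4 style group may be sketched as follows. The sketch assumes byte-wise XOR parity over equally sized blocks; the function names `xor_blocks`, `compute_parity` and `reconstruct` are hypothetical and are not taken from any particular RAID implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a sequence of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def compute_parity(data_blocks):
    """Parity block for one stripe: XOR of all data blocks in the stripe."""
    return xor_blocks(data_blocks)


def reconstruct(surviving_blocks, parity):
    """Rebuild a single lost data block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# One stripe spread across three (hypothetical) data disks.
stripe = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = compute_parity(stripe)

# Simulate the loss of the second data disk and rebuild its block.
recovered = reconstruct([stripe[0], stripe[2]], parity)
```

Because XOR is its own inverse, the same routine serves both to compute the parity and to rebuild any single missing block of the stripe.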
The storage operating system of the storage system may implement a high-level module, such as a file system, to logically organize the information stored on the disks as a hierarchical structure of data containers, such as files and blocks. For example, each “on-disk” file may be implemented as a set of data structures, i.e., disk blocks, configured to store information, such as the actual data for the file. These data blocks are organized within a volume block number (vbn) space that is maintained by the file system. The file system may also assign each data block in the file a corresponding “file offset” or file block number (fbn). The file system typically assigns sequences of fbns on a per-file basis, whereas vbns are assigned over a larger volume address space. The file system organizes the data blocks within the vbn space as a “logical volume”; each logical volume may be, although is not necessarily, associated with its own file system.
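The relationship between per-file fbns and volume-wide vbns may be illustrated by the following minimal sketch. The `Volume` and `File` classes, and the simple sequential allocator, are assumptions made for illustration and do not describe the on-disk format of any actual file system.

```python
class Volume:
    """A flat vbn address space shared by every file on the volume."""

    def __init__(self, num_blocks):
        self.blocks = [None] * num_blocks
        self.next_free = 0  # assumption: blocks are handed out sequentially

    def allocate(self, data):
        vbn = self.next_free
        self.blocks[vbn] = data
        self.next_free += 1
        return vbn


class File:
    """Maps this file's fbns (list index) to vbns (list value)."""

    def __init__(self, volume):
        self.volume = volume
        self.block_map = []

    def append_block(self, data):
        self.block_map.append(self.volume.allocate(data))
        return len(self.block_map) - 1  # the fbn just assigned

    def read_block(self, fbn):
        return self.volume.blocks[self.block_map[fbn]]


vol = Volume(num_blocks=8)
file_a, file_b = File(vol), File(vol)
file_a.append_block(b"a0")
file_b.append_block(b"b0")
file_a.append_block(b"a1")
```

In this example each file numbers its own blocks from fbn 0, while the vbns of the two files interleave within the single volume address space.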
A known type of file system is a write-anywhere file system that does not over-write data on disks. If a data block is retrieved (read) from disk into a memory of the storage system and “dirtied” (i.e., updated or modified) with new data, the data block is thereafter stored (written) to a new location on disk to optimize write performance. A write-anywhere file system may initially assume an optimal layout such that the data is substantially contiguously arranged on disks. The optimal disk layout results in efficient access operations, particularly for sequential read operations, directed to the disks. An example of a write-anywhere file system that is configured to operate on a storage system is the Write Anywhere File Layout (WAFL®) file system available from Network Appliance, Inc., Sunnyvale, Calif.
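The write-anywhere behavior described above may be sketched as follows. This is an illustrative model only, assuming a simple append-style allocator; the class `WriteAnywhereStore` is hypothetical and is not intended to represent the WAFL implementation.

```python
class WriteAnywhereStore:
    """Toy model: a modified block is always written to a new location."""

    def __init__(self):
        self.disk = {}       # vbn -> block contents
        self.block_map = {}  # fbn -> vbn currently holding that block
        self.next_vbn = 0    # assumption: new locations are handed out in order

    def write(self, fbn, data):
        """Store data at a fresh vbn; return (previous vbn or None, new vbn)."""
        new_vbn = self.next_vbn
        self.next_vbn += 1
        self.disk[new_vbn] = data
        old_vbn = self.block_map.get(fbn)
        self.block_map[fbn] = new_vbn  # the old block is not overwritten
        return old_vbn, new_vbn

    def read(self, fbn):
        return self.disk[self.block_map[fbn]]


store = WriteAnywhereStore()
first = store.write(0, b"original")
second = store.write(0, b"dirtied")
```

After the second write, the file block resolves to the new location, while the earlier contents remain untouched at the old location until that space is reclaimed.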
The storage system may be further configured to operate according to a client/server model of information delivery to thereby allow many clients to access data containers stored on the system. In this model, the client may comprise an application, such as a database application, executing on a computer that “connects” to the storage system over a computer network, such as a point-to-point link, shared local area network (LAN), wide area network (WAN), or virtual private network (VPN) implemented over a public network such as the Internet. Each client may request the services of the storage system by issuing file-based and block-based protocol messages (in the form of packets) to the system over the network. In the case of block-based protocol packets, the client requests (and storage system responses) address the information in terms of block addressing on disk using, e.g., a logical unit number (lun). These block-based protocol packets may comprise SCSI encapsulated in TCP/IP (iSCSI).
In such block-based storage system environments, the luns exported by a storage system are only available by accessing that particular system. It should be noted that the term “lun” as used herein may refer to a logical unit number and/or a logical unit. A noted disadvantage of such environments arises when the storage system suffers an error or otherwise becomes inaccessible due to, e.g., a failure in network connectivity. As luns are only available by accessing the storage system, those luns become inaccessible should the storage system become inaccessible. Such inaccessibility is unacceptable for many users of SANs who require high, e.g., “24×7” data availability.
To improve the availability of luns, storage systems may be coupled together in a cluster with the property that, when one storage system fails, the other begins servicing data access requests directed to the failed storage system's luns. In such an environment, two storage systems are coupled to form a storage system cluster. Each storage system services data access requests directed to its luns and only services data access requests directed to the other storage system's luns after a failover operation has occurred. During the failover operation, the surviving storage system, i.e., the storage system that has not suffered the error condition, assumes the identity of the failed storage system by, for example, assigning the failed storage system's network portals, i.e., the Internet Protocol (IP) addresses and TCP port numbers, to network adapters available on the surviving storage system. However, a noted disadvantage of such clusters is that they are limited to two storage systems.
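The pair-wise failover operation described above may be modeled, purely for illustration, as the reassignment of network portals from the failed storage system to the surviving one. The names `NetworkPortal`, `StorageSystem` and `fail_over` are hypothetical, the addresses are drawn from a documentation-only range, and the sketch omits the actual rebinding of addresses to network adapters.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class NetworkPortal:
    ip: str
    port: int


@dataclass
class StorageSystem:
    name: str
    portals: list = field(default_factory=list)
    alive: bool = True


def fail_over(failed, survivor):
    """Move every portal of the failed system onto the surviving system."""
    failed.alive = False
    survivor.portals.extend(failed.portals)
    failed.portals = []


# A two-system cluster; 3260 is the default iSCSI TCP port.
system_a = StorageSystem("A", [NetworkPortal("192.0.2.1", 3260)])
system_b = StorageSystem("B", [NetworkPortal("192.0.2.2", 3260)])

fail_over(failed=system_a, survivor=system_b)
```

Following the failover, the surviving system answers on both its own portal and the portal formerly served by the failed system.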
In certain distributed environments such as that described in U.S. patent application Ser. No. 11/254,397, entitled SYSTEM AND METHOD FOR PROVIDING A UNIFIED ISCSI TARGET WITH A PLURALITY OF LOOSELY COUPLED ISCSI FRONT ENDS, a plurality of storage systems may be utilized as front ends to a cluster comprising network elements (N-modules) and disk elements (D-modules). In such environments, a conventional TCP/IP failover pairing may be established between any two N-modules.
Similarly, in a storage system environment utilizing iSCSI, all network portals may fail over to one storage system. The iSCSI protocol defines a network portal as an IP address and a TCP port number from which a computer provides iSCSI services. In accordance with the iSCSI protocol, each network portal may belong to exactly one target portal group (TPG). All connections within an iSCSI session must use network portals within the same TPG. Furthermore, a given initiator may have at most one session in progress to an iSCSI target over a given TPG at a given time. A noted disadvantage of the prior art arises when a pair-wise cluster is extended to an N-way cluster having a plurality of storage systems to which a failover may occur. Should an N-module fail in a pair-wise cluster environment and all of its network portals be moved to a single surviving N-module, the surviving N-module may become overloaded. In an N-way cluster, however, a storage system administrator typically configures the system to ensure that all network portals within a TPG fail over to the same surviving N-module. If all network portals within a TPG do not fail over to the same N-module, initiators may send data access commands to a network portal residing on an N-module that is different from that of the other network portals of the TPG, thereby resulting in error conditions within the iSCSI session. The present invention is directed to a system and method for ensuring that all network portals within a TPG fail over to the same destination (e.g., N-module).
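The constraint that all network portals of a TPG share a single failover destination may be sketched as follows. This is not the method of the invention, merely one illustrative way of honoring the constraint: portals are grouped by TPG and each group is assigned, as a unit, to the least-loaded surviving N-module. The function `plan_failover`, the TPG tags and the N-module names are assumptions.

```python
from collections import defaultdict


def plan_failover(portal_tpgs, survivor_load):
    """Choose one destination N-module per TPG.

    portal_tpgs:   dict mapping (ip, port) -> TPG tag on the failed N-module
    survivor_load: dict mapping surviving N-module name -> current portal count
    Returns a dict mapping each portal to its destination N-module.
    """
    groups = defaultdict(list)
    for portal, tpg in portal_tpgs.items():
        groups[tpg].append(portal)

    load = dict(survivor_load)  # work on a copy; the caller's dict is untouched
    plan = {}
    for tpg in sorted(groups):
        # Least-loaded survivor; ties are broken by name for determinism.
        target = min(sorted(load), key=load.get)
        for portal in groups[tpg]:
            plan[portal] = target  # every portal of this TPG moves together
        load[target] += len(groups[tpg])
    return plan


failed_portals = {
    ("192.0.2.10", 3260): 1,
    ("192.0.2.11", 3260): 1,
    ("192.0.2.12", 3260): 2,
}
plan = plan_failover(failed_portals, {"N2": 0, "N3": 0})
```

In this example the two portals of TPG 1 are moved together to one surviving N-module and the portal of TPG 2 to another, so that the load is spread without splitting any TPG across N-modules.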