A storage system is a computer that provides storage service relating to the organization of information on persistent storage devices, such as memories, tapes or disks. The storage system is commonly deployed within a storage area network (SAN) or a network attached storage (NAS) environment. When used within a NAS environment, the storage system may be embodied as a file server including an operating system that implements a file system to logically organize the information as a structure of directories and files on, e.g., the disks. Each “on-disk” file may be implemented as a set of data structures, e.g., disk blocks, configured to store information, such as the actual data for the file. A directory, on the other hand, may be implemented as a specially formatted file in which information about other files and directories is stored.
The storage system may be further configured to operate according to a client/server model of information delivery to thereby allow many client systems (clients) to access shared resources, such as files, stored on the storage system. Sharing of files is a hallmark of a NAS system, which is enabled because of the semantic level of access to files and file systems. Storage of information on a NAS system is typically deployed over a computer network comprising a geographically distributed collection of interconnected communication links, such as Ethernet, that allow clients to remotely access the information (files) on the storage system. The clients typically communicate with the storage system by exchanging discrete frames or packets of data according to pre-defined protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP).
In the client/server model, the client may comprise an application executing on a computer that “connects” to the storage system over a computer network, such as a point-to-point link, shared local area network, wide area network or virtual private network implemented over a public network, such as the Internet. NAS systems generally utilize file-based access protocols; therefore, each client may request the services of the storage system by issuing file system protocol messages to the file system over the network. By supporting a plurality of file system protocols, such as the conventional Common Internet File System (CIFS), the Network File System (NFS) and the Direct Access File System (DAFS) protocols, the utility of the storage system may be enhanced for networking clients.
A SAN is a high-speed network that enables establishment of direct connections between a storage system and its storage devices. The SAN may thus be viewed as an extension to a storage bus and, as such, an operating system of the storage system enables access to stored information using block-based access protocols over the “extended bus”. In this context, the extended bus is typically embodied as Fibre Channel (FC) or Ethernet media adapted to operate with block access protocols, such as Small Computer Systems Interface (SCSI) protocol encapsulation over FC (FCP) or TCP/IP/Ethernet (iSCSI). A SAN arrangement or deployment allows decoupling of storage from the storage system, such as an application server, and some level of storage sharing at the application server level. There are, however, environments wherein a SAN is dedicated to a single server. When used within a SAN environment, the storage system may be embodied as a storage appliance that manages data access to a set of disks using one or more block-based protocols, such as FCP.
One example of a SAN arrangement, including a storage system suitable for use in the SAN, is described in United States Patent Publication No. 2004/0030668 A1, entitled MULTI-PROTOCOL STORAGE APPLIANCE THAT PROVIDES INTEGRATED SUPPORT FOR FILE AND BLOCK ACCESS PROTOCOLS by Brian Pawlowski et al, published on Feb. 12, 2004.
Storage systems may be arranged in a distributed environment to enable the creation of distributed file systems that cooperate to provide load balancing, disaster recovery, etc. Such storage systems may further provide a unified hierarchical namespace to enable a plurality of independent file systems to be “viewed” by a client as a single entity.
In distributed storage systems, it is often desirable to track and identify the location of a file or other data container among the various storage system members because, e.g., a data container on a first storage system may contain an indirection construct or “junction” identifying that a portion of the data container is stored on another storage system. A junction may thus comprise an indirection construct that identifies data located in a location remote from the junction itself. In such cases, the client is required to reliably resolve the location of the data in order to request access thereto. Such location resolution may involve heterogeneous storage system architectures, e.g., the two storage systems may be supplied by different vendors and/or utilize differing file system implementations.
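The junction mechanism described above can be sketched as follows. This is a minimal illustrative model, not any vendor's actual on-disk format or API; the class names, paths, and the `resolve` helper are hypothetical.

```python
# Hypothetical sketch of junction-based location resolution across
# two storage systems. All names here are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class Junction:
    """Indirection construct: points at data held by another storage system."""
    target_system: str   # identifier of the remote storage system
    target_path: str     # path of the data on that system


@dataclass
class StorageSystem:
    name: str
    namespace: dict      # maps a local path to file data or to a Junction


def resolve(systems: dict, system_name: str, path: str):
    """Follow junctions until the path resolves to actual data."""
    entry = systems[system_name].namespace[path]
    while isinstance(entry, Junction):
        # The junction tells the client where to continue the lookup.
        system_name = entry.target_system
        path = entry.target_path
        entry = systems[system_name].namespace[path]
    return system_name, path, entry


# A portion of the namespace on system A is actually stored on system B:
a = StorageSystem("A", {"/vol/proj": Junction("B", "/vol/proj_data")})
b = StorageSystem("B", {"/vol/proj_data": "actual file data"})
systems = {"A": a, "B": b}

print(resolve(systems, "A", "/vol/proj"))
# -> ('B', '/vol/proj_data', 'actual file data')
```

The essential point is that the client cannot complete the access using system A alone; it must reliably resolve the junction to system B first.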
One example of a distributed storage system is the Andrew File System (AFS) that utilizes a plurality of independent storage servers to implement a plurality of AFS cells. An AFS cell is a set of one or more servers, sharing a common administration, that together implement a specific sub-tree of the AFS namespace. AFS and its architecture are further described in Scale and Performance in a Distributed File System, ACM Transactions on Computer Systems, 6(1):51-81, February 1988. A noted disadvantage of the AFS architecture is that the location resolution system is not fully scalable. Within an AFS environment, each AFS cell maintains information in the form of, e.g., a file, that contains a mapping between all of the AFS cell names and the locations of the volume location databases (VLDBs) for each cell. This information may not be updated regularly as updates rely upon system administrators of other cells to forward appropriate information relating to modifications to each cell within a federation. For example, should a VLDB be added, removed and/or migrated from one location to another within a cell, the system administrator for that cell is required to provide the new location information to all other cells within the AFS federation. As the location information may rapidly change, each individual cell's information relating to other cells' VLDBs may be constantly “stale” (out of date), thereby causing error conditions due to incorrect location resolutions.
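The staleness problem can be illustrated with a toy model. This is not actual AFS code; the cell names, host names, and the `lookup` helper are assumptions made purely to show why per-cell copies of the cell-name-to-VLDB mapping drift out of date.

```python
# Illustrative sketch: each cell holds its OWN copy of the mapping from
# cell names to VLDB locations, and updates arrive only when the remote
# cell's administrator pushes them. All names are hypothetical.

# Cell A's local copy of where cell B's VLDB lives.
local_copy = {"cell-b.example.edu": "vldb-host-1"}

# Cell B migrates its VLDB to a new host, but cell A is never notified.
authoritative = {"cell-b.example.edu": "vldb-host-2"}


def lookup(cell: str) -> str:
    """Resolve a cell name using cell A's (possibly stale) local copy."""
    return local_copy[cell]


resolved = lookup("cell-b.example.edu")
print(resolved)                                   # still the old host
print(resolved == authoritative["cell-b.example.edu"])  # False: stale mapping
# Cell A would now contact vldb-host-1, which no longer serves the VLDB,
# producing an error condition due to incorrect location resolution.
```

The sketch shows why the architecture is not fully scalable: correctness depends on every administrator propagating every change to every other cell.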
Another example of a distributed storage system is the Distributed File System (DFS), described in U.S. Pat. No. 6,742,035, entitled DIRECTORY-BASED VOLUME LOCATION SERVICE FOR A DISTRIBUTED FILE SYSTEM, by Edward Zayas, et al. A noted disadvantage of DFS arises in that the referrer (i.e., the computer attempting to access a resource) is linked into a given VLDB, thereby potentially exposing location information to a broader range of people. Another noted disadvantage is that when the location of a volume changes, the VLDBs for all of the servers that host junctions that refer to that volume must be updated. This adds to the burden of keeping track of all of the referring VLDBs, and requires that each VLDB trust information provided by servers in a different administrative cell.
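The update fan-out described above can be sketched briefly. This is an illustrative model, not the actual DFS implementation; the server names, volume names, and `move_volume` helper are hypothetical.

```python
# Illustrative sketch of the DFS update fan-out: when a volume's location
# changes, every server whose VLDB holds an entry referring to that volume
# must be updated. All names here are hypothetical.

# Each referring server's VLDB replicates the volume's location.
vldbs = {
    "server1": {"volX": "nodeA"},
    "server2": {"volX": "nodeA"},
    "server3": {"volX": "nodeA", "volY": "nodeB"},
}


def move_volume(volume: str, new_location: str) -> list:
    """Relocate one volume; returns every VLDB that had to be touched."""
    updated = []
    for server, vldb in vldbs.items():
        if volume in vldb:
            vldb[volume] = new_location
            updated.append(server)
    return updated


print(move_volume("volX", "nodeC"))
# Every referring VLDB had to change for one volume move; the tracking
# burden grows with the number of referrers, and each update must be
# trusted across administrative cell boundaries.
```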