1. Field of the Invention
The present invention is directed to storage systems and, in particular, to organizing data containers of a storage system into multiple related name spaces.
2. Background Information
A storage system typically comprises one or more storage devices into which information may be entered, and from which information may be obtained, as desired. The storage system includes a storage operating system that functionally organizes the system by, inter alia, invoking storage operations in support of a storage service implemented by the system. The storage system generally provides its storage service through the execution of software modules, such as processes. The storage system may be implemented in accordance with a variety of storage architectures including, but not limited to, a network-attached storage environment, a storage area network and a disk assembly directly attached to a client or host computer. The storage devices are typically disk drives organized as a disk array, wherein the term “disk” commonly describes a self-contained rotating magnetic media storage device. The term disk in this context is synonymous with hard disk drive (HDD) or direct access storage device (DASD).
The storage operating system of the storage system may implement a high-level module, such as a file system, to logically organize the information as a hierarchical structure of data containers, such as files and logical units stored on volumes. For example, each “on-disk” file may be implemented as a set of data structures, i.e., disk blocks, configured to store information, such as the actual data for the file. These data blocks are organized within a volume block number (vbn) space that is maintained by the file system. The file system may also assign each data block in the file a corresponding “file offset” or file block number (fbn). The file system typically assigns sequences of fbns on a per-file basis, whereas vbns are assigned over a larger volume address space. The file system organizes the data blocks within the vbn space as a “logical volume”; each logical volume (hereinafter “volume”) may be, although is not necessarily, associated with its own file system.
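By way of illustration only, the relationship between per-file fbns and volume-wide vbns may be sketched as follows. The sketch is a simplified model under stated assumptions and not a description of any particular file system: the class names Volume and File, the sequential allocator and the in-memory block map are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    """Hypothetical volume that hands out vbns from one shared address space."""
    next_vbn: int = 0

    def allocate(self) -> int:
        # Assumption: vbns are handed out sequentially; real allocators differ.
        vbn = self.next_vbn
        self.next_vbn += 1
        return vbn


@dataclass
class File:
    """Hypothetical file: list index is the fbn, list value is the vbn."""
    volume: Volume
    block_map: list = field(default_factory=list)

    def append_block(self) -> int:
        # The new block gets the next fbn of this file and some vbn of the volume.
        self.block_map.append(self.volume.allocate())
        return len(self.block_map) - 1

    def vbn_of(self, fbn: int) -> int:
        return self.block_map[fbn]


vol = Volume()
file_a, file_b = File(vol), File(vol)
file_a.append_block()  # file_a fbn 0
file_b.append_block()  # file_b fbn 0
file_a.append_block()  # file_a fbn 1
```

In this model both files start their fbn sequences at zero, while the vbns they consume are drawn from the single address space of the volume.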
The storage system may be further configured to operate according to a client/server model of information delivery to thereby allow many clients to access data containers stored on the system. In this model, the storage system may be embodied as a file server executing an operating system, such as the Microsoft® Windows™ operating system (hereinafter “Windows operating system”). Furthermore, the client may comprise an application, such as a database application, executing on a computer that “connects” to the storage system over a computer network, such as a point-to-point link, shared local area network (LAN), wide area network (WAN), or virtual private network (VPN) implemented over a public network such as the Internet. Each client may request the services of the storage system by issuing file-based and block-based protocol messages (in the form of packets) to the system over the network. By supporting a plurality of storage (e.g., file-based) access protocols, such as the conventional Common Internet File System (CIFS) and the Network File System (NFS) protocols, the utility of the server is enhanced.
A plurality of storage systems may be interconnected to provide a storage system environment, e.g., a storage system cluster, configured to service many clients. Each storage system may be configured to service one or more volumes of the cluster, wherein each volume comprises a collection of physical storage disks cooperating to define an overall logical arrangement of vbn space on the volume(s). The disks within a volume/file system are typically organized as one or more groups, wherein each group may be operated as a Redundant Array of Independent (or Inexpensive) Disks (RAID).
To facilitate client access to the information stored on the server, the Windows operating system typically exports units of storage, e.g., (CIFS) shares. As used herein, a share is equivalent to a mount point or shared storage resource, such as a folder or directory that stores information about files or other directories served by the file server. A Windows client may access information in the directory by mounting the share and issuing a CIFS protocol access request that specifies a uniform naming convention (UNC) path to the share. The UNC path or pathname is an aspect of a Windows networking environment that defines a way for a client to refer to a unit of storage on a server. The UNC pathname specifies resource names on a network. For example, a UNC pathname may comprise a server name, a share (directory) name and a path descriptor that collectively reference a unit of storage or share. Thus, in order to access the share, the client typically requires knowledge of the specific physical location (i.e., the identity) of the server exporting the share.
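By way of illustration only, the decomposition of a UNC pathname into its server name, share name and path descriptor may be sketched as follows. The function parse_unc, the type UncPath and the example server and share names are hypothetical, and the sketch handles only the simple \\server\share\path form.

```python
from typing import NamedTuple


class UncPath(NamedTuple):
    server: str  # identity of the file server exporting the share
    share: str   # share (directory) name
    path: str    # path descriptor below the share, possibly empty


def parse_unc(unc: str) -> UncPath:
    """Split a simple \\\\server\\share\\path pathname into its three parts."""
    if not unc.startswith("\\\\"):
        raise ValueError("a UNC pathname begins with two backslashes")
    parts = unc[2:].split("\\", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError("a UNC pathname needs a server name and a share name")
    path = parts[2] if len(parts) == 3 else ""
    return UncPath(parts[0], parts[1], path)


parsed = parse_unc(r"\\filer1\projects\docs\spec.txt")
```

The server component of the parsed result is the server-specific reference that the client must know in advance in order to reach the share.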
Instead of requiring the client to provide the specific identity of the file server exporting the unit of storage, it is desirable to only require a logical pathname to that storage unit. That is, it is desirable to provide the client with a globally unique pathname to the storage (location) without reference to the file server. The conventional Distributed File System (DFS) namespace service is well known to provide such a solution in a Windows environment through the creation of a namespace that removes the specificity of server identity. As used herein, a namespace is a view of shared storage resources (such as shares) from the perspective of a client. The DFS namespace service is generally implemented using one or more DFS servers and distributed components in a network.
Using the DFS service, it is possible to create a unique pathname (in the form of a UNC pathname) for a storage resource that a DFS server translates to an actual location of the resource in the network. However, in addition to the DFS namespace provided by the Windows operating system, there are many other namespace services provided by various operating system platforms, including the NFS namespace provided by the conventional Unix® operating system. Each service constructs a namespace to facilitate management of information using a layer of indirection between a file server and a client accessing a shared storage resource on the server. For example, a storage resource may be connected or “linked” to a link point (a link in DFS terminology or a mount point in NFS terminology) to hide the machine-specific reference to the resource. By referencing the link point, the client can automatically access information on the storage resource of the specific machine. This allows an administrator to store the information on any server in the network by merely providing a reference to the information.
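By way of illustration only, the layer of indirection provided by a link point may be sketched as a table that maps logical pathname prefixes to server-specific locations. The sketch does not reflect the DFS or NFS implementations: the function resolve, the longest-prefix matching rule and all server and export names are assumptions made for the example.

```python
def resolve(namespace: dict, logical_path: str) -> tuple:
    """Translate a logical pathname to (server, export, remainder).

    `namespace` maps a link point to the (server, export) it is linked to.
    Assumption: the longest matching link point wins.
    """
    best = None
    for link in namespace:
        matches = (logical_path == link
                   or logical_path.startswith(link.rstrip("/") + "/"))
        if matches and (best is None or len(link) > len(best)):
            best = link
    if best is None:
        raise KeyError("no link point covers " + logical_path)
    server, export = namespace[best]
    remainder = logical_path[len(best):].lstrip("/")
    return server, export, remainder


namespace = {
    "/corp/eng": ("filer1", "/vol/eng"),
    "/corp/eng/builds": ("filer2", "/vol/builds"),
}
before = resolve(namespace, "/corp/eng/builds/nightly.tgz")

# The administrator relocates the data and updates only the link point;
# the logical pathname used by the client does not change.
namespace["/corp/eng/builds"] = ("filer3", "/vol/builds2")
after = resolve(namespace, "/corp/eng/builds/nightly.tgz")
```

The client issues the same logical pathname before and after the relocation; only the table entry behind the link point differs.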
The Virtual File Manager (VFM™) developed by NuView, Inc. and available from Network Appliance, Inc., (“NetApp”) provides a namespace service that supports various protocols operating on various file server platforms, such as NetApp filers and DFS servers. The VFM namespace service is well-known and described in VFM™ (Virtual File Manager) Reference Guide, Version 4.0, 2001-2003, and VFM™ (Virtual File Manager) Getting Started Guide, Version 4.0, 2001-2003.
In a storage system cluster environment, a clustered namespace may be implemented with multiple namespaces such that the clustered environment can be shared among multiple clients. When a request is made to access data stored in the cluster, one or more unique identifiers (such as volume identifiers, etc.) of the clustered namespace identify the storage locations originally used to store the data. The unique identifiers are organized within a storage location repository that is replicated throughout the cluster. The unique identifier contained in a data access request may not correctly identify the storage location of the data if, for example, the data has been moved by an administrator. In that case, a redirection identifier is used to indicate that the requested data is not stored in the storage location identified by the unique identifier provided in the data access request. In response to encountering the redirection identifier during the data access request, the storage location repository is examined to find the correct storage location of the data. Thus, instead of explicitly managing a chain of identifiers to multiple storage locations, a system administrator can use redirection identifiers to indicate that the replicated storage location repository should be examined. This, in turn, enables the administrator to update the unique identifiers in a central (yet replicated) repository instead of performing the difficult and time-consuming administration task of updating chains of identifiers.
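By way of illustration only, the use of a redirection identifier together with a storage location repository may be sketched as follows. The sketch is a simplified single-process model: the class Cluster, the marker REDIRECT and the dictionary-based repository are hypothetical, and replication of the repository throughout the cluster is not modeled.

```python
REDIRECT = object()  # hypothetical stand-in for a redirection identifier


class Cluster:
    def __init__(self, volume_ids):
        # volume id -> {data name: payload, or REDIRECT if the data has moved}
        self.volumes = {vid: {} for vid in volume_ids}
        # storage location repository: data name -> current volume id
        self.repository = {}

    def write(self, volume_id, name, payload):
        self.volumes[volume_id][name] = payload
        self.repository[name] = volume_id

    def move(self, name, src, dst):
        # Leave a redirection identifier behind and update the repository once.
        self.volumes[dst][name] = self.volumes[src][name]
        self.volumes[src][name] = REDIRECT
        self.repository[name] = dst

    def read(self, volume_id, name):
        entry = self.volumes[volume_id][name]
        if entry is REDIRECT:
            # Do not follow a chain; consult the repository for the location.
            return self.volumes[self.repository[name]][name]
        return entry


cluster = Cluster(["v1", "v2", "v3"])
cluster.write("v1", "report", "quarterly data")
cluster.move("report", "v1", "v2")
cluster.move("report", "v2", "v3")
result = cluster.read("v1", "report")  # request still names the original volume
```

After two moves, a request that still names the original volume is satisfied with a single repository lookup rather than by following a chain from the first location to the second and then to the third.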
A junction is an exemplary redirection identifier associated with a storage location that indicates that the data is not stored at the originally used location but is available at some other storage location. Junctions can be “mounted” during volume creation by the invocation of a management command from a command line interface (CLI), graphical user interface (GUI), or the like. For example, the command may be “create a volume and mount it on the namespace /a/b/c,” wherein the namespace “/a/b/c” comprises pathname components, such as parent directory “a” and sub-directory “b,” followed by a junction component, “c.” Thus, when searching for a “file” in the namespace “/a/b/c/file,” the junction at the volume containing the component of the pathname “c” is a hint that the file is located on another volume, potentially on a different storage system of the cluster. The new volume identifier can be recorded in the storage location repository.
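By way of illustration only, a lookup of the pathname “/a/b/c/file” that crosses a junction at component “c” may be sketched as follows. The nested-dictionary directory trees, the class Junction, the junction identifier 7 and the repository keyed by a (volume identifier, junction identifier) pair are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Junction:
    """Hypothetical junction: a hint that the data lives on another volume."""
    junction_id: int


def lookup(path, root_volume, volumes, repository):
    """Walk `path` component by component, switching volumes at junctions."""
    volume_id = root_volume
    node = volumes[volume_id]
    for component in path.strip("/").split("/"):
        node = node[component]
        if isinstance(node, Junction):
            # The repository records which volume is mounted at this junction.
            volume_id = repository[(volume_id, node.junction_id)]
            node = volumes[volume_id]  # continue at the root of that volume
    return volume_id, node


volumes = {
    "v1": {"a": {"b": {"c": Junction(7)}}},
    "v2": {"file": "contents"},
}
repository = {("v1", 7): "v2"}
found = lookup("/a/b/c/file", "v1", volumes, repository)
```

The walk begins on the volume holding “a” and “b”, encounters the junction at “c”, and completes on the volume recorded for that junction in the repository.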
Certain constraints have heretofore been applied to namespace architectures to ensure that a volume can be located unambiguously in a namespace hierarchy of, e.g., a storage system cluster. For example, if a parent volume in the cluster were to appear in multiple places in the namespace hierarchy, a client could not perform a lookup operation to ascend from a child volume to the parent volume because it would be ambiguous as to which parent volume namespace it should ascend to. In order to allow the client to unambiguously determine the parent volume of any child volume, namespaces in the storage system cluster have heretofore been deliberately limited such that a volume cannot appear in more than one storage system namespace. Namespaces have heretofore also been constrained such that a volume cannot appear in more than one location in a namespace. These constraints can be disadvantageous in certain applications, for example, where volumes configured for multiple purposes could be accessed more efficiently if they were allowed to reside in a plurality of namespaces.
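By way of illustration only, the effect of the foregoing constraint may be sketched as a namespace that records exactly one parent for each mounted volume and rejects a second mount of the same volume. The class Namespace and its methods are hypothetical and model only the single-location constraint described above.

```python
class Namespace:
    """Hypothetical constrained namespace: each volume is mounted at most once."""

    def __init__(self):
        # child volume id -> (parent volume id, junction path)
        self.parent_of = {}

    def mount(self, child, parent, path):
        if child in self.parent_of:
            # A second location would make the ascent from child ambiguous.
            raise ValueError(child + " already appears in this namespace")
        self.parent_of[child] = (parent, path)

    def ascend(self, child):
        # Unambiguous because at most one parent is ever recorded.
        return self.parent_of[child][0]


ns = Namespace()
ns.mount("v2", parent="v1", path="/a/b/c")
parent = ns.ascend("v2")
```

Because a single parent is recorded per volume, the ascent is well defined; the cost is that the same volume cannot be made to appear at a second location such as “/x/y”.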