A storage system is a processing system adapted to store and retrieve data on behalf of one or more client processing systems (“clients”) in response to external input/output (I/O) requests received from the clients. The data storage space has one or more storage “volumes” comprising a cluster of physical storage disks and defining an overall logical arrangement of storage space.
A storage system may implement a high-level module, such as a file system, to logically organize the information stored on volumes as a hierarchical structure of data containers, such as files and logical units. For example, each “on-disk” file may be implemented as a set of data structures, i.e., disk blocks, configured to store information, such as the actual data for the data container.
To distribute the load for a single data container (such as a file or a logical unit) among a plurality of storage systems, a storage system may employ a striping technique so that the data container is striped (or apportioned) across a plurality of volumes configured as a striped volume set (SVS). In an SVS, each volume is serviced by a different storage system. A technique for data container striping is described in commonly-owned U.S. Pat. No. 7,698,289, issued Apr. 13, 2010, entitled “STORAGE SYSTEM ARCHITECTURE FOR STRIPING DATA CONTAINER CONTENT ACROSS VOLUMES OF A CLUSTER”, the disclosure of which is incorporated herein.
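By way of illustration only, the apportioning of a data container across the volumes of an SVS can be sketched as follows. The round-robin placement rule, the function name volume_for_offset, and the parameters stripe_width and num_volumes are assumptions made for this sketch; they are not taken from the referenced patent and do not represent its striping algorithm.

```python
def volume_for_offset(offset: int, stripe_width: int, num_volumes: int) -> int:
    """Return the index of the volume holding the byte at `offset`.

    Assumption: stripes of `stripe_width` bytes are placed on the volumes
    in simple round-robin order. This rule is hypothetical.
    """
    if offset < 0 or stripe_width <= 0 or num_volumes <= 0:
        raise ValueError("offset must be >= 0; stripe_width and num_volumes must be > 0")
    stripe_index = offset // stripe_width   # which stripe the byte falls in
    return stripe_index % num_volumes       # which volume holds that stripe


# Example: a container striped across 3 volumes in 4096-byte stripes.
placements = [volume_for_offset(off, 4096, 3) for off in (0, 4096, 8192, 12288)]
```

Under this assumed rule, consecutive stripes of one data container land on different volumes, and therefore on different storage systems, which is how the load for a single container is spread.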
Referring now to FIG. 1, a storage system architecture comprises a plurality of nodes 200 interconnected as a cluster 100. Each node services a volume (volumes are not shown in FIG. 1). Each node represents a storage system. One or more volumes are distributed across the cluster 100 and are organized as a striped volume set (SVS) (as shown in FIG. 4). The volumes are configured to store content of data containers, such as files and logical units, served by the cluster in response to multi-protocol data access requests issued by clients (e.g., client 180). An exemplary striped volume set 400 is shown in FIG. 4. The SVS includes one metadata volume (MDV) (e.g., 460) storing frequently changed data (e.g., metadata, which includes information describing data containers, including, for example, access control lists and directories, associated with all data containers stored on the SVS) as well as a central directory that stores mappings between a name of a data container and an inode number. An inode number refers to an index of an inode data structure. An inode data structure contains metadata about a data container. This mapping is typically consulted during a common operation such as a lookup request. For example, given a name of a data container, the MDV is searched to find that name and then the inode number associated with it.
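The name-to-inode-number mapping held by the central directory can be sketched, under stated assumptions, as a simple in-memory table. The class name CentralDirectory, its methods, and the sample names and inode numbers are hypothetical and chosen for illustration only; the actual on-disk layout of the MDV is not described here.

```python
from typing import Dict, Optional


class CentralDirectory:
    """Hypothetical in-memory stand-in for the central directory on the MDV."""

    def __init__(self) -> None:
        # Maps a data container name to its inode number.
        self._name_to_inode: Dict[str, int] = {}

    def add(self, name: str, inode_number: int) -> None:
        """Record the mapping between a container name and an inode number."""
        self._name_to_inode[name] = inode_number

    def lookup(self, name: str) -> Optional[int]:
        """Return the inode number for `name`, or None if the name is absent."""
        return self._name_to_inode.get(name)


directory = CentralDirectory()
directory.add("report.txt", 1042)          # illustrative values
found = directory.lookup("report.txt")     # inode number for an existing name
missing = directory.lookup("absent.txt")   # None: no such name in the directory
```

The inode number returned by the lookup is only an index; the metadata about the data container resides in the inode data structure that the number refers to.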
A storage system in the cluster (e.g., cluster 100) supports many incoming requests. Common operations supported by a storage system include a request to create a data container, delete a data container, perform a lookup using a data container name in a central directory (e.g., a data structure on the MDV that stores mappings between a name of a data container and an associated inode number), get metadata of a data container, read from a data container, and write to a data container. The lookup operation is particularly important since it is performed before several other operations. For example, prior to attempting to create a new data container with a data container name, a client issues a lookup request to determine whether the data container name already exists. Once the data container is created, a lookup request is issued to find the created data container within a directory so that multiple write calls can be issued against a data container handle to populate the new data container with data. As described above, during a lookup operation, the mapping between the names of data containers and inode numbers is consulted to determine the inode number associated with a particular data container name. Because the mappings for all data container names are conventionally stored on a single central directory node in a cluster, all common directory operations, such as “read directory”, “lookup”, “create”, “delete”, and “rename”, are serviced by the same node, which introduces a potentially significant bottleneck. This arrangement places a substantial load on the central directory node serving the MDV. Since a lookup operation can be performed more frequently than other operations, optimizing the lookup operation is important.
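The lookup-before-create sequence described above, and the resulting concentration of requests on one node, can be illustrated with the following minimal sketch. The class MdvNode, the function client_create_and_find, the starting inode number, and the per-operation request counter are all hypothetical constructs introduced for this example; they model only the conventional single-directory arrangement and not any particular implementation.

```python
from collections import Counter
from typing import Dict, Optional


class MdvNode:
    """Hypothetical node serving the MDV and its central directory."""

    def __init__(self) -> None:
        self.directory: Dict[str, int] = {}   # name -> inode number
        self.requests: Counter = Counter()    # requests serviced, by operation
        self._next_inode = 100                # arbitrary starting value

    def lookup(self, name: str) -> Optional[int]:
        self.requests["lookup"] += 1
        return self.directory.get(name)

    def create(self, name: str) -> int:
        self.requests["create"] += 1
        if name in self.directory:
            raise FileExistsError(name)
        inode_number = self._next_inode
        self._next_inode += 1
        self.directory[name] = inode_number
        return inode_number


def client_create_and_find(mdv: MdvNode, name: str) -> int:
    """Lookup, create, then lookup again, as in the sequence described above."""
    if mdv.lookup(name) is not None:        # does the name already exist?
        raise FileExistsError(name)
    mdv.create(name)
    inode_number = mdv.lookup(name)         # find the newly created container
    assert inode_number is not None
    return inode_number


mdv = MdvNode()
inode = client_create_and_find(mdv, "new_file.dat")
```

In this sketch a single create by the client produces two lookup requests and one create request, all serviced by the same MdvNode instance, which mirrors why the lookup path on the central directory node carries a disproportionate share of the load.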