A storage server operates on behalf of one or more clients on a network to store and manage data in a set of mass storage devices. A plurality of storage server nodes can be interconnected to provide a storage server environment configured to service many clients. Each storage server node can be configured to service one or more volumes, wherein each volume stores one or more data containers (e.g., files, logical units, etc.). The volumes can be organized as a striped volume set (SVS) and data containers can be striped among the volumes in the SVS so that differing portions of the data container can be stored at different volumes. This, in turn, distributes the data access requests, along with the processing resources needed to service such requests, among all of the storage server volumes.
A file system logically organizes file-related information (such as directories, data containers and blocks) in a tree structure. A directory is an array in which each element is a mapping between an identifier of a data container and its inode number (i.e., an index of an inode data structure storing metadata attributes of the data container). Some of the common operations performed against directories are create (making a new entry in the directory), unlink (removing an existing entry), lookup (finding the inode number that matches a data container name) and readdir (enumerating all entries in the directory).
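By way of illustration only, the four directory operations can be sketched as follows. This is a minimal sketch and not an implementation drawn from any particular file system. The class name `Directory`, its method signatures and the inode numbers used are hypothetical.

```python
class Directory:
    """Toy directory: an array of (identifier, inode_number) entries."""

    def __init__(self):
        # Each element maps a data container identifier to its inode number.
        self._entries = []

    def lookup(self, name):
        """Return the inode number matching `name`, or None if absent."""
        for entry_name, inode_number in self._entries:
            if entry_name == name:
                return inode_number
        return None

    def create(self, name, inode_number):
        """Add a new entry; reject a duplicate identifier."""
        if self.lookup(name) is not None:
            raise FileExistsError(name)
        self._entries.append((name, inode_number))

    def unlink(self, name):
        """Remove the existing entry for `name`."""
        for i, (entry_name, _) in enumerate(self._entries):
            if entry_name == name:
                del self._entries[i]
                return
        raise FileNotFoundError(name)

    def readdir(self):
        """Enumerate all entries in the directory."""
        return list(self._entries)
```

For example, after `create("a.txt", 101)` and `create("b.txt", 102)`, `lookup("a.txt")` returns 101, and a subsequent `unlink("a.txt")` leaves `readdir()` returning only the entry for `b.txt`.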
When a data container is created within a parent directory, an entry is stored within the parent directory. The entry represents a mapping between an identifier of the data container, such as a name, and its inode number. The term “data container identifier” as used herein can refer to the identifier of a data container, a directory, or a subdirectory. Typically, all directory entries are stored on a single volume, which is serviced by one storage server node. The single volume provides all information necessary for resolving a path that includes multiple levels of directory names. A path, as used herein, is a general form of an identifier of a data container or of a directory name, which specifies its unique location in a file system.
When a great number of access requests are issued against a directory serviced by one storage server node, a bottleneck can be created. To resolve the bottleneck, NetApp, Inc.'s new technology stripes identifier-to-inode number mappings across a striped volume set. To stripe a directory, some of the directory's data container identifier-to-inode number mappings are stored on one volume while other mappings are stored on another volume. The distribution of the mappings among the volumes can be based, for example, on a hash value of the data container name; the hash value is used to identify the volume on which the mapping is stored. Thus, the mappings necessary for providing a path to a data container may be distributed across multiple volumes.
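The hash-based distribution can be illustrated with a short sketch. The hash function is not specified above, so CRC-32 reduced modulo the number of volumes is assumed here purely for illustration. The names `volume_for_name` and `StripedDirectory` are hypothetical.

```python
import zlib


def volume_for_name(name: str, num_volumes: int) -> int:
    """Pick the index of the volume that stores the mapping for `name`.

    Assumption: CRC-32 modulo the volume count stands in for whatever
    hash function an actual striped volume set would use.
    """
    return zlib.crc32(name.encode("utf-8")) % num_volumes


class StripedDirectory:
    """Toy directory whose entries are spread over several volumes."""

    def __init__(self, num_volumes: int):
        # One identifier-to-inode-number table per volume.
        self.volumes = [{} for _ in range(num_volumes)]

    def create(self, name: str, inode_number: int) -> None:
        index = volume_for_name(name, len(self.volumes))
        self.volumes[index][name] = inode_number

    def lookup(self, name: str):
        # The same hash identifies the one volume that can hold the mapping.
        index = volume_for_name(name, len(self.volumes))
        return self.volumes[index].get(name)
```

Because the volume is derived from the name alone, a lookup consults exactly one volume, but two names in the same directory may resolve on volumes serviced by different nodes.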
Typically, the path leading to a data container or a directory has multiple path components from the root to the data container or directory itself. The path components (such as directory names) can change during the life cycle of a data container. Such a change needs to be reflected in subsequent path construction when the path of a data container needs to be reported to an application that has registered to receive updates about the data container. To construct a path to a data container or a directory, various path components are traversed in the hierarchical directory structure by identifying inode numbers of the path components. Since in a striped directory the identifier-to-inode number mappings can be stored on different volumes, a storage server node may need to communicate with more than one other node during the path construction process. Some identifiers (e.g., names of the directories near the root level) may need to be resolved repeatedly. Obtaining individual path components may result in storage server nodes issuing multiple remote procedure commands to different volumes. This increased inter-node communication may reduce system performance.
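The cost described above can be made concrete with a small model. The sketch below is an assumption-laden illustration, not a description of any actual storage server: the `Cluster` class, its record layout, the CRC-32 placement rule and the `remote_calls` counter are all hypothetical. It walks from a data container's inode up to the root and counts one remote procedure command for every path component whose mapping resides on a volume other than the local one.

```python
import zlib

NUM_VOLUMES = 4   # hypothetical size of the striped volume set
ROOT_INODE = 1    # hypothetical inode number of the root directory


def volume_for_name(name: str) -> int:
    """Assumed placement rule: CRC-32 of the name modulo the volume count."""
    return zlib.crc32(name.encode("utf-8")) % NUM_VOLUMES


class Cluster:
    """Toy model of path construction over striped directory entries."""

    def __init__(self):
        # inode number -> (parent inode number, name, volume holding the mapping)
        self.records = {}
        self.remote_calls = 0

    def add(self, inode: int, parent: int, name: str) -> None:
        self.records[inode] = (parent, name, volume_for_name(name))

    def fetch_component(self, inode: int, local_volume: int):
        """Return (parent, name); count a remote call if not stored locally."""
        parent, name, volume = self.records[inode]
        if volume != local_volume:
            self.remote_calls += 1
        return parent, name

    def build_path(self, inode: int, local_volume: int) -> str:
        """Collect path components from `inode` up to the root."""
        components = []
        while inode != ROOT_INODE:
            parent, name = self.fetch_component(inode, local_volume)
            components.append(name)
            inode = parent
        return "/" + "/".join(reversed(components))
```

In this model, constructing the paths of two data containers in the same directory fetches every shared ancestor component twice, which is the repeated resolution of near-root names noted above.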
Accordingly, what is needed is a mechanism for reducing the number of path component lookups performed through remote procedure commands in a storage server in which identifier-to-inode number mappings are striped across volumes.