A networked storage system may include one or more storage servers, which may be storage appliances. A storage server may provide services related to the organization of data on mass storage devices, such as disks. Some of these storage servers are commonly referred to as filers or file servers. An example of such a storage server is any of the Filer products made by Network Appliance, Inc. in Sunnyvale, Calif. The storage appliance may be implemented with a special-purpose computer or a general-purpose computer. Depending on the application, various networked storage systems may include different numbers of storage servers.
In some existing systems, in order to provide higher availability of storage server services, two storage servers may be utilized to operate as a clustered storage server system. Specifically, each storage server in a clustered storage server system (such servers are sometimes referred to as nodes or cluster partners) can take over for the other storage server in the event of a failover situation. The mode of operation in which requests directed to one cluster partner are serviced by the other cluster partner, when the first cluster partner is in a failure state or off line, is referred to as a takeover mode. In the takeover mode, input/output (I/O) traffic can continue as if the off-line partner storage server were still present and functioning normally. In order to start operating in the takeover mode, the surviving storage server has to mount the volumes of its partner storage server. A volume is a logical data set, which is an abstraction of physical storage, combining one or more physical mass storage devices (e.g., disks) or parts thereof into a single logical storage object, and which is managed as a single administrative unit, such as a single file system.
During the mount process, metadata about each volume is retrieved from the disk subsystem. In some systems, the metadata that is accessed in a volume's mount path comprises many blocks, so mounting the volume may require many disk I/Os.
The takeover mode may be terminated when the partner storage server that has been experiencing failure is brought back on line. The storage server that was operating in the takeover mode may be requested (e.g., by a command issued by an administrator) to relinquish control over the mass storage devices designated as serviceable by the other storage server partner. Such an operation is known as a giveback operation. The other storage server partner then needs to mount all of its volumes, which includes accessing the on-disk metadata that is in the mount path of each of its volumes.
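The takeover and giveback transitions described above can be illustrated with a minimal sketch of a two-node cluster. This is only an illustration under simplifying assumptions, not the behavior of any particular product: the names `StorageServer`, `takeover`, `giveback`, and `serving_node` are hypothetical, and a mount is modeled as adding a volume name to a set rather than reading on-disk metadata.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class StorageServer:
    """Hypothetical model of one cluster partner."""
    name: str
    own_volumes: list[str]
    online: bool = True
    mounted: set[str] = field(default_factory=set)

    def mount(self, volumes: list[str]) -> None:
        # A real mount would read each volume's on-disk metadata here.
        self.mounted.update(volumes)

    def unmount(self, volumes: list[str]) -> None:
        self.mounted.difference_update(volumes)


def takeover(survivor: StorageServer, failed: StorageServer) -> None:
    """Survivor starts servicing the failed partner's volumes."""
    failed.online = False
    failed.mounted.clear()
    survivor.mount(failed.own_volumes)


def giveback(survivor: StorageServer, recovered: StorageServer) -> None:
    """Survivor relinquishes the partner's volumes; partner remounts them."""
    survivor.unmount(recovered.own_volumes)
    recovered.online = True
    recovered.mount(recovered.own_volumes)


def serving_node(volume: str, a: StorageServer, b: StorageServer) -> str | None:
    """Return the name of the online node that currently has the volume mounted."""
    for node in (a, b):
        if node.online and volume in node.mounted:
            return node.name
    return None
```

In this sketch, after `takeover(a, b)` a request for one of the volumes of `b` resolves to `a`, and after `giveback(a, b)` it resolves to `b` again. Both transitions involve mounting every volume of one partner, which is the step whose cost is discussed next.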
As the number of volumes that may be hosted by storage servers increases, the number of disk I/Os necessary for mounting the storage server's volumes also increases. Therefore, the time required to complete takeover and giveback operations in a clustered storage server system increases as the number of volumes increases for each partner storage server, because during takeover and giveback transitions a node has to mount the volumes of its partner.
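The scaling argument above can be made concrete with a rough cost model. The sketch below assumes, purely for illustration, that volumes are mounted one after another, that each metadata block in the mount path costs one disk I/O, and that every I/O takes the same time; the function name and the figures in the usage note are hypothetical and are not taken from any measured system.

```python
def mount_time_ms(num_volumes: int, blocks_per_volume: int, io_ms: int) -> int:
    """Estimate total mount time in milliseconds.

    Assumes serial mounts and one disk I/O per metadata block,
    with a fixed cost of io_ms milliseconds per I/O.
    """
    total_ios = num_volumes * blocks_per_volume
    return total_ios * io_ms
```

Under these assumptions, `mount_time_ms(100, 50, 5)` gives 25000 ms, and doubling the number of volumes to 200 doubles the estimate to 50000 ms. The linear growth is the point of the model: with the per-volume metadata cost held fixed, takeover and giveback time grows in proportion to the number of volumes the partner hosts.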