A storage server provides access to data that is stored on one or more storage devices connected to the storage server, such as disk drives (“disks”), flash memory, or other storage devices. A storage server may be configured to operate according to a client/server model of information delivery to allow many clients to access data stored on the storage server. In this model, the client may comprise an application executing on a computer that “connects” to the storage server over a computer network, such as a point-to-point link, shared local area network, wide area network, or virtual private network implemented over a public network, such as the Internet. A client may access the storage devices by submitting access requests to the storage server, for example, a “write” request to store client data included in the request on the storage devices or a “read” request to retrieve client data from the storage devices.
A storage server includes an operating system that may implement a file system to logically organize information as a hierarchical structure of logical storage units, such as directories and files, on a storage device (e.g., disks). Each file may be implemented as a set of data structures, e.g., disk blocks, configured to store information, such as the actual data for the file.
In a network storage system, multiple storage servers are networked or otherwise connected together to provide access to data stored on the storage devices connected to the storage servers. In this configuration, a collection of physical storage structures (e.g., a disk, a data block, etc.) or logical storage structures (e.g., a file, a directory, a volume, etc.) can be spread across one or more storage servers, and each such structure may be referred to, individually or in combination, as a “storage object”. Storage objects are created by a storage administrator, who may also decide how to protect data in storage objects in the event of data corruption, accidental data deletion, or disk failures.
Protection objectives may be summarized in a protection policy, which describes a data redundancy setup. In particular, a protection policy may describe what data to replicate, when to replicate data, what replication techniques to employ, etc. In a data redundancy setup, storage objects may be organized into one or more logical units, each unit referred to as a “dataset”, so that a protection policy can be applied to the dataset to configure and manage the underlying dataset resources uniformly. For example, such configuration and management may include storage server operations such as listing storage objects, adding storage objects, generating storage usage reports, and other operations which can be performed on a dataset. Storage objects constituting a dataset participate in effectuating a protection policy based on a storage object configuration, which describes the storage object(s) that store client data, the storage object(s) that store replicated data, and the type of replication relationship between storage objects.
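The relationship between storage objects, datasets, and replication relationships described above can be pictured with a small data model. The following is only an illustrative sketch: the class names, field names, and example values (`StorageObject`, `Relationship`, `Dataset`, `vol_primary`, and so on) are hypothetical and are not taken from any particular storage product.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ReplicationType(Enum):
    """The two replication techniques discussed in this section."""
    BACKUP = "backup"
    MIRROR = "mirror"


@dataclass(frozen=True)
class StorageObject:
    name: str    # hypothetical identifier, e.g. a volume name
    server: str  # hypothetical name of the hosting storage server


@dataclass(frozen=True)
class Relationship:
    source: StorageObject        # holds the data being protected
    destination: StorageObject   # holds the replicated data
    replication: ReplicationType


@dataclass
class Dataset:
    """A logical unit of storage objects managed under one policy."""
    name: str
    members: List[StorageObject] = field(default_factory=list)
    relationships: List[Relationship] = field(default_factory=list)

    def add_storage_object(self, obj: StorageObject) -> None:
        self.members.append(obj)

    def list_storage_objects(self) -> List[str]:
        return [member.name for member in self.members]


if __name__ == "__main__":
    primary = StorageObject("vol_primary", "server_a")
    replica = StorageObject("vol_replica", "server_b")
    dataset = Dataset("example_dataset")
    dataset.add_storage_object(primary)
    dataset.add_storage_object(replica)
    dataset.relationships.append(
        Relationship(primary, replica, ReplicationType.BACKUP)
    )
    print(dataset.list_storage_objects())
```

In this sketch, the storage object configuration is simply the list of members together with the list of relationships; a real system would carry considerably more state.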
One conventional technique in data replication is a backup, which may be a read-only, persistent, point-in-time image of data often referred to as a snapshot. In certain instances, a backup may be a copy of data, pointers to data blocks storing client data, or incremental changes to client data. In the case of snapshots, a backup provides the ability to quickly revert the state of data to a known previous point in time by virtue of being a point-in-time image. However, a backup of this type may not be an effective data protection mechanism in disaster recovery since such a backup typically involves only the incremental changes in data since a backup was last created. Thus, if the underlying data has been lost, a point-in-time image cannot serve as a replacement for the underlying data.
A second technique for data replication, mirroring, is therefore preferable in disaster recovery. A mirror provides an actual copy of the underlying data and/or the file system that organizes the data. Mirrored data can therefore be accessed to service client requests if the underlying data is no longer available. However, mirroring requires additional storage for such copied data and thus is less space efficient than backups. As used herein, replicated data is either a backup or a mirror of data being protected. Data being protected may be either client data or other replicated data (e.g., a backup to be further replicated).
At certain times, a storage administrator may desire to modify the protection level of data by changing the protection policy. For example, a storage administrator may decide that certain data requires an increased level of protection. The storage administrator may then select a new protection policy corresponding to a higher level of protection (e.g., mirroring instead of backing up), which involves a new storage object configuration. When a storage administrator changes the type of replication involved, typically a new relationship between storage objects must be established. One reason is that the underlying physical resources (e.g., storage servers) use specific communication protocols to facilitate the transfer of data between such resources.
A communication protocol includes instructions which direct where data should be transferred and the type of data to transfer. For instance, a Qtree SnapMirror (QSM) protocol (developed by NetApp, Inc. of Sunnyvale, Calif.) facilitates the transfer of backup data between storage servers, whereas a volume SnapMirror (VSM) protocol (also developed by NetApp, Inc. of Sunnyvale, Calif.) facilitates the transfer of mirror data between storage servers. Thus, when the data replication type changes or a new storage object is included in a new protection policy, a new relationship must first be established between storage objects, and then data can be transferred to the new storage object to effectuate the new protection policy.
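The rule stated above, that a new relationship is needed when the replication type changes or a new storage object joins the policy, can be expressed as a short check. This is a minimal sketch under stated assumptions: the function name, the string labels, and the use of plain object names are hypothetical, and the mapping records only the pairing of protocols and replication types given in the text.

```python
# Pairing of replication type and transfer protocol, as described in the text.
PROTOCOL_FOR_REPLICATION = {
    "backup": "QSM",  # Qtree SnapMirror
    "mirror": "VSM",  # volume SnapMirror
}


def needs_new_relationship(old_type, new_type, old_objects, new_objects):
    """Return True if the policy change requires a new relationship.

    old_objects and new_objects are iterables of hypothetical storage
    object names participating in the old and new configurations.
    """
    type_changed = old_type != new_type
    objects_added = bool(set(new_objects) - set(old_objects))
    return type_changed or objects_added


if __name__ == "__main__":
    # Switching from backup to mirror changes the protocol in use.
    print(PROTOCOL_FOR_REPLICATION["backup"], "->",
          PROTOCOL_FOR_REPLICATION["mirror"])
    print(needs_new_relationship("backup", "mirror", ["vol_a"], ["vol_a"]))
```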
Conventionally, the storage administrator manually architects a configuration for storage objects so that instead of participating in effectuating the existing protection policy, they may participate in effectuating a new protection policy. In certain cases, additional storage objects may participate in the new configuration. After a new configuration is architected, the storage administrator supplies the new configuration to a storage server via an interface to the network storage system, and causes the storage server to implement the new policy.
Special attention must be given to the storage object configuration under the new policy, however, since changing a protection policy typically has a large and critical impact on a network storage system's resources and the usage of those resources. In particular, if the storage administrator architects and implements a poorly conceived configuration, protection of client data may be affected, as well as the ability of a network storage system to service client requests. Thus, with conventional approaches, data protection and network storage system performance may depend on the skill and experience of the storage administrator.
One shortcoming of the manual technique is that a storage administrator may implement a configuration which replicates data to a storage object with inadequate storage space. In the event of a disaster, data may be irretrievably lost if a mirror or copy of the data is not available as a result of inadequate storage space.
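The kind of check that a manual process can easily omit is a comparison of the space needed by the replicated data against the free space on the destination. The following is a hedged sketch only: the function name, parameters, and the optional headroom factor are assumptions introduced for illustration.

```python
def has_adequate_space(source_used_bytes, destination_free_bytes, headroom=0.0):
    """Return True if the destination can hold a full copy of the source.

    headroom is a hypothetical safety margin expressed as a fraction of
    the source size (0.1 means 10% extra space is required).
    """
    if source_used_bytes < 0 or destination_free_bytes < 0 or headroom < 0:
        raise ValueError("sizes and headroom must be non-negative")
    required_bytes = source_used_bytes * (1.0 + headroom)
    return destination_free_bytes >= required_bytes


if __name__ == "__main__":
    # A 500 GB source does not fit on a destination with 400 GB free.
    print(has_adequate_space(500 * 10**9, 400 * 10**9))
```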
The insufficiency of the manual technique is further evident when a storage administrator selects a new protection policy having a mirror instead of a backup. A new relationship must be established between storage objects before data can be mirrored. After the new relationship is established, data can then be copied and transferred from one storage server to another, a process referred to as a “rebaseline.” However, a rebaseline may adversely impact network speed and performance if a large amount of data is copied and transferred across the network between storage objects in remote physical locations. Since a rebaseline may take days or even weeks, depending on the amount of data copied and the speed of the network between the storage servers, a storage administrator must plan carefully when architecting the storage object configuration for the new protection policy.
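A rough estimate shows why a rebaseline can take days. The sketch below divides the amount of data by the usable network throughput; the function name, the efficiency parameter, and the example figures (10 TB over a 100 Mbit/s link) are illustrative assumptions, not measurements from any system.

```python
SECONDS_PER_DAY = 86_400


def estimate_rebaseline_seconds(data_bytes, link_bits_per_second, efficiency=1.0):
    """Estimate transfer time for copying data_bytes over a network link.

    efficiency is a hypothetical factor in (0, 1] for the share of the
    nominal link speed actually available to the transfer.
    """
    if link_bits_per_second <= 0:
        raise ValueError("link speed must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return data_bytes * 8 / (link_bits_per_second * efficiency)


if __name__ == "__main__":
    seconds = estimate_rebaseline_seconds(10 * 10**12, 100 * 10**6)
    print(f"{seconds / SECONDS_PER_DAY:.1f} days")
```

Under these assumed figures the transfer takes 800,000 seconds, or roughly nine days, before accounting for protocol overhead or competing traffic.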