As data management applications become more advanced and complicated, the provisioning of storage for the applications in a network storage system becomes increasingly difficult. Conventionally, storage “provisioning” is the allocation of data storage resources to satisfy storage-centric metrics such as data integrity, data redundancy and data availability. Typically, a storage administrator allocates storage resources based upon the needs of an application running on a client system that uses disk space. A storage administrator has to make numerous decisions, such as how to monitor the available space for the storage object, how to schedule data backups, how to configure backups, whether the data should be mirrored, where data should be mirrored, etc.
Answers to the above questions may be summarized in a data management policy. In general, a data management policy includes a description of the desired behavior of the associated data set. For instance, a data management policy may describe how the storage should be used and configured. One exemplary data management policy is a data protection policy, which describes how storage objects in a data set should be protected. Attributes associated with a data management policy are generally specified at the highest node possible.
Additionally, when there are changes, such as in the characteristics of an application, the characteristics of the storage server, the capability of the storage devices, the network topology or the availability of storage in a data center, the administrator may need to revise earlier decisions on storage allocations and take necessary corrective action, such as migrating existing data from existing disks to new disks. The changing of earlier decisions of storage allocation may be complex, time-consuming and error-prone, and may result in storage server unavailability.
Frequently, storage administrators do not have the tools or knowledge to process information such as the changes described above and to take the appropriate actions. Further, even if the tools are available to take corrective actions, the effort and cost involved may deter the administrators from taking the actions. Thus, sub-optimal usage of storage capacity often results, and the performance of an application is subsequently degraded, for example by storage space limitations or data throughput bottlenecks.
Another issue that storage administrators face is the heterogeneity of storage interfaces for each of the different types of storage devices that are deployed in the storage system. For example, a single storage system may include storage devices with different storage interfaces. Each of the storage interfaces has different capabilities that make it very difficult, if not impossible, for storage administrators to configure storage in a consistent and uniform manner.
U.S. Pat. No. 6,801,992 describes creating storage provisioning policies by specifying storage heuristics for storage attributes using storage heuristic metadata. As used here, a storage heuristic is a generalized rule or algorithm, derived from experience, which expresses a relationship between a storage attribute and a performance metric of the storage system. Storage attributes characterize a storage device (e.g., capacity, latency, transfer rate, etc.) and storage heuristic metadata describe how to specify a storage heuristic. Using the storage heuristic metadata, storage heuristics are defined to express a rule or constraint as a function of a discoverable (e.g., software discoverable) storage attribute. A storage profile is a collection of storage heuristics. By including specific storage heuristics in a storage profile, only the storage devices that meet the heuristics are provisioned.
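The relationship between storage attributes, storage heuristics and a storage profile can be illustrated with a minimal sketch. It is not taken from the cited patent: the class names (StorageDevice, StorageHeuristic, StorageProfile), the attribute fields, the threshold values and the sample devices are all hypothetical and chosen only for illustration. The sketch assumes that a heuristic can be modeled as a predicate over a device's discoverable attributes and that a profile admits a device only when every heuristic in it holds.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class StorageDevice:
    # Hypothetical discoverable storage attributes.
    name: str
    capacity_gb: int
    latency_ms: float
    transfer_rate_mbps: int


@dataclass(frozen=True)
class StorageHeuristic:
    # Assumption: a heuristic is a rule expressed as a function of
    # a device's attributes, returning True when the device satisfies it.
    description: str
    rule: Callable[[StorageDevice], bool]


@dataclass
class StorageProfile:
    # A profile is a collection of heuristics.
    heuristics: List[StorageHeuristic]

    def admits(self, device: StorageDevice) -> bool:
        # A device qualifies only if it meets every heuristic in the profile.
        return all(h.rule(device) for h in self.heuristics)


def eligible_devices(profile: StorageProfile,
                     devices: List[StorageDevice]) -> List[StorageDevice]:
    # Keep only the devices that the profile admits.
    return [d for d in devices if profile.admits(d)]


# Illustrative inventory; the values are made up.
devices = [
    StorageDevice("fast-small", 500, 0.5, 1200),
    StorageDevice("slow-large", 8000, 12.0, 150),
    StorageDevice("balanced", 2000, 2.0, 600),
]

profile = StorageProfile([
    StorageHeuristic("capacity of at least 1000 GB",
                     lambda d: d.capacity_gb >= 1000),
    StorageHeuristic("latency of at most 5 ms",
                     lambda d: d.latency_ms <= 5.0),
])

selected = [d.name for d in eligible_devices(profile, devices)]
print(selected)  # ['balanced']
```

In this sketch the profile is evaluated once against a fixed inventory, which mirrors the static character of the approach: nothing re-evaluates the selection when the devices or the workload later change.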
While this approach provides a solution for static storage management, it does not satisfactorily address the integration and dynamic management of data and storage. Contemporary storage systems separate the management of data (e.g., file systems and structured data such as databases) from the management of the underlying physical disks, aggregates and logical unit numbers (LUNs) used to store the data. Data administrators are concerned with the redundancy, performance, persistence and availability of their data. Storage administrators are focused on delivering physical infrastructure that satisfies the data's storage requirements. Typically, the storage resources are provisioned and then, within the constraints of the provisioned storage resources, data management takes place. If the data management needs require that the storage resources be re-provisioned, the process can be very disruptive, perhaps involving many domain-specific administrators who must work closely together to manage the changes.
Modern consolidated single-architecture storage systems (such as the FlexVol™ flexible volume technology available from Network Appliance, Inc. of Sunnyvale, Calif.) provide virtualized features such as space-efficient (e.g., write-anywhere) data replication that result in a storage infrastructure that eliminates much of the incompatibility and inflexibility found in other storage environments. However, even in such environments, data management and storage management remain separate disciplines. For example, a storage administrator may manage flexible volumes and aggregates for provisioning storage using data mirroring tools to make copies of the flexible volumes. A data administrator, on the other hand, manages files and structured data abstracted from physical storage and thinks in terms of copying those files and structured data without regard to the underlying physical storage.
Mapping the data management requirements to the storage management requirements typically requires human interaction and complex, error-prone processes. Moreover, as storage systems grow larger and more complex, the task of provisioning a storage system to meet data management requirements increasingly challenges human capability. Additionally, over time, the load imposed on provisioned storage changes, the relative costs of different storage system resources change and the relative risk of failure of different storage system resources changes. As a result, the state of a current allocation of resources of a storage system drifts away from the goals specified by any initial provisioning decisions.