The need to scale the capacity, availability, and performance of datasets across multiple direct-access storage devices (DASDs) led to the development of Redundant Array of Inexpensive (or Independent) Disks (RAID) technology in the late 1980s, and to the implementation of storage controllers that offer RAID-based logical disk abstractions. These storage controllers are typically computer servers attached to a large number of DASDs via a peripheral I/O interconnect. They form RAID arrays by combining groups of DASDs and subsequently create and export logical disk abstractions over these RAID arrays. RAID technology protects against data loss due to DASD failure by storing data redundantly across multiple DASDs and by transparently reconstructing lost data onto spare DASDs in case of failure. Depending on the degree of overall storage controller availability desired (which directly affects cost), storage vendors have several options regarding the reliability and redundancy of components used when designing storage controllers. Besides the reliability of hardware components, the quality of the software that implements failure recovery actions is important to the overall availability level provided by a storage controller.

RAID technology is one of many approaches to using data redundancy to improve the availability, and potentially the performance, of stored data sets. Data redundancy can take multiple forms. Depending on the level of abstraction in the implementation, one can distinguish between block-level redundancy and volume-level replication. Block-level redundancy can be performed using techniques such as block mirroring (RAID Level 1), parity-based protection (RAID Level 5), or erasure coding. See R. Bhagwan et al., "Total Recall: System Support for Automated Availability Management," in Proc. of USENIX Conference on Networked Systems Design and Implementation '04, San Francisco, Calif., March 2004.
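The parity-based protection mentioned above can be illustrated with a minimal sketch: in a RAID Level 5 stripe, the parity block is the bitwise XOR of the data blocks, so any single lost block can be rebuilt by XOR-ing the survivors. The function names and the fixed stripe layout below are illustrative assumptions, not any controller's actual implementation; real controllers rotate parity across DASDs and operate on fixed-size sectors.

```python
def parity(blocks):
    """XOR a list of equal-length byte blocks into a single parity block."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)

def reconstruct(surviving_blocks, parity_block):
    """Recover one lost data block by XOR-ing the survivors with the parity."""
    return parity(surviving_blocks + [parity_block])

# Example: a 3-DASD stripe holding 2 data blocks plus 1 parity block.
d0, d1 = b"\x01\x02\x03\x04", b"\x10\x20\x30\x40"
p = parity([d0, d1])
# If the DASD holding d0 fails, its contents are rebuilt transparently:
assert reconstruct([d1], p) == d0
```

The same XOR identity underlies the transparent reconstruction onto spare DASDs described above: the controller reads the surviving members of each stripe and writes the recomputed block to the spare.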
Block-level redundancy operates below the storage volume abstraction and is thus transparent to system software layered over that abstraction. In contrast, volume-level replication, which involves maintaining one or more exact replicas of a storage volume, is visible to (and thus must be managed by) system software layered over the storage volume abstraction. Known technologies to perform volume-level replication include, e.g., FlashCopy® computer hardware and software for data warehousing, for use in the field of mass data storage, from International Business Machines Corporation, and Peer-to-Peer Remote Copy (PPRC).
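The distinction can be sketched as follows: because volume-level replication sits above the volume abstraction, the software managing it must explicitly propagate each write to every replica. The class and method names below are hypothetical, and the synchronous mirroring shown is only one possible policy (analogous in spirit to synchronous PPRC), not a description of any product's API.

```python
class Volume:
    """A trivially simplified block volume: an array of fixed-size blocks."""
    def __init__(self, nblocks, block_size=512):
        self.blocks = [bytes(block_size) for _ in range(nblocks)]

    def write(self, lba, data):
        self.blocks[lba] = data

    def read(self, lba):
        return self.blocks[lba]

class ReplicatedVolume:
    """Volume-level replication managed above the volume abstraction:
    reads are served by the primary, writes go to primary and replica."""
    def __init__(self, primary, replica):
        self.primary = primary
        self.replica = replica

    def write(self, lba, data):
        self.primary.write(lba, data)   # apply to the primary volume
        self.replica.write(lba, data)   # synchronously mirror to the replica

    def read(self, lba):
        return self.primary.read(lba)

vol = ReplicatedVolume(Volume(8), Volume(8))
vol.write(3, b"payload")
assert vol.replica.read(3) == b"payload"  # replica holds an exact copy
```

Block-level redundancy, by contrast, would live inside the `Volume` implementation itself, invisible to any software using the `read`/`write` interface.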
Manual availability management in large data centers can be error prone and expensive and is thus not a practical solution. RAID (see D. Patterson et al., "A Case for Redundant Arrays of Inexpensive Disks (RAID)," Proceedings ACM SIGMOD, Chicago, June 1988) systems, which employ data redundancy to offer increased availability levels over groups of DASDs, operate in a mostly reactive manner and are typically not goal-oriented. Also, they may not easily extend from single controllers to systems of multiple storage controllers.

The Change Management with Planning and Scheduling (CHAMPS) system, described in A. Keller et al., "The CHAMPS System: Change Management with Planning and Scheduling," IBM Technical Report 22882, Aug. 25, 2003, is concerned with how a given change (e.g., a software upgrade of a component) in a distributed system affects other system components and with how to efficiently execute such a change by taking advantage of opportunities for parallelism. CHAMPS tracks component dependencies and exploits parallelism in the task graph. While representing a substantial advance in the art, CHAMPS may have limitations regarding consideration of service availability and regarding data availability in distributed storage systems.
There is little prior work on automated availability management systems in environments involving multiple, heterogeneous storage controllers. The Hierarchical RAID (HiRAID) system (see S. H. Baek et al., "Reliability and Performance of Hierarchical RAID with Multiple Controllers," in Proc. 20th ACM Symposium on Principles of Distributed Computing (PODC 2001), August 2001) proposes layering a RAID abstraction over RAID controllers and handling change simply by masking failures using RAID techniques. HiRAID may not be optimally goal-oriented and may focus on DASD failures only (i.e., as if the DASDs attached to all storage controllers were part of a single DASD pool). It may not take into account the additional complexity and heterogeneity of the storage controllers themselves and thus may not be appropriate in some circumstances.
Other approaches may also inadequately characterize storage controller availability. For example, Total Recall (see R. Bhagwan et al., "Total Recall: System Support for Automated Availability Management," in Proc. of USENIX Conference on Networked Systems Design and Implementation '04, San Francisco, Calif., March 2004) characterizes peer-to-peer storage node availability simply based on past behavior and treats all nodes as identical in terms of their availability profiles; it is thus more appropriate for Internet environments, which are characterized by simple storage nodes (e.g., desktop PCs) and large "churn," i.e., large numbers of nodes going out of service and returning to service at any time, rather than enterprise environments and generally heterogeneous storage controllers. Another related approach applies Decision Analysis theory to the design of archival repositories. See A. Crespo and H. Garcia-Molina, "Cost-Driven Design for Archival Repositories," Proceedings of the 1st ACM/IEEE-CS Joint Conference on Digital Libraries, Roanoke, Va., 2001. This is a simulation-based design framework for evaluating alternatives among a number of possible configurations and choosing the best alternative in terms of reliability and cost. Prior work within this framework, however, has not addressed the heterogeneity and complexity issues in large scale storage systems or the problem of storage volume placement on a set of storage controllers.
Existing provisioning systems such as IBM's Volume Performance Advisor (VPA) primarily take into account capacity and performance considerations when recommending volume allocations. While VPA represented a substantial advance in the art, it may not have appropriate provision for availability goals.
It would thus be desirable to overcome the limitations in previous approaches.