Life cycle data management may be implemented to increase or maximize the value of previously acquired data and ongoing data collection. Various life cycle data management schemes impose documented decision paths for regulatory review and legal protection. Life cycle data management imposes severe demands for data archival that become increasingly difficult to meet as data set sizes grow. Tape backup is possible, but restoration within an overnight time window is increasingly costly, and faster response is demanded in many situations and conditions.
As the size of disk drives increases and the demand for large data sets grows, a virtualizing disk controller can become a performance and availability bottleneck. Large pools of physical disk storage are served to growing clusters of client hosts through single or dual disk controllers. Controller bandwidth is limited to that of, at most, several Peripheral Component Interconnect eXtended (PCI-X) buses. Furthermore, the controller's mean time between failures (MTBF) lags the data availability demanded by the upward scaling of data set size and client workload.
Several techniques have been used to address mapping limitations on physical disk space for virtualizing controllers. For example, increasing virtualization grain size has been attempted to allow more physical disk space to be mapped without increasing the amount of random access memory. The technique, however, suffers from poor snapshot performance on random write workloads.
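The trade-off between grain size and map memory may be illustrated by a minimal sketch. The flat one-entry-per-grain map, the 8-byte entry size, and the copy-on-write snapshot model are assumptions introduced for illustration only and are not taken from any particular controller.

```python
# Illustrative sketch only. The flat map layout, the 8-byte map entry, and the
# copy-on-write snapshot model are assumptions, not details of a real controller.

KIB = 1024
MIB = 1024 * KIB
TIB = 1024 * 1024 * MIB


def map_memory_bytes(physical_bytes: int, grain_bytes: int, entry_bytes: int = 8) -> int:
    """Memory needed for a flat map holding one entry per virtualization grain."""
    grains = -(-physical_bytes // grain_bytes)  # ceiling division
    return grains * entry_bytes


def snapshot_copy_amplification(write_bytes: int, grain_bytes: int) -> float:
    """Bytes copied per host byte written, assuming the first write to a grain
    after a snapshot copies the entire grain (copy-on-write)."""
    return grain_bytes / write_bytes


if __name__ == "__main__":
    for grain in (64 * KIB, 1 * MIB):
        memory = map_memory_bytes(1 * TIB, grain)
        amplification = snapshot_copy_amplification(4 * KIB, grain)
        print(f"grain={grain // KIB} KiB  map={memory // MIB} MiB  "
              f"copy amplification for 4 KiB writes={amplification:.0f}x")
```

Under these assumptions, a larger grain shrinks the map in proportion to the grain size, while each small random write that first touches a grain after a snapshot copies proportionally more data.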
Adding more ports to disk controllers increases bandwidth, but the industry is now at the limit of fan-out for a multi-drop bus such as PCI-X. Therefore, the addition of more ports is often attained at the expense of a slower clock rate, limiting the potential increase in bandwidth.
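The effect of fan-out on a shared bus may be sketched as follows. The slot-count and clock pairings are commonly cited nominal figures for a 64-bit PCI-X bus and are treated here as assumptions; actual limits depend on the platform design.

```python
# Illustrative sketch only. The slot-to-clock pairings below are commonly cited
# nominal PCI-X figures and are assumptions here, not measured values.

BUS_WIDTH_BYTES = 8  # 64-bit bus

# Number of loads on one bus segment -> nominal clock rate in MHz (assumed).
SLOTS_TO_CLOCK_MHZ = {1: 133, 2: 100, 4: 66}


def peak_bus_bandwidth_mb_s(clock_mhz: int) -> int:
    """Peak transfer rate of the shared bus: one bus-width transfer per clock."""
    return clock_mhz * BUS_WIDTH_BYTES


def per_slot_share_mb_s(slots: int) -> float:
    """Even share of the peak bus rate when every slot is active."""
    return peak_bus_bandwidth_mb_s(SLOTS_TO_CLOCK_MHZ[slots]) / slots


if __name__ == "__main__":
    for slots, clock in SLOTS_TO_CLOCK_MHZ.items():
        print(f"{slots} slot(s) at {clock} MHz: "
              f"bus peak {peak_bus_bandwidth_mb_s(clock)} MB/s, "
              f"per-slot share {per_slot_share_mb_s(slots):.0f} MB/s")
```

Under these assumed figures, adding ports to a single bus segment lowers the total bus rate as well as the share available to each port, so further bandwidth requires additional buses rather than additional drops.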
Disk controllers have contained the metadata for Redundant Array of Independent Disks (RAID) and virtualization constructs, thereby coupling the disk controllers to the data served by the controllers. Accordingly, disk replacement becomes complicated and data migration is prevented.
Dual controller arrangements are commonly used to address mean time between failures (MTBF) and data availability limitations. Dual controller arrangements are typically tightly coupled pairs with mirrored write-back caches. Extending beyond a pair becomes an intractable control problem for managing the mirrored cache. Pairing the controllers roughly squares the hardware MTBF at the expense of common-mode software problems that become significant in a tightly coupled controller architecture.
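The statement that pairing roughly squares the hardware MTBF may be illustrated with the standard reliability model for two identical repairable units in active redundancy. The constant failure and repair rates, the numeric values, and the treatment of common-mode software faults as a single series element are assumptions made for illustration only.

```python
# Illustrative sketch only. Constant failure/repair rates, the example values,
# and the series model for common-mode faults are assumptions.


def pair_mttf_hours(unit_mtbf_hours: float, repair_hours: float) -> float:
    """Mean time to loss of both units for a redundant pair with repair.

    Standard two-unit Markov result: (3*lam + mu) / (2*lam**2), which is
    approximately unit_mtbf**2 / (2 * repair_hours) when repair is fast.
    """
    lam = 1.0 / unit_mtbf_hours  # per-unit failure rate
    mu = 1.0 / repair_hours      # repair rate
    return (3.0 * lam + mu) / (2.0 * lam ** 2)


def with_common_mode(pair_hours: float, common_mode_mtbf_hours: float) -> float:
    """Combine the pair with a common-mode fault source that takes down both
    controllers at once, modeled as an independent element in series."""
    return 1.0 / (1.0 / pair_hours + 1.0 / common_mode_mtbf_hours)


if __name__ == "__main__":
    unit_mtbf = 100_000.0   # assumed hardware MTBF of one controller, hours
    repair = 24.0           # assumed time to replace a failed controller, hours
    common_mode = 50_000.0  # assumed MTBF of faults shared by both controllers

    pair = pair_mttf_hours(unit_mtbf, repair)
    print(f"single controller: {unit_mtbf:.3g} h")
    print(f"redundant pair:    {pair:.3g} h")
    print(f"with common mode:  {with_common_mode(pair, common_mode):.3g} h")
```

In this model the pair's hardware figure grows with the square of the single-unit MTBF, while any fault source shared by both controllers bounds the combined result at roughly its own MTBF, which is why common-mode software problems dominate in a tightly coupled pair.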