Data storage systems may comprise one or more drives connected to one or more drive controllers, which in turn are connected to a host or network interface. Each component of the storage system, such as a drive, controller, connector, or wire, is a potential point of failure in the system. Some systems, such as personal computers, for example, may lose access to data in the event of a failure of a controller, bus, or connector. Restoring access to data may require that the failed component be repaired or replaced, or that a drive be installed in another system. Failure of a drive itself usually results in loss of the stored data. Larger storage systems may employ redundancy methods such as RAID to distribute data across a plurality of drives such that data is not lost in the event of a single drive failure. In a RAID system, data from the failed drive may be copied from a mirror drive, or the data may be reconstructed from the data and parity information on the remaining functioning drives. After the failure of a drive or controller, the system often operates in a reduced-performance condition until the failed components are replaced or repaired. Failure of a bus may require removal of drives and installation of the drives in another fixture or system in order to access the data.
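The parity-based reconstruction mentioned above can be illustrated with a minimal sketch, assuming a RAID-5-like scheme in which the parity block is the bytewise XOR of the data blocks; the function names and block contents here are hypothetical, chosen only for illustration:

```python
from functools import reduce

def parity(blocks):
    """Form the parity block as the bytewise XOR of all blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def reconstruct(surviving_blocks, parity_block):
    """Rebuild a single missing block: XOR of survivors and parity."""
    return parity(surviving_blocks + [parity_block])

# Hypothetical data blocks, one per drive.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
p = parity(data)

# Simulate losing the second drive and rebuilding its block from
# the surviving data blocks and the parity block.
rebuilt = reconstruct([data[0], data[2]], p)
assert rebuilt == data[1]
```

Because XOR is its own inverse, the same operation that generates parity also recovers any one lost block, which is why a single drive failure need not cause data loss.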
Fault tolerance, storage capacity, operating life, and data availability are key contributors to the value of a storage system. Fault tolerance may be expressed as the number of failures (both sequential and simultaneous) of drives, controllers, and buses that may be incurred while still maintaining data integrity and data access. Storage capacity reflects the number of drives, the capacity of each drive, and the data encoding methods used. As the number of drives increases, the number of interconnections, and thus the likelihood of a failure, increases. Storage system operating life reflects the longevity of the components and the level of fault tolerance of the system. Spare drives may be employed to store copied or reconstructed data, extending operation of the system after the failure of a drive. Data availability may be expressed in terms of data transfer rates, fault tolerance, and system performance following the failure of one or more components.
The commercial viability of a storage system reflects the architectural decisions and component selections made by the designer to provide a desired level of fault tolerance, storage capacity, operating life, and data availability. Components with very long MTBF (mean time between failures) ratings improve reliability but may adversely affect system cost.
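The trade-off between component MTBF and system reliability can be sketched numerically. Assuming independent components with exponentially distributed failures, the failure rates of components in series add, so the system MTBF is the reciprocal of the sum of the reciprocal component MTBFs; the component values below are hypothetical:

```python
def series_mtbf(component_mtbfs):
    """MTBF of a system that fails when any one component fails.

    Under an exponential failure model, each component contributes a
    failure rate of 1/MTBF, and the rates of series components add.
    """
    return 1.0 / sum(1.0 / m for m in component_mtbfs)

# Hypothetical component MTBFs in hours: a drive, a controller, a bus.
system_mtbf = series_mtbf([1_000_000, 2_000_000, 4_000_000])
print(round(system_mtbf))  # roughly 571,429 hours
```

Note that the system MTBF is lower than that of any single component, which is why adding drives and interconnections, as discussed above, increases the likelihood of a failure unless redundancy compensates.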