1. Field of the Invention
This invention relates to predicting storage device failure and more particularly relates to setting a predictive failure threshold for predicting failure in response to a technology descriptor.
2. Description of the Related Art
A data storage system typically comprises a plurality of storage devices such as hard disk drives, optical storage drives, magnetic tape drives, micromechanical devices, semiconductor devices, and the like. The data storage system may provide data storage for one or more hosts. Each host may store data to or retrieve data from the data storage system over a communications medium such as a network, an internal bus, or the like. The data storage system may store the data to or retrieve the data from one or more storage devices. Storage devices may be added to or removed from the data storage system to provide sufficient data storage capacity for the hosts.
The data storage system may be organized to store data redundantly. For example, the data storage system may maintain a copy of or mirror data from a first storage device on a second storage device. The mirrored data may be accessed from the second storage device if the first storage device fails.
The data storage system may also be organized as a redundant array of independent disks (“RAID”) system as is well known to those skilled in the art. In a RAID data storage system, data may be stored in stripes across a plurality of storage devices in a redundant form. If one of the plurality of storage devices fails, the data storage system may recover the data from the other storage devices.
The data storage system may attempt to determine if a storage device is likely to fail so that proactive actions may be taken to protect the data. For example, the data storage system may periodically test each storage device. If a storage device satisfies failure criteria, the data storage system may migrate data from the storage device and notify an administrator that the storage device should be replaced or subjected to further tests.
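The periodic test described above can be sketched as follows. This is a minimal illustration only: the `DeviceStatus` record, the `error_count` metric, and the `ERROR_THRESHOLD` value are hypothetical names and numbers chosen for the example and are not taken from any particular data storage system.

```python
from dataclasses import dataclass


@dataclass
class DeviceStatus:
    name: str
    error_count: int  # hypothetical metric gathered by a periodic test


# Hypothetical failure criterion: flag a device at or above this error count.
ERROR_THRESHOLD = 10


def satisfies_failure_criteria(status, threshold=ERROR_THRESHOLD):
    """Return True if the device's metric meets the failure criterion."""
    return status.error_count >= threshold


def check_devices(statuses, threshold=ERROR_THRESHOLD):
    """Return the names of devices whose data should be migrated
    and whose condition should be reported to an administrator."""
    return [s.name for s in statuses if satisfies_failure_criteria(s, threshold)]


flagged = check_devices([DeviceStatus("disk0", 2), DeviceStatus("disk1", 14)])
print(flagged)
```

In this sketch a single threshold is applied to every storage device regardless of its characteristics.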
Data storage systems have typically employed high-reliability, high-cost (“HRHC”) storage devices, but recently data storage systems are also employing high-capacity, low-cost (“HCLC”) storage devices. Unfortunately, an HCLC storage device may have the potential to fail when exhibiting a different set of criteria from an HRHC storage device. Yet if an HCLC storage device satisfies the HRHC failure criteria without satisfying failure criteria appropriate to the HCLC storage device, the data storage system may still recognize a potential failure of the HCLC storage device. As a result, the data storage system may migrate data from the HCLC storage device and take the HCLC storage device offline. Unfortunately, migrating data from the HCLC storage device may impact the performance of the data storage system, while taking the HCLC storage device offline for maintenance or replacement increases maintenance costs.
In addition, the HCLC storage device may be workload managed to reduce the stress on the HCLC storage device. For example, the duty cycle of the HCLC storage device may be reduced to prevent excessive wear to the HCLC storage device. The duty cycle may be the percentage of time the storage device is executing an operation, as is well known to those skilled in the art. Unfortunately, the workload-managed HCLC storage device may be more likely to satisfy failure criteria. If the data storage system recognizes a potential failure in the workload-managed HCLC storage device, the data storage system may migrate data from the workload-managed HCLC storage device and take the HCLC storage device offline, reducing data storage system performance and increasing maintenance costs.
From the foregoing discussion, it should be apparent that a need exists for an apparatus, system, and method that predicts storage device failures based on the characteristics of the storage device. Beneficially, such an apparatus, system, and method would predict failure based on the technology of each storage device, and reduce the number of operational storage devices erroneously removed from a data storage system.