The growth in digital information is accelerating, and an increasing number of disk drives are required to store that information. Because digital information depends on disk drives for its storage, disk drive reliability analysis, which is a general term for the monitoring and “learning” process of disk drive prior-to-failure patterns, is a highly explored domain both in academia and in industry.
For example, disk drive manufacturers equip both hard disk drives (HDD) and solid state drives (SSD) with Self-Monitoring, Analysis and Reporting Technology (SMART), an industry standard technology for detecting and reporting indicators of drive reliability. Examples of commonly used indicators, referred to as SMART attributes, include reallocated sectors count, reported uncorrectable errors, power-on hours, read-write errors, and so forth.
SMART attributes' thresholds, which are the attributes' values that should not be exceeded under normal operation, are set individually by manufacturers by means that are often considered a trade secret. Since there are more than one hundred SMART attributes whose interpretation is not always consistent across vendors, rule-based learning of disk drive failure patterns is quite complicated and cumbersome. The reactive nature of rule-based learning limits the accuracy and timeliness (i.e., how far in advance of an actual failure a prediction is made) of any rule-based failure prediction.
To improve failure prediction, other approaches apply statistical and machine learning models to SMART attributes, but the results have been mixed. Some machine learning models have succeeded in improving the prediction of failure rates of drives in general, but accurate prediction of failures of individual drives is still elusive. Other challenges, such as reducing the number of false positives, remain to be overcome.
The inability to accurately predict drive failures increases the likelihood of data loss or interruption for customers that rely on digital information. Furthermore, it hampers storage providers' efforts to devise an optimized drive replacement strategy that could reduce costs for storage providers and customers alike.