1. Field of the Invention
The present invention relates to disk drives for computer systems. More particularly, the present invention relates to using a genetic algorithm to select a subset of quality metrics as input to a disk drive failure prediction algorithm.
2. Description of the Prior Art
Disk drive failure prediction is used primarily to detect marginal disk drives that are progressing toward catastrophic failure while in the field so that a warning can be issued to back up the disk drive and/or return the disk drive to the manufacturer for repair. Another suggested application for drive failure prediction is to screen out marginal disk drives from the manufacturing line to avoid the expensive (time-consuming) final testing stage. The performance of the drive failure prediction algorithm depends on the ability to correctly identify marginal disk drives while minimizing the number of good disk drives falsely detected as marginal disk drives (false alarms). In the case of field failures, minimizing false alarms reduces the cost associated with returning a good disk drive to the manufacturer for repairs, and in the case of manufacturing failures, minimizing false alarms improves yield.
A well-known drive failure prediction system for detecting marginal disk drives while in the field is Self-Monitoring, Analysis and Reporting Technology (SMART). The SMART system, implemented internal to each disk drive, monitors a number of quality metrics (e.g., head fly height, retries, servo errors, etc.) indicative of impending failure. If any one of the quality metrics crosses a predetermined threshold (OR operation), a failure warning is issued. However, issuing a failure warning if any one of the quality metrics crosses a threshold provides sub-optimal prediction performance since it does not take into account the correlations of the quality metrics. For example, if the head fly height falls below a predetermined threshold indicating a potential failure but the other quality metrics do not confirm the failure, a false failure warning may be issued and the disk drive returned unnecessarily to the manufacturer for repairs.
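By way of illustration only, the OR operation described above can be sketched as follows. The metric names, threshold values, and the `Threshold` helper are hypothetical and are not taken from any SMART implementation; the sketch merely shows why a single out-of-range metric is sufficient to raise a warning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Hypothetical per-metric limit; not part of any real SMART interface."""
    limit: float
    fail_when_below: bool = False  # True for metrics such as head fly height

    def tripped(self, value: float) -> bool:
        # A metric trips when it moves past its limit in the failing direction.
        if self.fail_when_below:
            return value < self.limit
        return value > self.limit


# Illustrative thresholds only (assumed values and units).
THRESHOLDS = {
    "fly_height": Threshold(limit=10.0, fail_when_below=True),
    "retries": Threshold(limit=50.0),
    "servo_errors": Threshold(limit=20.0),
}


def or_rule_warning(metrics: dict) -> bool:
    """Issue a warning if ANY monitored metric trips its threshold (OR operation)."""
    return any(
        THRESHOLDS[name].tripped(value)
        for name, value in metrics.items()
        if name in THRESHOLDS
    )


# Only the fly height is out of range, yet a warning is still issued.
suspect = {"fly_height": 9.0, "retries": 3.0, "servo_errors": 1.0}
healthy = {"fly_height": 12.0, "retries": 3.0, "servo_errors": 1.0}
```

In this sketch `or_rule_warning(suspect)` returns `True` even though the retry and servo error counts are nominal, which is the false alarm scenario described above.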
Various prior art references have suggested ways of improving drive failure prediction by combining the quality metrics into a composite score that provides a better indication of impending failure. For example, an article by G. F. Hughes, J. F. Murray, K. Kreutz-Delgado, and C. Elkan entitled “Improved Disk Drive Failure Warnings”, IEEE Transactions on Reliability, September 2002, suggests evaluating each quality metric statistically using a rank sum test, and combining the rank sums generated for each quality metric into a composite score representing an overall quality of the disk drive. U.S. Pat. No. 5,737,519 suggests evaluating the quality metrics using a pattern recognition system together with a fuzzy inferencing system in order to detect patterns of quality metrics that are indicative of failure, as opposed to evaluating each quality metric independently. U.S. Pat. No. 6,574,754 suggests evaluating the quality metrics using a neural network to generate a composite score and adapting the neural network using data obtained from field experience.
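By way of illustration only, the general idea of a rank sum test feeding a composite score can be sketched as follows. This is not the procedure of the cited article: the normal approximation without tie correction, the assumption that larger metric values indicate a worse drive, and the summation used in the hypothetical `composite_score` function are all simplifying assumptions made for the sketch.

```python
import math


def rank_sum_z(test, reference):
    """Wilcoxon rank-sum z-score of `test` samples against `reference` samples.

    Uses the normal approximation with no tie correction in the variance
    (a simplification assumed for this sketch).
    """
    n1, n2 = len(test), len(reference)
    pooled = sorted(list(test) + list(reference))

    # Assign each distinct value the average of the 1-based ranks it occupies.
    avg_rank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        avg_rank[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    w = sum(avg_rank[v] for v in test)  # rank sum of the test samples
    mean = n1 * (n1 + n2 + 1) / 2
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (w - mean) / std


def composite_score(drive, reference):
    """Sum the per-metric z-scores (assumed combination rule, for illustration).

    `drive` and `reference` map a metric name to a list of samples, with
    larger values assumed to indicate a worse drive.
    """
    return sum(rank_sum_z(drive[name], reference[name]) for name in drive)
```

A drive whose samples rank consistently above those of a reference population of good drives yields a large positive score, whereas a drive that is indistinguishable from the reference population yields a score near zero.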
Although the above references suggest different techniques for improving the drive failure prediction algorithm by evaluating the quality metrics in context rather than independently, the references do not adequately address the need to first identify the quality metrics that are the best indicators of impending failure. For example, the SMART ATA specification provides for up to 30 different quality metrics, but only a small subset of these quality metrics may be good failure predictors, and the subset of good failure predictors may change with each new generation of disk drives. Identifying the subset of quality metrics that are indicative of impending failure improves the performance and simplifies the implementation of the drive failure prediction algorithm.
There is, therefore, a need for a reliable method of identifying a subset of quality metrics for input into a drive failure prediction algorithm.