Personal computers (PCs) have become increasingly powerful in recent years and are used for a variety of applications in industry, business and education. Such varied uses result in different requirements for the various subsystems that form the PC. As applications become more complex, the storage requirements for PCs increase. Thus, it is now common for PCs to include hard disks having a storage capacity of 60 gigabytes or more, and capacities continue to increase. Recently, disk drives have also been employed in a variety of devices other than PCs, including consumer electronic devices, medical devices, industrial devices, scientific devices and military devices. These devices typically employ miniature disks having a form factor of about 2.5 cm (1 inch).
Information is stored on disks in a plurality of concentric circular tracks by an array of transducers, or heads (usually one per disk surface), mounted for movement on an electronically controlled actuator mechanism. The storing of information on the disks is also referred to as “writing”, and the subsequent retrieval of information from the disks is referred to as “reading”.
Over time, hard disks tend to develop a number of defects. Some defects are attributable to user-manageable causes such as radiation, temperature, moisture, pressure, impact and vibration. Other defects are attributable to mechanical failure of one or more components of the disk drive assembly, such as the spindle, the arm and other mechanical components.
Currently, there are computer programs for testing computer peripheral storage media, particularly rotating magnetic storage media, to determine whether there are areas that are bad or marginal with respect to storing data with integrity. Many of these programs accomplish the task by repeatedly writing and reading areas of a storage medium to determine the reliability of those areas. If an area does not meet some selected threshold of reliability, the area is marked bad and its data is relocated if possible. These programs are designed to test the disk drive prior to sale of the disk drive and/or prior to incorporating the disk drive into the computer system. They tend to be customized for a particular make and model of disk and are not typically applicable to other disks.
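The write-and-read-back testing described above can be illustrated with a minimal Python sketch. It is not taken from any particular test program: the function `scan_blocks`, its parameters, and the in-memory `SimulatedMedium` class are hypothetical names introduced only for illustration, and a real tester would address the actual storage device rather than a list in memory.

```python
import random


def scan_blocks(write_block, read_block, block_count, block_size=512,
                passes=4, min_success_ratio=1.0, seed=0):
    """Return indices of blocks whose read-back success ratio is below the threshold.

    write_block(index, data) and read_block(index) are caller-supplied
    callables (hypothetical interface) that access the medium under test.
    """
    rng = random.Random(seed)
    bad = []
    for index in range(block_count):
        successes = 0
        for _ in range(passes):
            # Write a fresh pseudo-random pattern, then read it back and compare.
            pattern = bytes(rng.getrandbits(8) for _ in range(block_size))
            write_block(index, pattern)
            if read_block(index) == pattern:
                successes += 1
        if successes / passes < min_success_ratio:
            bad.append(index)  # area fails the selected reliability threshold
    return bad


class SimulatedMedium:
    """Hypothetical in-memory stand-in for a disk; blocks in `faulty` corrupt data."""

    def __init__(self, block_count, faulty=()):
        self.blocks = [b""] * block_count
        self.faulty = set(faulty)

    def write(self, index, data):
        if index in self.faulty:
            # Flip every bit of the first byte to simulate a media defect.
            data = bytes([data[0] ^ 0xFF]) + data[1:]
        self.blocks[index] = data

    def read(self, index):
        return self.blocks[index]
```

For example, `scan_blocks(medium.write, medium.read, 8)` on a `SimulatedMedium(8, faulty={2, 5})` reports blocks 2 and 5 as bad. Relocation of data from the bad areas is omitted from the sketch.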
U.S. Pat. No. 5,422,890 discloses a system and method that captures and characterizes error information during disk tests. The system is capable of dynamically determining whether the disk under test has exceeded acceptable error rates based on an actual number of bytes read. The system saves error log information, including specific sector addresses, error rates, error types and data patterns. This system is sometimes referred to as a software-only monitor.
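As a rough illustration of judging an error rate against the actual number of bytes read, the following Python sketch accumulates a byte count and an error log and compares their ratio with a limit. It is not the method of the cited patent: the class names, fields, and the errors-per-byte threshold are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ErrorRecord:
    sector: int       # sector address where the error occurred
    error_type: str   # e.g. "crc" or "seek" (labels are illustrative)
    pattern: bytes    # data pattern in use when the error occurred


@dataclass
class ErrorMonitor:
    max_errors_per_byte: float          # assumed acceptance threshold
    bytes_read: int = 0
    log: list = field(default_factory=list)

    def record_read(self, byte_count):
        """Count bytes actually read during the test."""
        self.bytes_read += byte_count

    def record_error(self, sector, error_type, pattern):
        """Save error details for later characterization."""
        self.log.append(ErrorRecord(sector, error_type, pattern))

    def error_rate(self):
        """Errors per byte read so far (0.0 before any read)."""
        return len(self.log) / self.bytes_read if self.bytes_read else 0.0

    def exceeded(self):
        """True once the observed rate is above the acceptable rate."""
        return self.error_rate() > self.max_errors_per_byte
```

Because the rate is recomputed from the running byte count, the check can be made at any point during a test rather than only at its end.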
Other software-only monitors are known. However, they are limited to monitoring timing signals between a host microprocessor and the drive controller. These signals are predominantly sensitive to variations in disk rotation speed which, because rotation speed is tightly regulated, do not furnish any practical early warning of trouble. When the disk spindle has serious bearing wear or a lack of lubricant, the drive controller increases power to overcome the resultant mechanical grinding. As a result, disk failure is hastened in a manner that is not readily detectable.
In the manufacture of disk drives, it is not unusual for tens of thousands of disk drive units to be fabricated daily. With such high numbers of disk drives being made, a certain number of units will fail to meet the design specifications due to faulty components, improper assembly, contamination, and other factors familiar to those of skill in the art. While every effort is made by disk drive manufacturers to minimize defective units and assembly errors, a small percentage of defective units will inevitably be built. When a defect is introduced into a unit at an early stage in the manufacturing process, the fault may not be detected until a much later stage of the process. Such a delay in the detection of defective assemblies can result in significant labor costs when multiplied across the large numbers of units being manufactured.
U.S. Pat. No. 5,557,183 discloses a method and apparatus for predicting failure of a disk drive based upon electrical power consumption. This system is capable of determining when a disk drive may fail and entrap the stored data. Like other patents that detect dynamic anomalies as opposed to media failures, it requires new hardware and embedded code added to the disk drive during the manufacturing process (at the factory).
Another example of the “factory-installed” approach to disk drive failure prediction is S.M.A.R.T. (Self-Monitoring Analysis and Reporting Technology). S.M.A.R.T. is a technology, implemented in microcode, that is designed to enable a hard drive to predict impending catastrophic failure. It has become a standard covering sensing and reporting of hard drive dynamic performance. It is a combination of Compaq's Intellisafe and IBM's Predictive Failure Analysis (PFA). One of the drawbacks of S.M.A.R.T. is that special, customized hardware is needed to allow users to employ it effectively.
Declining disk drive costs reduce the need for sophisticated evidence before making a disk drive replacement decision. When S.M.A.R.T. was originally conceived, disk drive storage was relatively expensive and a decision to replace a suspect disk drive required detailed evidence of potential failure. The cost of disk drive storage has dramatically fallen since the development of S.M.A.R.T. and continues to decline steadily.
IBM has also received several patents relating to prediction of drive failure. U.S. Pat. No. 5,410,439 describes a device that generates predictions of drive failure based on head/disk clearance or flying height.
U.S. Pat. No. 5,539,592 describes a device that measures torque at the actuator motor or change in speed of the spindle motor. Those measurements are compared to historical data taken from a healthy drive to predict failure.
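The general idea of comparing current measurements with historical data from a healthy drive can be sketched as follows. This is only an assumed, simplified comparison (a mean shift measured in baseline standard deviations); the function name, the `max_sigma` limit, and the sample values are hypothetical and are not drawn from the cited patent.

```python
from statistics import mean, stdev


def deviates_from_baseline(baseline, recent, max_sigma=3.0):
    """Return True if the recent mean departs from the healthy baseline.

    baseline: measurements (e.g. torque or speed samples) from a healthy drive;
              at least two samples are needed to compute a standard deviation.
    recent:   measurements from the drive being monitored.
    max_sigma: assumed limit, in baseline standard deviations.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly constant baseline: any shift counts as a deviation.
        return mean(recent) != mu
    return abs(mean(recent) - mu) / sigma > max_sigma
```

With a baseline of `[1.00, 1.02, 0.98, 1.01, 0.99]`, recent samples of `[1.01, 1.00]` stay within the limit, while `[1.20, 1.25]` are flagged.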
U.S. Pat. No. 5,612,845 describes a device for predicting spindle motor failure. This device uses “readback” signals to detect the existence and magnitude of spindle motor bearing assembly degradation. The patent defines a “readback” signal as a signal generated by magnetic transitions developed on the disk as the read element passes over the disk tracks.
The foregoing known methods of predicting disk drive failure using factory-installed components are disadvantageous for a number of reasons. One problem is the high cost of operation. The drive assemblies require additional hardware, which necessarily increases drive costs at a time when the drive industry is suffering strong price erosion due to vigorous competition. Another problem is that the factory-installed approach has limited application. Drives already shipped cannot be tested without a return trip to the factory. Thus, absent an industry-wide agreement, competing drives cannot be monitored against each other. Still another problem is that there is an increased risk of error due to the possibility of failure of the additional hardware. A further problem is that the factory-installed systems are difficult to maintain because, when there is a sensor or other hardware problem, the drive must be sent back to the factory. Yet another problem is that smaller drives do not have room for additional hardware, e.g., sensors. In addition, smaller drives cannot dissipate the heat created by additional hardware. Still a further problem is that some systems require specially formatted, dedicated test tracks in order for testing to be performed.
Accordingly, there is a need for a generic disk failure prediction system that overcomes the above-mentioned problems and provides a reliable indication of the state of the disk and alerts appropriate personnel when the disk becomes faulty.