This application relates generally to critical event logging techniques for disc drives, and more particularly to logging critical events, namely disc drive operational events, errors, and other information, that are useful for performing disc drive failure analysis in real time without host computer intervention.
Disc drives are data storage devices that store digital data in magnetic form on a rotating storage medium called a disc. Modern disc drives comprise one or more rigid discs that are coated with a magnetizable medium and mounted on the hub of a spindle motor for rotation at a constant high speed. Each surface of a disc is divided into several thousand tracks that are tightly packed concentric circles similar in layout to the annual growth rings of a tree. The tracks are typically numbered starting from zero at the outermost track on the disc and increasing for tracks located closer to the center of the disc. Each track is further broken down into sectors and servo bursts. A sector is normally the smallest individually addressable unit of information stored in a disc drive and typically holds 512 bytes of information plus a few additional bytes for internal drive control and error detection and correction. This organization of data allows for easy access to any part of the discs. A servo burst is a particular magnetic signature on a track, which facilitates positioning of heads over tracks.
Generally, each of the multiple discs in a disc drive has associated with it two heads (one adjacent the top surface of the disc, and another adjacent the bottom) for reading data from and writing data to a sector. A typical disc drive has two or three discs. This usually means there are four or six heads in a disc drive, carried by a set of actuator arms. Data is accessed by moving the heads from the inner to the outer part of the disc (and vice versa), driven by an actuator assembly. The heads that access sectors on discs are locked together on the actuator assembly. For this reason, all the heads move in and out together and are always physically located at the same track number (e.g., it is impossible to have one head at track 0 and another at track 500). Because all the heads move together, the set of tracks at the same track number on all disc surfaces is known as a cylinder, since these tracks are equal-sized circles stacked one on top of the other in space. So, for example, if a disc drive has four discs, it would normally have eight heads, and cylinder number 680 would be made up of a set of eight tracks, one per disc surface, at track number 680. Thus, for most purposes, there is not much difference between tracks and cylinders, since a cylinder is basically the set of all tracks at which the heads are currently located.
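The relationship between heads, tracks, and cylinders described above can be illustrated with a minimal sketch. The function name and the convention of numbering heads from zero are assumptions made for illustration only; they are not taken from any drive specification.

```python
def tracks_in_cylinder(cylinder, num_discs, heads_per_disc=2):
    """Return the (head, track) pairs that make up one cylinder.

    Assumes two heads per disc (one per surface) and heads numbered
    from 0; both are illustrative conventions, not drive requirements.
    """
    num_heads = num_discs * heads_per_disc
    # Every head sits at the same track number, so a cylinder is simply
    # that track number repeated once per head (disc surface).
    return [(head, cylinder) for head in range(num_heads)]


# The four-disc example from the text: cylinder 680 spans eight tracks.
tracks = tracks_in_cylinder(680, num_discs=4)
print(len(tracks))
```

For the four-disc drive in the example, the list holds eight entries, one per disc surface, all at track number 680.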
As with any data storage and retrieval, data integrity is critical. Oftentimes, for various reasons such as defective media, improper head positioning, extraneous particles between the head and media, or marginally functioning components, disc drives may record or read data incorrectly to or from the disc. For reasons such as predicting imminent disc drive failure, disc drive testing, and evolutionary disc drive improvement, it is valuable to characterize a disc drive's operating parameters; it is particularly useful to characterize unsuccessful reads and writes.
Disc drives will inevitably fail at the end of a long period of normal operations. As a result, the associated PC system will be down while the disc drive is replaced. Additionally, the disc drive failure may cause the loss of some or all of the data stored in the disc drive. While much of the data stored in the failed disc drive may be recoverable, the recovery of such data may be both costly and time consuming.
Disc drives may fail suddenly and unpredictably during normal operation, or may fail due to gradual decay of disc drive components after a long period of normal operation. To address failures of the latter kind, the industry-recognized Self-Monitoring, Analysis and Reporting Technology (SMART) feature was developed. SMART is an effective tool for predicting disc drive failure due to gradual decay of the disc drive components. SMART is essentially a self-contained disc drive monitoring system that measures, records, and analyzes various operating metrics of a disc drive. Most of the SMART feature resides in the disc drive firmware. To access data collected by SMART, the host issues a command set defined by the disc drive interface standard, such as the Advanced Technology Attachment (ATA) interface standard, which is also known as the Integrated Device Electronics (IDE) interface.
The host computer, however, does not perform much interpretation of the data collected by SMART. That is, the host may perform simple operations, such as retrieving SMART data and making simple comparisons, but almost all of the intelligence that maintains and updates the SMART feature is in the disc drive firmware and the controller themselves.
SMART was initially designed and developed primarily for predicting disc drive failures. As a result, the data collected by SMART was inadequate for conducting a successful disc drive failure analysis. SMART was focused on predicting disc drive failures and collecting relevant information prior to a disc drive failure. More specifically, the data collected by SMART did not contain enough of the detail needed for conducting a successful failure analysis, and was inadequate for analyzing the root cause of the failure of an already failed disc drive. That is, SMART data did not provide a complete history of important disc drive operational events while the disc drive was in normal operation with the host computer. With an understanding of the operational history of the failed disc drive, failure analysis can be performed more quickly and efficiently.
Accordingly, there is a need for techniques that allow a disc drive to log critical events that are useful for conducting a failure analysis of the disc drive. The critical events are disc drive operational events, errors, and other information of interest that can show the operational history of the disc drive prior to the failure.
Against this backdrop, an embodiment of the present invention has been developed. An embodiment of the invention described herein monitors and logs critical events to a critical event log stored in a critical event log storage area on a disc in a disc drive. The disc drive, which has a data storage disc, is operably connectable to a host computer. Data communication between the host computer and the disc drive is established via a disc drive interface. The disc drive interface may be an ATA disc drive interface. A portion of the data storage disc is a critical event log storage area. A power-on operational status of the disc drive with the host computer is determined. Then a critical event is determined without host computer intervention. The critical event is predefined information related to disc drive operation. The list of the critical events and the programming for the critical event logging are stored in the firmware of the disc drive. The determined critical event occurrence is stored in the critical event log storage area on the disc. The critical event can be monitored and logged during either an on-line data collection mode or an off-line data collection mode. During the off-line data collection mode, the firmware performs an off-line scan of the disc drive in the background. The determined critical event occurrence is logged to the critical event log by reading the critical event log from the critical event log storage area on the disc; appending the determined critical event to the critical event log; and storing the appended critical event log to the critical event log storage area on the disc. These and various other features as well as advantages which characterize the present invention will be apparent from a reading of the following detailed description and a review of the associated drawings.
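The read, append, and store sequence summarized above can be sketched as follows. This is only an illustrative model under stated assumptions, not the firmware implementation: the reserved storage area is simulated by an in-memory byte string, the log is serialized as JSON for readability, and the names `SimulatedDisc`, `log_critical_event`, and the event codes shown are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class SimulatedDisc:
    """Hypothetical stand-in for the critical event log storage area."""
    log_area: bytes = b"[]"  # serialized log; starts out empty


def read_log(disc):
    """Step 1: read the critical event log from the storage area."""
    return json.loads(disc.log_area.decode("utf-8"))


def store_log(disc, log):
    """Step 3: store the updated log back to the storage area."""
    disc.log_area = json.dumps(log).encode("utf-8")


def log_critical_event(disc, code, detail):
    """Read the log, append one event, and store the appended log.

    `code` and `detail` are illustrative fields; the actual entry
    layout is not specified in this section.
    """
    log = read_log(disc)
    log.append({"code": code, "detail": detail})  # Step 2: append
    store_log(disc, log)
    return len(log)  # number of entries now in the log


disc = SimulatedDisc()
log_critical_event(disc, "UNRECOVERED_READ_ERROR", "cylinder 680, head 3")
log_critical_event(disc, "SEEK_ERROR", "target cylinder 12")
print(read_log(disc))
```

Because every call rereads the stored log before appending, earlier entries are preserved in order and the log survives across calls without any host involvement, which mirrors the sequence described in the summary.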