1. Field of the Invention
The present invention relates, in general, to the field of computers and computer storage devices. More particularly, the present invention relates to notifying application programs in a computing system of changes in the state of storage devices or metadevices of the computing system.
2. Description of Prior Art
Conventional computing systems used for data processing and storage of large amounts of data typically utilize numerous physical devices, such as disk drives, for storage of information. To reduce the complexity of access to the storage devices, the physical storage devices are often arranged into metadevices or logical devices.
Physical disk drives can fail in numerous ways, for instance through a disk controller failure, a disk head failure, a disk platter failure, or a failure in the cable connecting the disk drive to the computing system. While some disk drive failures can be recovered from, other failures require that the storage device be removed from the computing system and repaired.
To improve the integrity of data storage in these computing systems during a disk failure, a variety of data replication techniques have been developed. RAID (redundant array of inexpensive disks) disk arrays, including disk mirrors (RAID-1), disk stripes (RAID-0), and RAID-5 arrays, as well as disk sets, concatenated devices, and spare disks, can all be used to enhance the reliability of information storage and retrieval in a computing system. For example, a simple disk mirror comprises two disks, each disk having the same data stored therein. If one of the disks in the mirror fails, the other disk is used to satisfy a read or write request.
Conventional data replication methods generally mask the failure of any single physical disk from the computing system because data errors resulting from disk failures are automatically corrected before any erroneous data is passed to the application programs. Because conventional replication techniques automatically correct and mask any physical disk failure from the computing system, user-level applications running on the computing system are generally unaware that a physical disk of the computing system has failed.
Importantly, as the number of storage device errors in a computing system increases, it becomes more likely that the computing system will subsequently suffer a catastrophic storage device error resulting in loss of data. Although conventional replication techniques can detect and correct a single disk failure, a subsequent second disk failure generally results in a catastrophic error condition wherein data stored in the replicated storage device is lost and unrecoverable. For instance, a simple disk mirror comprised of two physical disks can withstand a failure in a single disk, but a failure in both disks results in lost data. Likewise, in a RAID-5 disk array, a failure of more than one disk results in lost data.
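By way of illustration only, the masking behavior described above can be sketched as a small simulation. The sketch assumes a hypothetical in-memory model: the names Disk, Mirror, and DiskFailure are invented for this example and do not correspond to any particular volume manager or driver interface.

```python
class DiskFailure(Exception):
    """Raised by a simulated disk that has failed (hypothetical name)."""


class Disk:
    """In-memory stand-in for one physical disk (assumption for this sketch)."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.failed = False

    def write(self, block, data):
        if self.failed:
            raise DiskFailure(self.name)
        self.blocks[block] = data

    def read(self, block):
        if self.failed:
            raise DiskFailure(self.name)
        return self.blocks[block]


class Mirror:
    """Two-disk mirror: each write goes to every disk that still works."""

    def __init__(self, first, second):
        self.disks = [first, second]

    def write(self, block, data):
        written = 0
        for disk in self.disks:
            try:
                disk.write(block, data)
                written += 1
            except DiskFailure:
                pass  # masked from the caller while one copy succeeds
        if written == 0:
            raise DiskFailure("all disks in the mirror have failed")

    def read(self, block):
        for disk in self.disks:
            try:
                return disk.read(block)
            except DiskFailure:
                continue  # fall back to the other disk without reporting
        raise DiskFailure("all disks in the mirror have failed")


if __name__ == "__main__":
    a, b = Disk("a"), Disk("b")
    mirror = Mirror(a, b)
    mirror.write(0, b"payload")

    a.failed = True                 # first failure: masked from the caller
    print(mirror.read(0))

    b.failed = True                 # second failure: data is unrecoverable
    try:
        mirror.read(0)
    except DiskFailure as err:
        print("catastrophic:", err)
```

In this sketch the caller of Mirror.read receives no indication of the first failure, so nothing prompts a repair before the second failure occurs.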
It is therefore beneficial for the computing system to provide information in real time regarding the condition of the storage devices. While conventional computing systems, such as file servers, may provide a limited amount of disk information to a console connected to the file server, these messages are often directed solely to this console. The console is generally located in a machine room housing the file server and is rarely monitored in real time by a user or a system administrator.
Furthermore, notification of device errors in conventional computing systems is generally limited to device failure information.