The present invention concerns monitoring resources on a computer and pertains particularly to monitoring disk storage devices.
In computing systems it is often desirable to monitor the operation and status of various resources within the computing system. Once status changes or particular events are detected, they may be sent to an application or management platform for further processing and/or for presentation to a user.
For example, status changes or particular events detected by a resource monitor may be received by a resource monitoring management program such as the OpenView resource monitoring management program available from Hewlett-Packard Company, having a business address of 3000 Hanover Street, Palo Alto, Calif. 94304, or the MC/ServiceGuard resource monitoring management program, also available from Hewlett-Packard Company. Other existing products include the High Availability Clustered Multi-Processing (HACMP) control application available from IBM Company.
Information about disk and logical volume status is important in a high availability environment. However, the information available in existing systems is limited. For example, the Logical Volume Manager (LVM) product, included with the HP-UX operating system available from Hewlett-Packard Company, manages disk storage devices (disks) in logical volumes. A logical volume is a collection of pieces of disk space from one or more disks. Each collection is put together so that the collection appears to the operating system as a single disk.
Like disks, logical volumes can be used to hold file systems, raw data areas, dump areas, or swap areas. Unlike disks, the size of a logical volume can be chosen when the logical volume is created and a logical volume can later be expanded or reduced. Also, logical volumes can be spread across multiple disks.
A logical volume can exist on only one disk or can reside on portions of many disks. The disk space within a logical volume can be used for swap, dump, or raw data, or a file system can be created thereon.
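The way a logical volume presents pieces of several disks as a single disk can be illustrated with a minimal sketch. The names `Extent`, `LogicalVolume`, `locate`, and `extend`, as well as the disk names, are hypothetical and chosen only for illustration; this is not the on-disk layout or interface of the LVM product.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Extent:
    """A contiguous piece of disk space (hypothetical representation)."""
    disk: str    # name of the physical disk holding this piece
    start: int   # first block of the piece on that disk
    length: int  # number of blocks in the piece


class LogicalVolume:
    """A collection of extents addressed as one contiguous range of blocks."""

    def __init__(self, extents: List[Extent]) -> None:
        self.extents = list(extents)

    @property
    def size(self) -> int:
        # Total number of blocks across all pieces.
        return sum(e.length for e in self.extents)

    def extend(self, extent: Extent) -> None:
        # Expanding the volume is modeled as appending another piece.
        self.extents.append(extent)

    def locate(self, block: int) -> Tuple[str, int]:
        """Map a logical block number to (disk, physical block)."""
        if block < 0 or block >= self.size:
            raise IndexError(f"block {block} is outside the logical volume")
        offset = block
        for e in self.extents:
            if offset < e.length:
                return e.disk, e.start + offset
            offset -= e.length
        raise AssertionError("unreachable: bounds were checked above")


# Usage: a volume built from pieces of two disks, later expanded onto a third.
lv = LogicalVolume([Extent("disk0", 100, 50), Extent("disk1", 0, 30)])
first_on_disk1 = lv.locate(50)   # first block past the disk0 piece
lv.extend(Extent("disk2", 10, 20))
```

In this sketch the caller addresses blocks 0 through `size - 1` without knowing which disk holds them, which corresponds to the collection appearing as a single disk.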
The LVM program and disk drivers for individual disks write messages to a system log, providing some information that is ultimately available to a user. For example, the HP OpenView IT/Operations program available from Hewlett-Packard Company can be configured to monitor messages from the system log and make this information available to a user. However, much desirable information is not currently presented to a user in a useful form.
For example, within a logical volume, data can be mirrored on two or more individual disks. This means that an exact copy of the data is stored in two or more separate places. This allows the data to be obtained from a back-up location in the case of an error which prevents the data from being obtained from a primary location. This is done transparently to a user. For example, this functionality is implemented by the MirrorDisk/UX product available from Hewlett-Packard Company. Nevertheless, it may be of interest to some users to know how many copies of particular data are currently available and to know which particular physical devices are currently available.
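The mirrored read described above, together with the status information a user might want, can be sketched as follows. The class `MirrorSet`, its methods, and the reader callables are assumptions made for illustration only; they do not represent the MirrorDisk/UX product or its interfaces.

```python
from typing import Callable, Dict, List, Set


class MirrorSet:
    """Hypothetical model of data mirrored on two or more devices."""

    def __init__(self, copies: Dict[str, Callable[[int], bytes]]) -> None:
        # Maps a device name to a function that reads one block from it.
        # Insertion order is kept, so the first entry acts as the primary.
        self.copies = dict(copies)
        self.failed: Set[str] = set()

    def read(self, block: int) -> bytes:
        """Return the block from the first copy that can still be read."""
        for name, reader in self.copies.items():
            if name in self.failed:
                continue
            try:
                return reader(block)
            except OSError:
                # Remember the failure and fall back to the next copy.
                self.failed.add(name)
        raise OSError("no mirror copy available")

    def available(self) -> List[str]:
        """List the devices that currently hold a readable copy."""
        return [name for name in self.copies if name not in self.failed]


def broken_primary(block: int) -> bytes:
    # Stand-in for a device that returns an error on every read.
    raise OSError("primary device is down")


def healthy_backup(block: int) -> bytes:
    # Stand-in for a device that serves the mirrored data.
    return b"data"


mirror = MirrorSet({"primary": broken_primary, "backup": healthy_backup})
payload = mirror.read(0)          # served from the back-up copy
remaining = mirror.available()    # devices still holding a readable copy
```

The caller of `read` never sees the primary failure, which mirrors the transparency noted above; `available` exposes the information that would otherwise stay hidden, namely how many copies remain and on which devices.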
Because applications cannot detect whether a particular disk is down or a link to the disk is down, high availability configurations require multiple links to the disk even when mirroring is handled by arrays. These multiple links, however, consume I/O slots and limit the capacity of the computing system.
Within the HP-UX operating system there is currently an enhancement which allows applications operating within the HP-UX environment to set an input/output (I/O) timeout. Once an application has set the timeout, all I/O calls return with an error if the timeout is exceeded. Without this enhancement, an I/O call will hang indefinitely if the disk is down. However, this enhancement requires each application to be modified.
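The burden that an application-level timeout places on each application can be seen in a minimal sketch. The helper `call_with_timeout` and the stand-in `hung_disk_read` are hypothetical; the sketch uses a generic worker thread and is not the HP-UX enhancement itself, whose interface is not described here.

```python
import queue
import threading
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_timeout(io_call: Callable[[], T], timeout_s: float) -> T:
    """Run io_call in a worker thread; raise TimeoutError if it takes too long."""
    results: queue.Queue = queue.Queue(maxsize=1)

    def worker() -> None:
        try:
            results.put((True, io_call()))
        except Exception as exc:  # pass I/O errors back to the caller
            results.put((False, exc))

    # A daemon thread does not keep the process alive if the call never returns.
    threading.Thread(target=worker, daemon=True).start()
    try:
        ok, value = results.get(timeout=timeout_s)
    except queue.Empty:
        raise TimeoutError(f"I/O call exceeded {timeout_s} s") from None
    if ok:
        return value
    raise value


def hung_disk_read() -> bytes:
    # Stand-in for a read against a disk that is down (assumption:
    # a long sleep simulates the hang).
    time.sleep(0.5)
    return b""


try:
    call_with_timeout(hung_disk_read, timeout_s=0.05)
    timed_out = False
except TimeoutError:
    timed_out = True
```

Every I/O call in the application has to be routed through a wrapper of this kind, which illustrates why an approach that depends on modifying each application is burdensome.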