1. Field of the Invention
The present invention relates to a method, system, and program for monitoring system components.
2. Description of the Related Art
Prior art devices provide a monitoring program to monitor the operation of a system. For instance, the Sun Microsystems, Inc. ("SUN") StorEdge Enclosure Manager provides management and monitoring of a SUN A5x00 storage subsystem. The StorEdge Enclosure Manager provides alarm notification and remote reporting (via email, files, and system logging) upon detection of abnormal activities or conditions within a designated storage enclosure. An alarm provides a notification that signifies that a problem may need to be resolved depending on a detected severity. The StorEdge Enclosure Manager monitors system status information at intervals as part of a "polling" operation. In monitoring specific hardware components, a set of "rules" is provided that defines the conditions under which a notification or alarm is issued. The alarm or notification may indicate that the status is "ok", critical in that one or more critical conditions have been detected, unrecoverable in that one or more unrecoverable conditions have occurred, or unknown.
In the StorEdge Enclosure Manager, a file monitoring class lexically analyzes strings of messages written to an administrative file to which system status information is written. If there is a match between state information in the administrative file and a rule, then the Enclosure Manager may write data to a log file and/or generate an alarm. Some of the system components and resources that may be monitored include the disks, a Gigabit Interface Converter (GBIC) module that converts electrical signals to optical signals, the power supply, system temperature, fan status, loop status of the connection between host and storage system, backplane status, etc. With the prior art Enclosure Manager, the user may specify an e-mail or pager address for remote reporting of alarms, the time interval for polling of resources, etc. The SUN Component Manager provides similar monitoring services for a storage subsystem, and is described in the SUN publication "Sun StorEdge Component Manager 2.0 User's Guide" (Copyright SUN, January 2000).
The rule systems of prior art system monitoring tools, such as those discussed above, have rules that specify a particular action when a threshold value is reached. Such systems may generate excessive notifications if system resource values are thrashing, i.e., constantly changing and thereby constantly triggering alarms each time the value crosses the threshold. For instance, the temperature of one or more system components may be monitored and an alarm generated when different threshold temperature values are reached. With such systems, alarm notifications may be continually generated if the temperature continues to fluctuate across the threshold values that trigger the alarm.
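The thrashing problem described above can be illustrated with a minimal sketch. The function name `naive_threshold_alarms`, the threshold of 70, and the sample readings are all hypothetical and are not taken from any prior art product; the sketch only assumes a rule that raises one alarm each time a reading rises to or above a single threshold.

```python
def naive_threshold_alarms(readings, threshold):
    """Return (index, value) for every upward crossing of the threshold.

    Hypothetical stand-in for a single-threshold rule; not the logic of
    any particular monitoring product.
    """
    alarms = []
    previous_above = False
    for index, value in enumerate(readings):
        above = value >= threshold
        # An alarm is raised each time the reading rises to the threshold.
        if above and not previous_above:
            alarms.append((index, value))
        previous_above = above
    return alarms


# A temperature hovering around the threshold (illustrative values only).
readings = [68, 71, 69, 72, 69, 70, 68, 71]
alarms = naive_threshold_alarms(readings, threshold=70)
print(len(alarms), "alarms for", len(readings), "readings")
```

In this sketch, eight readings that differ by only a few degrees produce four separate alarms, which is the kind of excessive notification discussed above.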
For the above reasons, it would be desirable to provide a monitoring system that can provide a greater degree of flexibility in monitoring system states to avoid situations where alarms may be excessively generated as measured system parameters continuously fluctuate.
Provided is a method, system, program, and data structure for deriving state information concerning a monitored system component. A status object is provided including information on a current state of the monitored system component. There are a plurality of states associated with the monitored system component, wherein each state is capable of having a state action and at least one transition condition associated with a transition state. A measured system parameter is received and a determination is made as to whether the received measured system parameter satisfies one transition condition associated with the current state indicated in the status object. If the received system parameter satisfies one transition condition, then the state action associated with the transition state associated with the satisfied transition condition is performed. The current state is set to the transition state in the status object.
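The flow just described may be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the names `State`, `StatusObject`, and `process`, the two states "normal" and "hot", and the temperature limits of 70 and 65 are all hypothetical. No transition back to the same state is defined in this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class State:
    name: str
    # Optional state action, performed on entry to this state.
    action: Optional[Callable[[float], None]] = None
    # Each entry pairs a transition condition with a transition state name.
    transitions: List[Tuple[Callable[[float], bool], str]] = field(default_factory=list)


@dataclass
class StatusObject:
    current_state: str


def process(status: StatusObject, states: Dict[str, State], value: float) -> bool:
    """Test the measured value against the current state's conditions."""
    current = states[status.current_state]
    for condition, target_name in current.transitions:
        if condition(value):
            target = states[target_name]
            if target.action is not None:
                target.action(value)            # perform the state action
            status.current_state = target_name  # record the new current state
            return True
    return False


log = []
states = {
    "normal": State("normal", lambda v: log.append(("normal", v)),
                    [(lambda v: v >= 70, "hot")]),
    "hot": State("hot", lambda v: log.append(("hot", v)),
                 [(lambda v: v < 65, "normal")]),
}
status = StatusObject(current_state="normal")
for reading in [68, 71, 69, 72, 69, 70, 68, 71, 64]:
    process(status, states, reading)
print(log)
```

Because the conditions tested depend on the current state, readings that fluctuate between 68 and 72 cause a single action on entering "hot", and a second action only when the reading falls below 65.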
In further implementations, if the transition state associated with the satisfied transition condition is the current state, then a counter is incremented.
Still further, if the transition state is the current state, then a determination is made as to whether a frequency event associated with the transition condition is satisfied. The state action associated with the current state is performed if the associated frequency event was satisfied.
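The counter and frequency event may be sketched as below. The text above does not define the form of a frequency event, so this sketch assumes, purely for illustration, that it is a predicate over the counter value (here, every third repeat) and that the counter is reset when the state changes. The names `Transition`, `Status`, and `step` and all numeric values are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Transition:
    condition: Callable[[float], bool]
    target: str
    # Assumed form of a frequency event: a predicate over the counter.
    frequency_event: Optional[Callable[[int], bool]] = None


@dataclass
class Status:
    current_state: str
    counter: int = 0


def step(status: Status,
         transitions: Dict[str, List[Transition]],
         actions: Dict[str, Callable[[float], None]],
         value: float) -> None:
    for t in transitions[status.current_state]:
        if not t.condition(value):
            continue
        if t.target == status.current_state:
            # Transition state is the current state: count the repeat.
            status.counter += 1
            if t.frequency_event is not None and t.frequency_event(status.counter):
                actions[status.current_state](value)
        else:
            actions[t.target](value)
            status.current_state = t.target
            status.counter = 0  # assumption: counter resets on a state change
        return


log = []
actions = {
    "normal": lambda v: log.append(("normal", v)),
    "hot": lambda v: log.append(("hot", v)),
}
transitions = {
    "normal": [Transition(lambda v: v >= 70, "hot")],
    "hot": [
        Transition(lambda v: v < 65, "normal"),
        # Remaining hot repeats the action only on every third reading.
        Transition(lambda v: v >= 70, "hot", lambda count: count % 3 == 0),
    ],
}
status = Status(current_state="normal")
for reading in [71, 72, 73, 74, 75, 76, 77, 60]:
    step(status, transitions, actions, reading)
print(log)
```

With these assumed values, seven consecutive hot readings produce three actions rather than seven: one on entering the state and one for every third repeat.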
Further provided is a method, system, program, and data structure for implementing a state machine to monitor a system component. A state class and status object class are provided. A status object is instantiated from the status object class, wherein the status object includes a current state variable indicating a current state of the state machine. Multiple states of the state machine are instantiated from the state machine class, wherein each state is capable of having a state action and notification performed when transitioning to the state from another state. At least one evaluation function is generated for each state, wherein each evaluation function determines whether an operation on a measured system component satisfies a condition. A transition state is associated with each evaluation function. The status object is updated to indicate the transition state as the current state if the associated evaluation function determines that the condition is satisfied.
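A class-based arrangement along these lines might look like the following sketch. The class and method names (`State`, `StatusObject`, `add_evaluation`, `update`), the fan-speed example, and its RPM limits are hypothetical and serve only to illustrate how evaluation functions can be paired with transition states.

```python
class State:
    """One state of the state machine, with optional action and notification."""

    def __init__(self, name, action=None, notification=None):
        self.name = name
        self.action = action
        self.notification = notification
        self.evaluations = []  # (evaluation function, transition state) pairs

    def add_evaluation(self, evaluate, transition_state):
        self.evaluations.append((evaluate, transition_state))


class StatusObject:
    """Holds the current state variable of the state machine."""

    def __init__(self, initial_state):
        self.current_state = initial_state

    def update(self, measurement):
        for evaluate, transition_state in self.current_state.evaluations:
            if evaluate(measurement):
                # Action and notification run when arriving from another state.
                if transition_state is not self.current_state:
                    if transition_state.action is not None:
                        transition_state.action(measurement)
                    if transition_state.notification is not None:
                        transition_state.notification(measurement)
                self.current_state = transition_state
                return transition_state
        return None


events = []


def recorder(kind, name):
    return lambda measurement: events.append((kind, name))


ok = State("ok", recorder("action", "ok"), recorder("notify", "ok"))
degraded = State("degraded", recorder("action", "degraded"), recorder("notify", "degraded"))
failed = State("failed", recorder("action", "failed"), recorder("notify", "failed"))

# Evaluation functions for a hypothetical fan, measured in RPM.
ok.add_evaluation(lambda rpm: rpm < 1000, failed)
ok.add_evaluation(lambda rpm: rpm < 2000, degraded)
degraded.add_evaluation(lambda rpm: rpm < 1000, failed)
degraded.add_evaluation(lambda rpm: rpm >= 2500, ok)
failed.add_evaluation(lambda rpm: rpm >= 2500, ok)

status = StatusObject(ok)
for rpm in [3000, 1900, 2100, 1950, 900, 2600]:
    status.update(rpm)
notified = [name for kind, name in events if kind == "notify"]
print(notified, status.current_state.name)
```

In this sketch the readings of 2100 and 1950 RPM produce no notification, because the degraded state has no evaluation function that they satisfy.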