The invention relates generally to network management, and more particularly to remote alarm threshold and status monitoring of network elements.
Network management systems are employed to monitor, interpret, and control the operations of a network. In a typical network management system, network devices (e.g., servers, gateways, hosts) are provided with agent software (an "agent") that monitors and accumulates operational data and detects exceptional events. A management station includes management software (a "manager") at the application level which requests operational data or receives event notifications from the agent using management protocols. The management station is further equipped to interpret the operational data and event information to effect control of the network operations.
Simple Network Management Protocol (SNMP) (J. Case et al., "A Simple Network Management Protocol", RFC 1157, May 1990) defines a standard protocol for the communication of management information. SNMP specifies the format and meaning of messages exchanged between managers and agents and the representation of names and values in those messages. A virtual information store, termed a Management Information Base (MIB) (K. McCloghrie and M. Rose, "Management Information Base for Network Management of TCP/IP-based Internets", RFC 1156, May 1990), defines the management data or objects by specifying the data variables a network device must keep, including the names of the data variables and the syntax used to express those names. For example, the MIB may specify data variables which track statistics on the status of network interfaces, incoming and outgoing datagrams, and the number of routing failures. The rules used to define and identify MIB data variables are provided by the Structure of Management Information (SMI) specification (M. Rose and K. McCloghrie, "Structure and Identification of Management Information for TCP/IP-based Internets", RFC 1155, May 1990). Each managed data variable or object has a name known as an object identifier which specifies an object type. The object type together with an object instance uniquely identifies a specific instantiation of the object. For convenience, a text string known as an object descriptor is used to refer to the object type.
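The naming scheme just described, in which an object type combined with an object instance identifies one instantiation, can be illustrated with a minimal sketch. The helper names (`parse_oid`, `instance_name`, `format_oid`) and the `DESCRIPTORS` table are hypothetical and chosen only for illustration. The two object types shown, sysDescr and ifOperStatus, are taken from the standard MIB.

```python
from typing import Dict, Tuple

# An object identifier is modeled as a tuple of non-negative integers.
Oid = Tuple[int, ...]


def parse_oid(text: str) -> Oid:
    """Convert dotted notation such as '1.3.6.1.2.1.1.1' into a tuple."""
    return tuple(int(part) for part in text.strip(".").split("."))


def format_oid(oid: Oid) -> str:
    """Convert a tuple back into dotted notation."""
    return ".".join(str(n) for n in oid)


def instance_name(object_type: Oid, instance: Oid) -> Oid:
    """Object type plus object instance names one specific instantiation."""
    return object_type + instance


# Object descriptors (text strings) mapped to object types (illustrative table).
DESCRIPTORS: Dict[str, Oid] = {
    "sysDescr": parse_oid("1.3.6.1.2.1.1.1"),
    "ifOperStatus": parse_oid("1.3.6.1.2.1.2.2.1.8"),
}

# A scalar object has the single instance 0; a table column is indexed,
# here by interface number 3.
sys_descr_instance = instance_name(DESCRIPTORS["sysDescr"], (0,))
if_status_instance = instance_name(DESCRIPTORS["ifOperStatus"], (3,))
```

In this sketch, `format_oid(sys_descr_instance)` yields `1.3.6.1.2.1.1.1.0`, the name a manager would place in a request for that variable.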
SNMP uses a fetch-store paradigm to effect operations between a manager and agent. Specifically, SNMP defines get-request, get-next-request, get-response, and set-request commands which provide the basic fetch and store operations. In addition, SNMP defines a trap command by which an agent asynchronously sends information to a manager triggered by an event. Thus, a management station requests operational data or receives event notifications by means of this simple set of SNMP commands.
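The fetch-store paradigm can be sketched with a toy in-memory agent. This is an illustration under stated assumptions, not an SNMP implementation: there is no message encoding, no community string, and no network transport. The class `ToyAgent`, its method names, and the `trap_sink` callback are hypothetical.

```python
import bisect
from typing import Callable, Dict, Tuple

Oid = Tuple[int, ...]


class ToyAgent:
    """In-memory stand-in for an agent (hypothetical, for illustration only)."""

    def __init__(self, trap_sink: Callable[[Oid, int], None]) -> None:
        self._store: Dict[Oid, int] = {}
        self._trap_sink = trap_sink  # stands in for the manager's trap receiver

    def set_request(self, oid: Oid, value: int) -> Tuple[Oid, int]:
        """Store operation; the returned pair plays the role of get-response."""
        self._store[oid] = value
        return oid, value

    def get_request(self, oid: Oid) -> Tuple[Oid, int]:
        """Fetch operation for an exactly named instance."""
        return oid, self._store[oid]

    def get_next_request(self, oid: Oid) -> Tuple[Oid, int]:
        """Fetch the first instance that follows `oid` in lexicographic order."""
        keys = sorted(self._store)  # tuple ordering matches OID ordering
        index = bisect.bisect_right(keys, oid)
        if index == len(keys):
            raise KeyError("no object follows the supplied identifier")
        return keys[index], self._store[keys[index]]

    def raise_event(self, oid: Oid) -> None:
        """Asynchronous notification: the agent, not the manager, initiates it."""
        self._trap_sink(oid, self._store[oid])


received_traps = []
agent = ToyAgent(lambda oid, value: received_traps.append((oid, value)))
agent.set_request((1, 3, 6, 1, 2, 1, 2, 2, 1, 8, 1), 1)
agent.set_request((1, 3, 6, 1, 2, 1, 2, 2, 1, 8, 2), 2)
first = agent.get_next_request((1, 3, 6, 1, 2, 1, 2, 2, 1, 8))
second = agent.get_next_request(first[0])
agent.raise_event(second[0])
```

The get-next operation is what allows a manager to walk a table without knowing its instance identifiers in advance, while the trap path is the only one in which the agent speaks first.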
A limitation of known MIB structures is that trap definitions are predefined in the MIB. For any particular MIB, traps are defined which trigger when specific conditions are met. Since the traps are predefined at the time the MIB is designed, a network management station typically must poll the network device for the values of any MIB variables not covered by a defined trap. Polling across the network is undesirable since it adds to network traffic. To provide for more extensive monitoring of events without having to poll, a MIB could be designed to include additional predefined trap definitions, one for each combination of variables. However, for large MIBs, the number of trap definitions needed makes this approach prohibitive. Further, since the traps are predefined, a user of the management station does not have the option of turning individual traps on and off.
An improvement on predefined MIBs is the Remote MONitor (RMON) MIB (S. Waldbusser, "Remote Network Monitoring Management Information Base", RFC 1757, February 1995) which provides a way to define traps. The RMON MIB includes an alarm group of objects which periodically takes statistical samples from data variables and compares them with preconfigured thresholds. An alarm table in the RMON MIB specifies configuration entries that each define a variable and associated threshold parameters. An event is generated and a trap is sent when it is determined that a sample has crossed a threshold value. Two thresholds are provided: a rising threshold and a falling threshold. The rising threshold is crossed if the value of the current sample is greater than or equal to the rising threshold. Likewise, a falling threshold is crossed if the current sample value is less than or equal to the falling threshold. To limit the generation of traps, the RMON MIB includes a hysteresis mechanism. According to the hysteresis mechanism, one trap is sent as a threshold is crossed in the appropriate direction. No additional traps are sent for that threshold until the opposite threshold is crossed.
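The threshold comparison and hysteresis mechanism described above can be sketched as follows. This is a simplified model written only from the description in this section: it omits the sampling interval, delta sampling, and startup-alarm behavior of the RMON alarm group, and it assumes the rising threshold is greater than the falling threshold. The class `AlarmEntry` and its fields are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AlarmEntry:
    """Simplified alarm-table entry (hypothetical names, illustration only)."""

    rising_threshold: int
    falling_threshold: int
    rising_armed: bool = True   # a rising trap may be sent
    falling_armed: bool = True  # a falling trap may be sent

    def sample(self, value: int) -> Optional[str]:
        """Compare one sample against both thresholds, applying hysteresis."""
        if value >= self.rising_threshold and self.rising_armed:
            # One trap on the crossing; re-arm only the opposite threshold.
            self.rising_armed = False
            self.falling_armed = True
            return "rising"
        if value <= self.falling_threshold and self.falling_armed:
            self.falling_armed = False
            self.rising_armed = True
            return "falling"
        return None  # no crossing, or suppressed by hysteresis


def traps_for(samples: List[int], entry: AlarmEntry) -> List[str]:
    """Return the sequence of traps generated by a run of samples."""
    return [event for event in map(entry.sample, samples) if event is not None]


events = traps_for([30, 60, 70, 55, 20, 15, 80], AlarmEntry(50, 25))
```

With these samples, the values 70 and 55 generate no further rising trap because the falling threshold has not yet been crossed, and 15 generates no second falling trap for the same reason in the other direction; `events` is `['rising', 'falling', 'rising']`.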
However, the RMON MIB is limited in the variables that it supports and the conditions that it can specify for generating a trap. While the RMON MIB supports thresholding on statistical values (e.g., the number of packets or collisions counted for a monitored interface), it is poorly suited to trapping on multistate or enumerated status variables. For example, a status variable for a device typically may have multiple states defined, such as "unknown", "running", "warning", "testing", and "down". Each state is represented in the MIB object syntax by a different integer value, e.g., unknown=1, running=2, warning=3, testing=4, and down=5. Since the RMON MIB only supports thresholding using "greater than or equal to" and "less than or equal to" comparisons, thresholding for a particular state of a multistate status variable is difficult and cumbersome.
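The difficulty can be made concrete with a small sketch using the example encoding given above (unknown=1 through down=5). The enumeration `DeviceStatus` and the two helper functions are hypothetical names introduced for illustration; they model only the two comparisons the RMON alarm group offers.

```python
from enum import IntEnum
from typing import List


class DeviceStatus(IntEnum):
    """Example state encoding from the text above."""

    UNKNOWN = 1
    RUNNING = 2
    WARNING = 3
    TESTING = 4
    DOWN = 5


def states_at_or_above(threshold: int) -> List[str]:
    """States that satisfy a rising comparison (value >= threshold)."""
    return [state.name for state in DeviceStatus if state >= threshold]


def states_at_or_below(threshold: int) -> List[str]:
    """States that satisfy a falling comparison (value <= threshold)."""
    return [state.name for state in DeviceStatus if state <= threshold]


# Attempting to alarm on the single state "warning":
rising_matches = states_at_or_above(DeviceStatus.WARNING)
falling_matches = states_at_or_below(DeviceStatus.WARNING)
```

Here `rising_matches` is `['WARNING', 'TESTING', 'DOWN']` and `falling_matches` is `['UNKNOWN', 'RUNNING', 'WARNING']`. Neither comparison alone isolates the "warning" state; only their conjunction does, and a single threshold comparison cannot express an equality test of that kind.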