The Simple Network Management Protocol (SNMP) has achieved widespread acceptance for managing computer-based devices and systems connected by a network (e.g., the Internet). SNMP is a network management standard that defines a strategy for managing TCP/IP and Internetwork Packet Exchange (IPX) networks. A conventional distributed SNMP architecture may include multiple managed nodes (e.g., a host or a virtual machine), each with an SNMP entity called an agent, which provides remote access to management instrumentation. A conventional SNMP architecture further includes at least one SNMP entity, referred to as a manager, which runs management applications to monitor and control the managed nodes/elements. Managed elements are devices such as hosts, routers, virtual machines, etc. They are monitored and controlled by accessing their management information. The management protocol, SNMP, conveys management information between the managers and agents. Management information refers to a collection of managed objects that reside in a virtual information store called a Management Information Base (MIB). Collections of related managed objects are defined in specific MIB modules. The MIB contains the information requested by the management system. The MIB for a networked computer may include, for example, information on the configuration and performance of the network interface card, the available hard drive space, the version of drivers and applications, and so on. Additional MIBs may be written and loaded to expose the data that is specified for collection, as long as the system itself supports the collection of the requested information.
Each SNMP element manages specific objects with each object having specific characteristics. A managed object is a characteristic of something that can be managed. For example, a list of currently active TCP circuits in a particular host computer is a managed object. Managed objects differ from variables, which are particular object instances. An object instance may be, for example, a single active TCP circuit in a particular host computer. Managed objects can be scalar (defining a single object instance) or tabular (defining multiple, related instances).
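The distinction between a managed object and its instances (variables) can be illustrated with a minimal sketch. It assumes object identifiers are modeled as Python tuples; the helper name instances_of and the sample values are hypothetical. The OIDs shown are those of sysUpTime (a scalar, whose single instance carries the suffix .0) and tcpConnState (a column of the TCP connection table, whose instances are indexed by local address, local port, remote address, and remote port).

```python
# Managed objects, identified by OID (modeled here as tuples of integers).
SYS_UPTIME = (1, 3, 6, 1, 2, 1, 1, 3)              # scalar object
TCP_CONN_STATE = (1, 3, 6, 1, 2, 1, 6, 13, 1, 1)   # tabular (columnar) object

# Variables are object instances: the object's OID plus an instance suffix.
# The values below are made up for illustration.
variables = {
    SYS_UPTIME + (0,): 123456,                                   # the one scalar instance
    TCP_CONN_STATE + (10, 0, 0, 5, 22, 10, 0, 0, 9, 51000): 5,   # one TCP circuit
    TCP_CONN_STATE + (10, 0, 0, 5, 80, 10, 0, 0, 7, 51001): 5,   # another TCP circuit
}


def instances_of(object_oid, store):
    """Return the variables whose OID starts with the given managed object's OID."""
    n = len(object_oid)
    return {oid: value for oid, value in store.items() if oid[:n] == object_oid}
```

Here the scalar object has exactly one instance, while the tabular object has one instance per active TCP circuit.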
SNMP employs five basic messages (GET, GET-NEXT, GET-RESPONSE, SET, and TRAP) to communicate between an SNMP manager and an SNMP agent. The GET and GET-NEXT messages permit the SNMP manager to request information for a specific variable from the SNMP agent. The SNMP agent, upon receiving a GET or GET-NEXT message, may issue a GET-RESPONSE message to the SNMP manager with either the information requested or an error indication as to why the request cannot be processed. A SET message permits the SNMP manager to request that a change be made to the value of a specific variable. The SNMP agent may then respond with a GET-RESPONSE message indicating the change has been made or an error indication as to why the change cannot be made.
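The request/response exchange described above can be sketched with an in-memory stand-in for an agent. This is an illustration only: the class name ToyAgent, the tuple-shaped response, and the status strings are hypothetical, and no real SNMP encoding or network transport is involved. One detail worth showing is that GET-NEXT returns the variable that follows the supplied OID in lexicographic order, which Python's tuple comparison happens to model.

```python
class ToyAgent:
    """In-memory stand-in for an SNMP agent; no network I/O and no real encoding."""

    def __init__(self, variables, writable=()):
        self.variables = dict(variables)   # OID tuple -> value
        self.writable = set(writable)      # OIDs that SET may change

    def handle(self, message, oid, value=None):
        """Return a GET-RESPONSE modeled as (status, oid, value)."""
        if message == "GET":
            if oid in self.variables:
                return ("ok", oid, self.variables[oid])
            return ("no-such-variable", oid, None)
        if message == "GET-NEXT":
            # Tuples compare lexicographically, matching OID ordering.
            later = sorted(o for o in self.variables if o > oid)
            if later:
                return ("ok", later[0], self.variables[later[0]])
            return ("no-such-variable", oid, None)
        if message == "SET":
            if oid not in self.variables:
                return ("no-such-variable", oid, None)
            if oid not in self.writable:
                return ("not-writable", oid, None)
            self.variables[oid] = value
            return ("ok", oid, value)
        return ("unsupported-message", oid, None)
```

A manager could walk every variable by issuing GET-NEXT repeatedly, feeding each returned OID into the next request until the agent reports that nothing follows.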
The SNMP TRAP message permits the SNMP agent to spontaneously inform the SNMP manager of an event. An SNMP trap is an unsolicited (asynchronous) message that an agent sends to an SNMP management system when it detects that a certain type of event has occurred locally on the managed host. Events may include alarms; a change of configuration, such as the addition of a new host; a user-defined threshold crossing, such as exceeding a specified amount of virtual memory usage in a virtual machine; and so on.
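A threshold-crossing event of the kind mentioned above might be detected on the agent side roughly as follows. This is a sketch under stated assumptions: the class ThresholdWatcher, the trap's dictionary layout, and the event name are hypothetical, and send_trap stands for whatever mechanism actually delivers the trap to the manager. The watcher is edge-triggered, so it reports only the transition from below the limit to above it rather than every sample that exceeds the limit.

```python
from dataclasses import dataclass


@dataclass
class ThresholdWatcher:
    """Emit one trap each time a monitored value rises above its limit."""
    name: str            # hypothetical label for the monitored quantity
    limit: float
    above: bool = False  # whether the previous sample was over the limit

    def observe(self, value, send_trap):
        over = value > self.limit
        if over and not self.above:
            # Unsolicited notification: nothing from the manager prompted this.
            send_trap({
                "event": "thresholdExceeded",
                "object": self.name,
                "value": value,
                "limit": self.limit,
            })
        self.above = over
```

Reporting only the crossing, instead of every over-limit sample, keeps a sustained condition from producing a continuous stream of traps.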
A corresponding network management protocol in the Windows world is known as Windows Management Instrumentation (WMI). WMI is a set of specifications from Microsoft for consolidating the management of devices and applications in a network from Windows computing systems. WMI also includes a type of asynchronous notification message called a WMI trap, which has a format and structure that is similar to and maps directly to SNMP traps.
Trap-directed notification is employed in SNMP or WMI to reduce notification traffic congestion. If the manager is responsible for a large number of physical elements, and each physical element has a large number of objects, it is impractical for the SNMP or WMI manager to poll or request information from every object of every physical element. The solution is for each SNMP agent on a managed physical element to notify the SNMP manager without solicitation by sending an SNMP trap message. After the SNMP manager receives the SNMP trap message, it displays the message to the human network manager and may choose to take an action based on the fields included in the message. For instance, the SNMP manager may poll the SNMP agent directly, or poll other associated device agents, to obtain a better understanding of the event.
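The manager-side behavior described above, displaying the trap and then polling for more detail, can be sketched as follows. The function handle_trap, the FOLLOW_UP table, and the trap's dictionary layout are hypothetical; poll and display are callables supplied by the caller, standing in for a real SNMP GET and a real operator console. The object names ifOperStatus and ifAdminStatus are used only as plausible follow-up targets for a link-down event.

```python
# Hypothetical mapping from a trap's event type to the objects worth polling next.
FOLLOW_UP = {
    "linkDown": ["ifOperStatus", "ifAdminStatus"],
}


def handle_trap(trap, poll, display):
    """Show the trap to the operator, then poll the sending agent for related objects."""
    display(trap)
    results = {}
    for name in FOLLOW_UP.get(trap["event"], []):
        # A targeted poll prompted by the trap, not a sweep of every object.
        results[name] = poll(trap["agent"], name)
    return results
```

The manager issues requests only for the few objects related to the reported event, which is what makes trap-directed notification cheaper than polling every object of every element.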
Trap-directed notification may result in substantial savings of network and agent resources by eliminating the need for unnecessary SNMP requests. However, it is not possible to eliminate SNMP polling entirely. SNMP requests are still required for discovery and for detecting topology changes. In addition, a managed device agent cannot send a trap if the device has had a catastrophic outage. Moreover, when an SNMP element includes a plurality of virtual machines, each with its own virtual agent and virtual MIB, a notification traffic congestion problem may occur. A distributed managed network may include a large number of virtual machines per host, with each virtual machine, as well as the host itself, generating traps. As a result, the SNMP manager(s) as well as the network may be overwhelmed with SNMP or WMI trap traffic, potentially slowing or halting the SNMP manager(s) and the network.
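The scale of the congestion problem can be made concrete with simple arithmetic. The sketch below assumes, for illustration only, that every host agent and every virtual agent emits traps at the same average rate; the function names and any numbers passed to them are hypothetical and are not measurements.

```python
def trap_sources(hosts, vms_per_host):
    """Count trap-generating agents: one per host plus one per virtual machine."""
    return hosts * (vms_per_host + 1)


def traps_per_minute(hosts, vms_per_host, traps_per_source_per_minute):
    """Aggregate trap rate at the manager, assuming a uniform per-agent rate."""
    return trap_sources(hosts, vms_per_host) * traps_per_source_per_minute
```

For example, 100 hosts running 20 virtual machines each yield 2,100 trap sources instead of 100, so the trap load on the manager grows by roughly the number of virtual machines per host.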