1. Technical Field of the Invention
The present application is related to computer networks. More specifically, the present application is related to a system, apparatus and method for handling multiple instances of events while avoiding duplication of work in a distributed network environment.
2. Background of the Invention
The distributed nature of computer networks presents various challenges for their centralized management. One such challenge is event or alarm management and processing. In a typical network environment, distributed network nodes notify the network's central network management application of any changes in the state of the network or of individual nodes. In general, the network management application may be run on one network server or simultaneously on a plurality of such servers. Such network management applications typically provide a single-point management interface for network administration.
The events or alarms typically monitored in a distributed network include both distributed events and node-specific events. In general, distributed events are those events that may affect the network as a whole. One example of a distributed event is the removal of a device port's entry from an associated Distributed Name Server. Such an event is considered a distributed event because it affects, for example, the Distributed Name Server on all of the network's Fibre Channel switches.
Node-specific events, on the other hand, are typically concerned only with the state of an individual node. One example of a node-specific event is a FAN_FAILURE alarm. A FAN_FAILURE alarm is considered a node-specific event because it does not generally affect any nodes in the network other than the node where it originates.
Network management difficulties arise when the same distributed event is sent to the network management application by multiple nodes. If the network management application handles or processes each reported instance without determining whether the instances represent different events or multiple copies of the same event, the network management application may suffer performance degradation resulting from double-handling, i.e., the repeated processing or addressing of the same event. Double-handling is typically most dangerous in situations where the network management application handles or processes events based on certain assumptions regarding the current state of the computer network. Specifically, by the time the network management application receives a subsequent copy of the same event, the state of the network may already have been changed as a result of the network management application's handling of the previously reported copy, so that those assumptions no longer hold. At a minimum, double-handling consumes resources as the network management application attempts to repeatedly handle or process the same event.
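By way of illustration only, the double-handling problem described above may be sketched in Python as follows. The names Event, NaiveManager, known_ports and the switch and port identifiers are hypothetical and are not drawn from any particular network management product; the sketch merely assumes that three switches each report the same removal of a device port's entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str      # e.g. "NAME_SERVER_ENTRY_REMOVED" (hypothetical label)
    subject: str   # hypothetical identifier of the affected device port
    reporter: str  # node that sent this copy of the event


class NaiveManager:
    """Handles every received report, with no duplicate detection."""

    def __init__(self, known_ports):
        self.known_ports = set(known_ports)
        self.handled = 0   # number of times handling work was performed
        self.errors = []   # reports whose state assumption no longer held

    def handle(self, event):
        self.handled += 1
        # Assumption made by the handler: the port is still registered.
        if event.subject not in self.known_ports:
            # A previous copy of the same event already changed the state.
            self.errors.append((event.reporter, event.subject))
            return
        self.known_ports.remove(event.subject)


manager = NaiveManager(known_ports={"port-7"})
for switch in ("switch-A", "switch-B", "switch-C"):
    manager.handle(Event("NAME_SERVER_ENTRY_REMOVED", "port-7", switch))
```

In this sketch the handling work is performed three times for a single underlying event, and the second and third copies arrive after the state they presuppose has already been changed by the handling of the first copy.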
Attempts to resolve the issue of double-handling include giving the multiple copies of the same event the same identity tag. In such an implementation, when the network management application receives notification of events, the network management application will begin by examining the identity tags. By examining the identity tags, the network management application can group those events with the same identity tags together, thereby enabling the network management application to handle or process the same event only once.
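The identity tag approach described above may be sketched, for illustration only, as follows. The functions group_by_tag and handle_once, the tag values and the switch names are hypothetical, and the sketch assumes that every copy of an event already carries an agreed tag when it is received.

```python
from collections import defaultdict


def group_by_tag(notifications):
    """Group (tag, reporter) pairs so that copies of one event share a key."""
    groups = defaultdict(list)
    for tag, reporter in notifications:
        groups[tag].append(reporter)
    return dict(groups)


def handle_once(notifications, handler):
    """Invoke handler exactly once per distinct identity tag."""
    groups = group_by_tag(notifications)
    for tag, reporters in groups.items():
        handler(tag, reporters)
    return len(groups)


received = [
    ("tag-17", "switch-A"),
    ("tag-17", "switch-B"),
    ("tag-42", "switch-C"),
    ("tag-17", "switch-C"),
]
calls = []
count = handle_once(received, lambda tag, reporters: calls.append((tag, reporters)))
```

Here four received notifications result in only two invocations of the handler, one for each distinct identity tag.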
In reality, however, identity tags are impractical to implement. In one aspect, the need for the nodes to communicate with each other to agree on the identity tag each time they send notice of an event results in excessive network overhead. In a further aspect, the network management application generally has to keep a history of all the received tags in order to perform tag association.
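The history requirement noted above may be illustrated by the following minimal sketch, which assumes that copies of an event arrive one at a time rather than as a batch. The class TagHistory and the tag values are hypothetical.

```python
class TagHistory:
    """Remembers every identity tag seen so far (hypothetical helper)."""

    def __init__(self):
        self._seen = set()

    def is_first_copy(self, tag):
        # A later copy can only be recognized if its tag was retained.
        if tag in self._seen:
            return False
        self._seen.add(tag)
        return True

    def size(self):
        return len(self._seen)


history = TagHistory()
arrivals = ["evt-1", "evt-2", "evt-1", "evt-3", "evt-2", "evt-1"]
handled = [tag for tag in arrivals if history.is_first_copy(tag)]
```

Because a further copy of any event may arrive at an arbitrary later time, no tag in this sketch can safely be discarded, and the stored history gains one entry for each distinct event received.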