1. Field of the Invention
The invention relates to network monitoring, and specifically, to event management.
2. Description of the Prior Art
The purpose of monitoring a network is to manage network performance, discover and solve network problems, and plan for network growth. According to Morris Sloman (Editor), “Network and Distributed Systems Management”, Addison-Wesley, England, 1994, pg. 303, monitoring can be defined as the process of dynamic collection, interpretation, and presenting of information concerning objects or software processes under scrutiny. Monitoring can be used for general network management, such as performance management, configuration management, fault management, or security management. One application of monitoring is event reporting which is explained below using definitions taken from the aforementioned text at pp. 303 to 347.
The network to be monitored is comprised of one or more managed objects. A managed object is defined as any hardware or software component whose behavior can be monitored or controlled by a management system. Hardware components may be hubs, routers, computers, bridges, etc. Each managed object is associated with a status and a set of events. The status of a managed object is a measure of its behavior at a discrete point in time. An event is defined as an atomic entity which reflects a change in the status of the managed object. The behavior of the managed object can be defined and observed in terms of its status and events.
The status of the managed object lasts for a certain time period. Examples of a status are “process is idle” or “process is running”. An event occurs instantaneously. Examples of an event are “message sent” or “process started”. Since the status of a managed object is normally changing continuously, the behavior of the managed object is usually observed in terms of a distinguished subset of events, called events of interest. Events of interest reflect significant changes in the status of the managed object.
In order to monitor the events of interest, they must first be detected. An event is said to have occurred when the conditions which are defined by event detection criteria are satisfied. These conditions are detected by appropriate instrumentation, such as software and hardware probes or sensors inserted in the managed object.
Event detection may be performed internally within the managed object or externally to it. Internal event detection is typically performed as a function of the managed object itself. External event detection may be carried out by an external agent which receives status reports of the managed object and detects changes in the status of the managed object.
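By way of illustration only, externally performed event detection can be sketched in Python as follows. The names StatusReport, Event, and detect_events, as well as the object identifier "host-a", are hypothetical and are not taken from the cited text; the sketch merely assumes that an external agent receives a time-ordered sequence of status reports and emits an event whenever the reported status of a managed object differs from its previously reported status.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator


@dataclass(frozen=True)
class StatusReport:
    object_id: str    # hypothetical identifier of the managed object
    status: str       # e.g. "process is idle" or "process is running"
    timestamp: float  # time at which the status was observed


@dataclass(frozen=True)
class Event:
    object_id: str
    old_status: str
    new_status: str
    timestamp: float


def detect_events(reports: Iterable[StatusReport]) -> Iterator[Event]:
    """Yield an event each time a managed object's reported status changes."""
    last_status: Dict[str, str] = {}
    for report in reports:
        previous = last_status.get(report.object_id)
        # The first report for an object only establishes its baseline status.
        if previous is not None and previous != report.status:
            yield Event(report.object_id, previous, report.status, report.timestamp)
        last_status[report.object_id] = report.status


reports = [
    StatusReport("host-a", "process is idle", 1.0),
    StatusReport("host-a", "process is idle", 2.0),
    StatusReport("host-a", "process is running", 3.0),
]
events = list(detect_events(reports))
```

In this sketch the repeated "process is idle" report produces no event, since the status has not changed; only the transition to "process is running" is reported.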
The occurrence of the event may be detected in real time or with a delay. Once the event is detected, an event report is generated at the managed object. The event report may comprise an event identifier, type, priority, time of occurrence, the status of the managed object immediately before and after the occurrence of the event, and other application-specific status variables.
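The contents of such an event report can be pictured as a simple record. The following Python sketch is one possible representation under stated assumptions: the class name EventReport, its field names, and all sample values are hypothetical, and the field types (for example, an integer priority) are chosen only for illustration.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict


@dataclass
class EventReport:
    event_id: str              # event identifier
    event_type: str            # type of the event
    priority: int              # assumed here to be an integer; higher is more urgent
    time_of_occurrence: float  # assumed here to be seconds since the epoch
    status_before: str         # status immediately before the event
    status_after: str          # status immediately after the event
    # Other application-specific status variables, keyed by name.
    extra: Dict[str, Any] = field(default_factory=dict)


# Sample values are invented for illustration.
report = EventReport(
    event_id="evt-0001",
    event_type="process started",
    priority=2,
    time_of_occurrence=1000.0,
    status_before="process is idle",
    status_after="process is running",
    extra={"pid": 4711},
)
report_as_dict = asdict(report)  # plain dictionary, e.g. for transmission
```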
In order to monitor the dynamic behavior of the managed object, the event report may be conveyed from the managed object to a central unit. At the central unit event reports may be gathered, visualized, and recorded. The central unit may be a Network Management Station (NMS) on which appropriate software, usually called a manager, resides. The manager executes management applications that monitor and control the managed objects. Physically, an NMS, sometimes called a console, is usually an engineering workstation with a fast CPU, megapixel color display, substantial memory, and abundant disk space. The NMS may comprise a database on which incoming reports sent by the managed objects, such as event reports, are stored.
Received reports can be viewed with the Graphical User Interface (GUI) of the NMS.
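The gathering and recording of event reports at a central unit can be sketched as follows. This is a minimal illustration that assumes an SQLite database, accessed through the Python standard library module sqlite3, stands in for the database of the NMS; the class ReportStore, its table layout, and the sample reports are hypothetical and do not describe any particular NMS product.

```python
import sqlite3
from typing import List, Tuple


class ReportStore:
    """Hypothetical store for event reports received by a central unit."""

    def __init__(self, path: str = ":memory:") -> None:
        # An in-memory database is used by default so the sketch is self-contained.
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS reports ("
            "object_id TEXT, received_at REAL, message TEXT)"
        )

    def record(self, object_id: str, received_at: float, message: str) -> None:
        """Store one incoming event report."""
        self._db.execute(
            "INSERT INTO reports VALUES (?, ?, ?)",
            (object_id, received_at, message),
        )
        self._db.commit()

    def reports_for(self, object_id: str) -> List[Tuple[float, str]]:
        """Return the stored reports of one managed object, oldest first."""
        cursor = self._db.execute(
            "SELECT received_at, message FROM reports "
            "WHERE object_id = ? ORDER BY received_at",
            (object_id,),
        )
        return cursor.fetchall()


store = ReportStore()
store.record("host-a", 2.0, "process started")
store.record("host-a", 1.0, "message sent")
store.record("host-b", 1.5, "message sent")
```

A viewer such as a GUI would then query the store, for example with store.reports_for("host-a"), to display the reports of a selected managed object in order of arrival time.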
In order to carry out event detection, each managed object must know its event detection criteria. The event detection criteria for a specific managed object can be defined using an appropriate template. Once this template is created, the relevant managed object or its agent will be configured with that template.
FIGS. 1 to 3 show an example of such a template 1 for a managed object, which is a computer run by the operating system Sun-Solaris. The managed object is monitored by the Network Management System HP OpenView, which monitors its logfile. Template 1 has a name-field 2 for defining template 1. In this case, the name of template 1 is “R0_HS_MST_VB22F_Syslog”. Additionally, template 1 has a description-field 3, in which a short description of the event detection criteria may be written. Template 1 also has a field 4 which specifies the path and the name of the file to be monitored. The name of the logfile is “syslog”. Furthermore, the time period in which the logfile “syslog” is automatically checked by the managed object for a new entry is defined by a field 5 of template 1. In this example, the logfile “syslog” is checked each minute.
The actual event detection criterion or event detection criteria of the managed object are defined utilizing a list 20 which is shown in FIG. 2. For this example, list 20 contains only one event detection criterion which is: “Refused connect from denied node”.
FIG. 3 shows a list 30 which is used to define the message of an event report sent from the managed object to an NMS if an event defined by the event detection criterion occurs. The message can be written in a message test field 31. For this example, the message of the event report is “Connection refused from <*.node>”, when there is an unauthorized attempt to log on to the managed object. “<*.node>” is actually a wildcard, which is replaced by the actual system's name from which the unauthorized log on was attempted.
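The interplay of the template fields, the event detection criterion, and the wildcard in the message can be sketched in Python as follows. The sketch does not reproduce the template syntax of HP OpenView. The class LogfileTemplate, the function match_entries, the description text, the match text "refused connect from", the rule that the node name is the first token following the match text, and the sample log entries are all assumptions made for illustration; only the template name, the logfile name, the one-minute period, the criterion name, and the message text are taken from the example above.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class LogfileTemplate:
    name: str              # name-field 2
    description: str       # description-field 3
    logfile: str           # field 4: file to be monitored
    interval_seconds: int  # field 5: period between checks of the logfile
    criterion_name: str    # entry of list 20
    match_text: str        # hypothetical text a log entry must contain
    message: str           # list 30: message containing the <*.node> wildcard


TEMPLATE = LogfileTemplate(
    name="R0_HS_MST_VB22F_Syslog",
    description="Unauthorized log on attempts",  # hypothetical description
    logfile="syslog",
    interval_seconds=60,
    criterion_name="Refused connect from denied node",
    match_text="refused connect from",           # hypothetical match text
    message="Connection refused from <*.node>",
)


def match_entries(template: LogfileTemplate, lines: Iterable[str]) -> Iterator[str]:
    """Yield one event report message for each log entry meeting the criterion."""
    # Assumption: the node name is the first token after the match text.
    pattern = re.compile(re.escape(template.match_text) + r"\s+(?P<node>\S+)")
    for line in lines:
        found = pattern.search(line)
        if found:
            # Replace the wildcard with the name of the system that was refused.
            yield template.message.replace("<*.node>", found.group("node"))


# Invented log entries; real syslog entries may be formatted differently.
new_entries = [
    "in.telnetd[123]: refused connect from badhost.example.com",
    "sshd[456]: session opened for user operator",
]
messages = list(match_entries(TEMPLATE, new_entries))
```

In a complete agent, match_entries would be applied once per checking period to the entries added to the logfile since the previous check; reading the logfile itself is omitted here.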
Usually a network contains different types of managed objects. Those different types of managed objects may be different types of computer controlled devices or apparatuses, such as magnetic resonance or computed tomography apparatuses. Furthermore, events of interest are normally different for each of the different types of managed objects, resulting in the development of different templates which comprise event detection criteria specific to the different types of managed objects. For example, an event of interest specific to the magnetic resonance apparatuses may be a failure of one of their high frequency components, while an event of interest specific to the computed tomography apparatuses may be a problem associated with their x-ray generating components.
In addition, a template related to a type of managed object may be modified over time, because a user monitoring the network may be interested in a modified set of events of interest involving that type of managed object. In that case, not only does a new template for that type of managed object have to be developed and tested, but the managed objects of that type also have to be reconfigured with the newly developed template, and the modifications have to be recorded.
If the network is comprised of relatively many different types of managed objects and their event detection criteria are frequently modified, then administering the process of developing and testing the new template, reconfiguring the relevant managed objects, and reporting the modifications may be particularly cumbersome.