A network management system (NMS) collects enterprise-wide event information from multiple network data sources and presents a simplified view of this information to end users. Referring to FIG. 1, the NMS manages the event information for: assignment to operators; passing on to helpdesk systems based on a relational database management system (RDBMS); logging in a database such as a helpdesk customer relationship management system (CRM); replicating on a remote service level management system; and triggering automatic responses to certain alerts. An NMS also consolidates information from different domain-limited network management platforms in remote locations. By working in conjunction with existing management systems and applications, the NMS minimizes deployment time and enables employees to use their existing network management skills.
One enterprise NMS uses a scalable system of more than one periodically replicating ObjectServer to handle enterprise event management. An ObjectServer can collect, aggregate and display events independently or in combination with other ObjectServers, and such independent configurability allows high scalability. FIG. 1 shows a known NMS of one ObjectServer managing the events with an additional ObjectServer in a failover configuration. FIG. 2 shows a known NMS of three layers of ObjectServers. It is known to use a layered (also called tiered) ObjectServer architecture to coordinate the events where a large number of events impacts the ObjectServer processing capacity. A two layer NMS might have one or more lower ObjectServers (a collection layer) and an upper ObjectServer (an aggregation and display layer). A three layer NMS (see FIG. 2) has one or more lower ObjectServers (a collection layer); a middle ObjectServer (an aggregation layer); and one or more upper ObjectServers (a display layer). Raw events enter the system and are collected in the collection layer via probes. A probe is a small application process that can read event data feeds from various network devices and elements and write them to an ObjectServer. These events are inserted into a status table of an ObjectServer for processing and forwarding. ObjectServers at different layers are connected together via gateways. It is via these gateways that events are progressed from the lower layers, through the middle layer, to the upper layers.
In an enterprise NMS, events are not pushed to the gateway immediately as they occur but are batched up and pulled to a client periodically on an update cycle called a granularity window. The granularity window, or period of the update cycle, can be changed by the user. In a large and busy system the update cycle period can be set to 30 seconds or more.
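By way of illustration only, the batching behaviour of a granularity window can be sketched as follows. This is a minimal sketch and not the ObjectServer implementation; the class name `GranularityBatcher`, its method names and the explicit `now` argument are hypothetical and are chosen so that the example is deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class GranularityBatcher:
    """Buffers events and releases them at most once per update cycle.

    Hypothetical illustration; names do not come from any real product API.
    """
    period_s: float                       # length of the granularity window
    _pending: list = field(default_factory=list)
    _last_flush: float = 0.0              # time of the previous release

    def record(self, event):
        # Events are held back rather than pushed immediately.
        self._pending.append(event)

    def poll(self, now):
        """Return the batched events if the window has elapsed, else []."""
        if now - self._last_flush < self.period_s:
            return []
        batch, self._pending = self._pending, []
        self._last_flush = now
        return batch
```

With `period_s=30.0`, two events recorded at the start of a cycle are not returned by `poll(10.0)` but are returned together by `poll(30.0)`.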
A general description of event progression through a three layer NMS is presented below with reference to FIG. 2.
In the collection layer, raw events from a managed network element are received by a probe. Each raw event intercepted by a probe is changed into an ObjectServer event for operation on an ObjectServer status table. An ObjectServer event is one of insertion, deletion, update or cancellation of data in the status table. Clients (both gateway and end user applications) register interest in certain types of ObjectServer event with the ObjectServer. Periodically, and after one or more ObjectServer events have operated on the status table, a change message is generated by the ObjectServer and transmitted to all interested clients. For instance, a gateway for an aggregation ObjectServer has registered interest in all ObjectServer events operating on the status table. On receipt of a periodic change message from the ObjectServer, the gateway will request a change reference data set of ObjectServer events for the last period. The change reference data set references the change data but does not comprise the change data. The received change reference data set is used by the client to fetch all or part of the change data (e.g. a complete row from the status table or selected columns from the status table). The gateway may then replicate all changes defined in the change data set on the client database or ObjectServer database, for example, the aggregation ObjectServer.
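The notify, request and fetch sequence described above can be sketched as follows. This is a simplified illustration under stated assumptions rather than the actual ObjectServer or gateway interface: the class and method names are hypothetical, the status table is modelled as an in-memory dictionary, and only insert, update and delete operations are modelled.

```python
class ObjectServer:
    """Toy model of a status table with periodic change notification."""

    def __init__(self):
        self.status = {}       # row_id -> row (the status table)
        self._changes = []     # (operation, row_id) recorded in this cycle
        self._last_refs = []   # change reference data set of the last cycle
        self._listeners = []   # callbacks of clients that registered interest

    def register(self, callback):
        self._listeners.append(callback)

    def apply(self, op, row_id, row=None):
        if op == "delete":
            self.status.pop(row_id, None)
        else:                  # "insert" or "update": merge the columns
            merged = dict(self.status.get(row_id, {}))
            merged.update(row or {})
            self.status[row_id] = merged
        self._changes.append((op, row_id))

    def end_of_cycle(self):
        """Send one change message per cycle, and only if something changed."""
        if not self._changes:
            return
        self._last_refs, self._changes = self._changes, []
        for callback in self._listeners:
            callback(self)

    def change_references(self):
        # References to the changed rows only, not the row data itself.
        return list(self._last_refs)

    def fetch(self, row_id):
        row = self.status.get(row_id)
        return None if row is None else dict(row)


class Gateway:
    """Replicates the changes of a source ObjectServer onto a target."""

    def __init__(self, source, target):
        self.target = target
        source.register(self.on_change)

    def on_change(self, source):
        for op, row_id in source.change_references():
            if op == "delete":
                self.target.apply("delete", row_id)
                continue
            row = source.fetch(row_id)
            if row is not None:   # skip rows deleted later in the same cycle
                self.target.apply(op, row_id, row)
```

In this sketch a row inserted on the collection ObjectServer does not appear on the aggregation ObjectServer until `end_of_cycle()` is called on the source, which mirrors the periodic nature of the replication.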
As events from the gateway are inserted into the aggregation ObjectServer, the aggregation ObjectServer generates a further change event for all interested parties. The event is propagated from the aggregation layer ObjectServer to a display layer ObjectServer via an associated gateway.
In the display layer, event changes are propagated to the desktops of end users such as network operators. The propagation from display layer ObjectServers to the desktop is achieved using the same replication strategy as that used by the gateways within the collection and aggregation layers.
Though a layered architecture provides sufficient scalability, with periodic replication acting as a form of load balancing, it does introduce additional delay in event notification to the operator attached to the display layer. In a large deployment of the NMS, the time taken for a critical event to travel from its source to the operator front-end is the same as that for any other event in the system. From the perspective of the system, all events have equal status. Assuming a three layer deployment with an update cycle period of 30 seconds at each level of the structure, an event, once it has entered the system, will be visible to the end user in approximately 90 seconds. In some environments it is desirable to present a subset of events to operators in a shorter amount of time.
U.S. Pat. No. 6,131,112 describes a process called a gateway between a Network Management Platform (NMP) and a System Management Platform (SMP). The gateway allows for the exchange of events/alarms between the two systems for the benefit of cross-functional correlation. A system is described by which events can be examined for interest via a form of filtering or user defined policy and then passed onwards to a secondary system. All events destined for the secondary system are regarded as equal, as they are pushed to the destination system using the same process and path.
US patent publication 2006/0015608 discusses the suppression of events from resources which are known to be down due to maintenance. System maintenance windows can be defined for a resource, and any failure event related to that resource that occurs during the defined window is suppressed and ignored.
Therefore, when dealing with large numbers of raw events, periodic notification and replication are necessary to organise the raw events and ensure that they are processed and distributed in an efficient manner. The volume of raw events and the periodic notification require ordered processing of the events, with the result that all events are processed in approximately the same manner and time. However, there is a need for some events to be processed more quickly than the average time.