The present invention relates to event relationship analysis in fault management, and more specifically, to relationship strength analysis for event grouping.
Data center and network management disciplines to date have focused extensively on fault and root cause analysis processes, tools and best practices. When events occur in a data center, a notification is sent to an event manager (for example, IBM's Netcool OMNIbus or Netcool Operations Insight (NOI), where Netcool, OMNIbus and IBM are trademarks of International Business Machines Corporation, Armonk, N.Y.).
At the event manager, the event may be de-duplicated, correlated, and enriched. It may be handled via a policy enforced by a rules engine. It may be used to automatically create a ticket for a help desk. Events and tickets are the backbone of fault management. To meet the requirement of reducing operating cost, and hence increasing return on investment, commonly co-occurring alerts are correlated so that the operator works on only one problem, with only one ticket open for a single problem.
Event manager products deploy grouping mechanisms to find recurrent patterns in the event stream so that when operators are presented with a set of incoming events the list is compacted as much as possible with already observed relations. This provides immense value as it directly reduces cost in a very measurable way for customers.
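By way of a non-limiting illustration only, one simple way to find recurrent patterns in an event stream is to count how often pairs of event types fall within the same time window and to treat frequently co-occurring pairs as related. The following Python sketch is an assumption made for illustration and is not the grouping mechanism of any particular event manager product; the function names `co_occurrence_counts` and `related_pairs`, the `(timestamp, event_type)` input format, the fixed 60-second window and the `min_count` threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_occurrence_counts(events, window_seconds=60):
    """Count how often pairs of event types share a fixed time window.

    events: iterable of (timestamp_seconds, event_type) tuples.
    This input format is a hypothetical one chosen for the sketch.
    """
    # Bucket event types into fixed, non-overlapping time windows.
    windows = {}
    for timestamp, event_type in events:
        bucket = int(timestamp // window_seconds)
        windows.setdefault(bucket, set()).add(event_type)

    # Count each unordered pair once per window in which both types appear.
    counts = Counter()
    for types in windows.values():
        for pair in combinations(sorted(types), 2):
            counts[pair] += 1
    return counts


def related_pairs(events, window_seconds=60, min_count=2):
    """Return the pairs of event types seen together at least min_count times."""
    counts = co_occurrence_counts(events, window_seconds)
    return {pair for pair, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    # Hypothetical sample stream: two event types recur together, one is isolated.
    sample = [
        (0, "link_down"), (5, "bgp_flap"),
        (70, "link_down"), (75, "bgp_flap"),
        (130, "disk_full"),
    ]
    print(related_pairs(sample))
```

In this sketch, pairs that recur together across windows could be presented to the operator as a single group, while isolated events remain ungrouped. Fixed windows are used only for brevity; events that straddle a window boundary are not paired, which a sliding window would avoid.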
The grouping of events based on faults in the system data is not trivial. To accommodate nuances in how events arrive at a common collection point, an event grouping mechanism may already require considerable injection of domain knowledge about the typical behavior of probes and the event types commonly found.