The present invention relates to alarm notification in a communications network and more specifically to a method and apparatus for receiving alarms from multiple network management servers, applying policies to those alarms and forwarding the alarms that conform to the policies to one or more network management applications.
Spectrum™ is a model-based network management system, sold by Cabletron Systems, Inc., Rochester, N.H., for maintaining and processing information pertaining to the condition of a communications network and providing the same to a user. For example, Spectrum™ will periodically poll a network device to request information, such as the number of packets sent on the network in a given time and the number of errors that occurred. If the error rate is above a predetermined limit, an error alarm is logged in the Spectrum™ database, an alarm is sent to the user interface to notify the network manager, and a message is sent to shut off the corresponding network device.
Alternatively, if no response was received from the network device when it was polled, the reason for the loss of contact should be determined so that appropriate action, such as a service call, can be taken. In a network environment, loss of contact with a network device may be due to failure of that network device or to failure of another network device that is involved in the transmission of a message.
In many prior art network management systems, the network administrator was typically provided with a list of possible causes of a fault and was required to isolate the fault based on his experience and knowledge of the network. In Spectrum™, the system itself isolates network faults using a technique known as Status Suppression. Spectrum™ maintains a database of models for each network device. When contact between a model and its corresponding network device is lost, the model sets a fault status and initiates the fault isolation technique. The model (first model) which lost contact with its corresponding network device (first network device) determines whether adjacent models have lost contact with their corresponding network devices; adjacent network devices are defined as those which are directly connected to a specified network device. If adjacent models cannot contact the corresponding network devices, then the first network device cannot be the cause of the fault, and its fault status in the first model will be overridden. By suppressing the fault status of the network devices which are determined not to be defective, the defective network device can be identified. Once the fault has been isolated, the condition of the defective device can be updated in the Spectrum™ database, a control message can be sent shutting off the defective device, and the network administrator can be notified via the user interface.
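By way of illustration only, the suppression rule described above may be sketched as follows. This is not the Spectrum™ implementation; the function name, the adjacency mapping, and the device names are hypothetical, and the sketch assumes that a fault status is overridden only when every adjacent model has also lost contact with its device.

```python
from typing import Dict, List, Set


def isolate_faults(adjacency: Dict[str, List[str]], contact_lost: Set[str]) -> Set[str]:
    """Return the devices whose fault status is NOT suppressed.

    adjacency maps each device to the devices directly connected to it.
    contact_lost is the set of devices whose models have lost contact.
    """
    suspects = set()
    for device in contact_lost:
        neighbors = adjacency.get(device, [])
        # Assumption: suppress the fault status only when all adjacent
        # models have also lost contact with their devices.
        if neighbors and all(n in contact_lost for n in neighbors):
            continue
        suspects.add(device)
    return suspects


# Hypothetical chain topology: hub - router - switch - host
topology = {
    "hub": ["router"],
    "router": ["hub", "switch"],
    "switch": ["router", "host"],
    "host": ["switch"],
}
remaining = isolate_faults(topology, contact_lost={"switch", "host"})
```

In this hypothetical topology the host's only adjacent device is also unreachable, so the host's fault status is suppressed, while the switch still has a reachable neighbor and remains the suspected defective device.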
Spectrum™'s associated SpectroGRAPH™ user interface provides a graphical view into the network models. An alarm log view, shown in FIG. 1, includes an area 120 for the listing of current alarms, and an area 122 for displaying information pertaining to a selected alarm. The user may click on a particular alarm in the listing of current alarms to obtain more information. A multi-function icon 124 representing the network device having a fault is displayed in area 122, with one or more text fields 126 and 128 which provide information to the user regarding the cause of the alarm and the status of the device. By clicking on specified areas of the icon 124, the user can obtain further information regarding the device for which an alarm is registered.
Another method for fault management in large communications networks is to use a so-called “trouble-ticketing” system. This system provides a number of tools that can be used by network users, administrators, and repair and maintenance personnel. The basic data structure, a “trouble-ticket”, has a number of fields in which a user can enter data describing the parameters of an observed network fault. A trouble-ticket filled out by a user may then be transmitted by, for example, an electronic mail system to maintenance and repair personnel. A trouble-ticket describing a current network fault that needs to be acted on is called “an outstanding trouble-ticket”. When the network fault has been corrected, the solution to the problem, typically called a “resolution”, is entered into an appropriate data field in the trouble-ticket and the trouble-ticket is said to be completed. The system provides for storage of completed trouble-tickets in memory and thus a library of such tickets is created, allowing users, administrators, and maintenance and repair personnel to refer to the stored completed trouble-tickets for assistance in determining solutions to future network faults. An example of a trouble-ticketing system is the ACTION REQUEST system, developed by Remedy Corporation, Mountain View, Calif., and sold by Cabletron Systems, Inc., Rochester, N.H.
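For illustration, the trouble-ticket data structure and the library of completed tickets may be sketched as below. The field names and the keyword search are assumptions made for this sketch and do not reflect the actual schema of the ACTION REQUEST system.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TroubleTicket:
    ticket_id: int
    description: str                   # parameters of the observed fault
    resolution: Optional[str] = None   # filled in when the fault is corrected

    @property
    def outstanding(self) -> bool:
        # A ticket with no resolution still needs to be acted on.
        return self.resolution is None


class TicketLibrary:
    """Stores completed tickets for later reference (hypothetical helper)."""

    def __init__(self) -> None:
        self._completed: List[TroubleTicket] = []

    def complete(self, ticket: TroubleTicket, resolution: str) -> None:
        ticket.resolution = resolution
        self._completed.append(ticket)

    def search(self, keyword: str) -> List[TroubleTicket]:
        # Simple case-insensitive match on the fault description.
        needle = keyword.lower()
        return [t for t in self._completed if needle in t.description.lower()]


library = TicketLibrary()
ticket = TroubleTicket(1, "Router port 3 drops packets")
was_outstanding = ticket.outstanding
library.complete(ticket, "Replaced faulty interface card")
matches = library.search("router")
```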
ARS Gateway™ is a network management application sold by Cabletron Systems, Inc. which receives fault information from the Spectrum™ system and automatically generates a trouble-ticket that may be processed by the ACTION REQUEST system. This system is further described in copending and commonly owned U.S. Ser. No. 08/023,972, filed Feb. 26, 1993 by Lundy Lewis and entitled “Method and Apparatus For Resolving Faults In Communications Networks,” which is hereby incorporated by reference in its entirety.
The Spectrum™ system is described in U.S. Pat. No. 5,261,044 issued Nov. 9, 1993 to Roger Dev et al., which is hereby incorporated by reference in its entirety. The Spectrum™ network management system is commercially available and also described in various user manuals and literature available from Cabletron Systems, Inc., Rochester, N.H.
Other commercially available network management platforms and applications for the basic filtering of alarms include: (1) HP OpenView, 3000 Hanover Street, Palo Alto, Calif. 94304; (2) LattisNet, SynOptics Communications, 4401 Great American Pkwy., Santa Clara, Calif. 95054; (3) IBM Netview/6000, IBM Corp., Old Orchard Road, Armonk, N.Y. 10504; and (4) SunNet Manager, SunConnect, 2550 Garcia Ave, Mountain View, Calif. 94043.
Unfortunately, in the prior art systems, alarms can only be received from one network management server. Also, there is no provision for applying the same policy-based filter to multiple network management applications.
Thus, it is an object of the present invention to provide greater control over which alarms get reported to network management applications and to provide a means to ensure consistency of reported alarms across multiple network management applications.
The present invention is directed to an apparatus and method of alarm notification, which includes: (a) receiving alarms from multiple network management servers; (b) assigning policy-based filters to associated network management applications; and (c) applying the assigned policy-based filters to the alarms and, for the alarms that pass the filters, generating an alarm notification and forwarding the same to the associated network management applications.
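Steps (a) through (c) may be pictured with the following minimal sketch. It is offered as an illustration under stated assumptions rather than as the claimed apparatus: the class and method names, the alarm fields, the severity values, and the filter tag are all hypothetical, and the sketch assumes that an alarm passes a policy when it passes at least one of the policy's filters.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Alarm:
    server: str     # network management server that raised the alarm
    device: str
    severity: str   # hypothetical values: "minor", "major", "critical"


class AlarmNotifier:
    def __init__(self) -> None:
        self.filters: Dict[str, Callable[[Alarm], bool]] = {}  # tag -> filter
        self.policies: Dict[str, List[str]] = {}                # application -> tags
        self.delivered: Dict[str, List[Alarm]] = {}             # application -> alarms

    def define_filter(self, tag: str, predicate: Callable[[Alarm], bool]) -> None:
        self.filters[tag] = predicate

    def assign(self, application: str, tags: List[str]) -> None:
        # (b) assign policy-based filters to an application
        self.policies[application] = list(tags)
        self.delivered.setdefault(application, [])

    def receive(self, alarm: Alarm) -> None:
        # (a) alarms may arrive from any server; (c) apply each application's
        # filters and forward the alarm if it passes.
        # Assumption: passing any one filter of the policy is sufficient.
        for application, tags in self.policies.items():
            if any(self.filters[tag](alarm) for tag in tags):
                self.delivered[application].append(alarm)


notifier = AlarmNotifier()
notifier.define_filter("critical-only", lambda a: a.severity == "critical")
notifier.assign("trouble_ticketing", ["critical-only"])
notifier.assign("paging", ["critical-only"])   # same filter, second application
notifier.receive(Alarm("server_1", "router_7", "critical"))
notifier.receive(Alarm("server_2", "hub_3", "minor"))
```

Because both hypothetical applications reference the same tagged filter, they receive the same set of alarms regardless of which server reported them.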
In an embodiment described herein, a user assigns a plurality of such filters, which together constitute an alarm notification policy, to one or more associated network management applications. The policy-based filters are stored in a database, and a tag is assigned for identifying each filter. The same filters may be assigned to multiple applications.
In a further embodiment, the user may schedule the assignment of such policy-based filters to occur at a designated time in the future. For example, a user may pick a policy from a list of available policies to associate with a selected application, and then designate the frequency with which the policy is applied, e.g., once, hourly, daily, weekly or monthly.
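As an illustrative sketch of the scheduling frequencies listed above, the next time at which a policy would be applied might be computed as follows. The function name is hypothetical, and the handling of month ends (clamping to the last day of a shorter month) is an assumption of this sketch, not something the description specifies.

```python
import calendar
from datetime import datetime, timedelta
from typing import Optional


def next_application_time(last: datetime, frequency: str) -> Optional[datetime]:
    """Return when the policy should next be applied, or None for 'once'."""
    if frequency == "once":
        return None
    if frequency == "hourly":
        return last + timedelta(hours=1)
    if frequency == "daily":
        return last + timedelta(days=1)
    if frequency == "weekly":
        return last + timedelta(weeks=1)
    if frequency == "monthly":
        year = last.year + (last.month // 12)
        month = last.month % 12 + 1
        # Assumption: clamp to the last day when the next month is shorter.
        day = min(last.day, calendar.monthrange(year, month)[1])
        return last.replace(year=year, month=month, day=day)
    raise ValueError(f"unknown frequency: {frequency}")


start = datetime(2024, 1, 31, 9, 0)
weekly_next = next_application_time(start, "weekly")
monthly_next = next_application_time(start, "monthly")
```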
Furthermore, the invention can be used in the same mode as similar tools in the prior art, i.e., with one alarm-forwarding component for each network management system/network management application pair, or alternatively as a single entity in a distributed network management environment.
These and other features of the present invention will be more fully described in the following detailed description and figures.