1. Field of the Invention
This invention relates generally to communications networks, and more particularly, to communications networks having multiple domains, each of which may generate intra-domain alarms. These intra-domain alarms may be correlated to provide inter-domain alarms and to facilitate more effective user notification and corrective action.
2. Discussion of the Related Art
Computer networks are widely used to provide increased computing power, sharing of resources and communication between users. Networks may include a number of computer devices within a room, building or site that are connected by a high-speed local data link such as token ring, Ethernet, or the like. Local area networks (LANs) in different locations may be interconnected by, for example, packet switches, microwave links and satellite links to form a wide area network (WAN). A network may include several hundred or more connected devices, distributed across several geographical locations and belonging to several organizations.
Many existing networks are so large that a network administrator will partition the network into multiple domains for ease of management. There are various types of domains. One example is based on geographical location. For example, a company may own or manage a network that includes a first domain geographically located in a first city and a second domain geographically located in a second city, as well as other domains disposed in other geographical locations.
Another domain type is based on organization or departments, e.g., accounting, engineering, sales, etc. A company may have a computer network spanning multiple organizations and multiple geographical locations, but there may not be a one-to-one mapping of organizations to geographical locations. Thus, a first organization and a second organization may both share network resources within first and second geographical locations. For purposes of network accounting (e.g., to allocate network charges to the appropriate organization) or for other reasons, it may be advantageous to consider the network resources of the first organization as being a separate domain from the network resources of the second organization.
A third example of a domain type is a grouping based upon functional characteristics of network resources. For example, one functional domain may be considered to be network resources belonging to a company that are provided for performing computer-aided design, which may draw upon common databases and have similar network traffic. Another functional domain may be network resources of the same company that are provided for financial analysis, which may be resources specially adapted to provide financial data. The network resources of these two domains may be distributed across several geographical locations and several organizations of the company. However, it may be desirable for a network administrator to group the computer-aided design network resources into one domain and to group the financial analysis network resources into another domain. Additional examples of communication network domains also exist, and a single company or organization may have domains that fall into several categories.
The above examples were discussed with respect to one company owning and managing its own network. Similar situations exist for any entity that manages and/or owns a network, for example a service company that provides network management services to several companies.
In the operation and maintenance of computer networks a number of issues arise, including traffic overload on parts of the network, optimum placement and interconnection of network resources, security, isolation of network faults, and the like. These issues become increasingly complex and difficult to understand and manage as the network becomes larger and more complex. For example, if a network device is not sending messages, it may be difficult to determine whether the fault is in the device itself, a data communication link, or an intermediate network device between the sending and receiving devices.
Network management systems are intended to resolve such issues. Older management systems typically operated by collecting large volumes of information, which then had to be evaluated by a network administrator. Such systems therefore placed a tremendous burden on the network administrator and required that the administrator be highly skilled.
Newer network management systems systematize the knowledge of the networking expert such that common problems of a single domain (i.e., a portion of the network under common management) can be detected, isolated and repaired, either automatically or with the involvement of less-skilled personnel. Such a system typically includes a graphical representation of that portion of the network being monitored by the system. Alarms are generated to inform an external entity that an event has occurred or requires attention. Since a large network may have many such events occurring simultaneously, some network management systems provide alarm filtering (i.e., only certain events generate an alarm).
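By way of illustration only, the alarm filtering described above may be sketched as follows. This is a minimal sketch and does not represent the implementation of any commercially available system; the `Event` type, the numeric severity scale, the `min_severity` threshold and the device names are all assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A network event reported to the management system (hypothetical type)."""
    device: str
    kind: str
    severity: int  # assumed scale: 0 (informational) to 5 (critical)


def filter_alarms(events, min_severity=3):
    """Return only those events severe enough to be raised as alarms."""
    return [e for e in events if e.severity >= min_severity]


# Three events occur at about the same time; only some become alarms.
events = [
    Event("hub-1", "link_flap", 1),
    Event("router-2", "interface_down", 4),
    Event("host-7", "high_latency", 3),
]
alarms = filter_alarms(events)
```

In this sketch, the low-severity event from `hub-1` is suppressed, while the other two events are passed on as alarms. A real system may filter on criteria other than severity.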
Commercially available network management systems and applications for alarm filtering include: (1) SPECTRUM®, Cabletron Systems, Inc., 35 Industrial Way, Rochester, N.H. 03867; (2) HP OpenView, Hewlett Packard Corp., 3000 Hanover Street, Palo Alto, Calif. 94304; (3) LattisNet, Bay Networks, 4401 Great American Pkwy., Santa Clara, Calif. 95054; (4) IBM Netview/6000, IBM Corp., Old Orchard Road, Armonk, N.Y. 10504; (5) SunNet Manager, SunConnect, 2550 Garcia Ave., Mountain View, Calif. 94043; and (6) NerveCenter, NetLabs Inc., 4920 El Camino Real, Los Altos, Calif. 94022.
However, each of these existing network management systems manages only a single domain. For example, a company having a network consisting of several domains will typically purchase one copy of a network management system for each domain. Each copy of the network management system may be referred to as an instance. Thus, in the functional domain example described above, a first instance of a network management system may manage the computer-aided design domain, while a second instance of a network management system may manage the financial analysis domain. Each instance of the network management system receives information only from the resources of a single respective domain, and generates alarms that are specific only to the single respective domain. Such alarms may be referred to as intra-domain alarms.
Because each instance of a network management system manages only one domain, there is currently no diagnosis or management that takes into account the relationships among multiple domains. Since domains may be interconnected, an intra-domain alarm might be generated for a first domain, even though the event or fault that is causing the intra-domain alarm may be contained within the network resources of a different domain. For example, a first domain in a network may include a router that forwards network traffic to a resource in a second domain. If the router fails or begins to degrade, the performance of the second domain may appear sluggish (e.g., excessive delays, low throughput), even though the network resources within the second domain are operating correctly. This sluggishness may cause an alarm to be generated by the second instance of the network management system, which manages the second domain. However, no alarm relating to this situation is generated by the first instance of the network management system, which manages the first domain, because there is no performance degradation within the first domain. It is currently necessary to apply human intervention and human reasoning to resolve such a situation.
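The router example above may be illustrated by the following minimal sketch, in which each instance evaluates only the measurements taken within its own domain. The `DomainInstance` class, the latency threshold, the resource names and the latency figures are hypothetical and are chosen only to reproduce the situation described; they do not describe any existing management system.

```python
from dataclasses import dataclass


@dataclass
class DomainInstance:
    """One instance of a management system; it sees only its own domain."""
    domain: str
    latency_threshold_ms: float  # assumed alarm threshold

    def intra_domain_alarms(self, measurements):
        # Only measurements taken inside this instance's domain are visible.
        visible = [m for m in measurements if m["domain"] == self.domain]
        return [
            f"{self.domain}: high latency at {m['resource']}"
            for m in visible
            if m["latency_ms"] > self.latency_threshold_ms
        ]


# Assumed scenario: a router in the first domain is degrading. Traffic within
# the first domain is unaffected, but a resource in the second domain that
# depends on that router appears sluggish.
measurements = [
    {"domain": "first", "resource": "host-a1", "latency_ms": 12.0},
    {"domain": "second", "resource": "server-b1", "latency_ms": 480.0},
]

first = DomainInstance("first", latency_threshold_ms=100.0)
second = DomainInstance("second", latency_threshold_ms=100.0)

first_alarms = first.intra_domain_alarms(measurements)
second_alarms = second.intra_domain_alarms(measurements)
```

Under these assumptions, only the second instance raises an alarm, and nothing in either instance's view identifies the router in the first domain as the cause.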