Network management may be conducted at different levels in various types of networks to avoid network failures and to assure network performance. In a communication network, an element management system (EMS) may be used to supervise and manage network elements within a network. A communication network may also include a network management system (NMS) to manage the overall network by communicating with several EMSs.
In an optical communication system such as a wavelength division multiplexed (WDM) system, for example, terminal or cable stations may be interconnected by cable segments to form a network. The network elements in an optical communication system may include equipment located at a cable station (e.g., terminal equipment and power feed equipment) as well as equipment connected to the cable station (e.g., repeaters and equalizers). In such a system, an EMS may be located at a cable station (or at a separate location) and used to manage the network elements associated with this cable station. The EMS may include one or more servers for performing the element management functions and one or more workstations for providing a user interface (e.g., to display the information associated with the network elements managed by the EMS). An NMS may be located at one of the cable stations or at a separate location for managing the overall optical communication system or network.
The management of a network may include configuration management, fault management and performance management. An EMS may provide fault management by retrieving, storing and/or displaying alarm, event and system messages forwarded by the network elements managed by the EMS. An EMS may provide performance management by retrieving, storing, displaying and/or measuring transmission quality data. An NMS may provide fault management and performance management for the entire network by managing all of the alarm, event and system messages and the transmission quality data forwarded by each EMS. The NMS may display fault and performance information received from each EMS, e.g., on a network topological map.
One type of information that may be displayed by an NMS is the network alarm status as managed by the underlying EMSs, as shown for example in FIG. 1. A user (e.g., a network administrator or operator) may monitor the displayed information to determine if the network alarms indicate failures in the network, which may cause network outages. Alarm summary information may indicate the level of alarm (e.g., major, minor, none, unavailable/not reporting) and the counts of major and minor alarms.
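For illustration only, the alarm summary information described above might be represented as in the following minimal Python sketch. The names `AlarmLevel`, `AlarmSummary`, `ems_name`, and `reporting` are hypothetical, as is the rule that the displayed level is derived from the alarm counts; none of these is specified by the system described here.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class AlarmLevel(IntEnum):
    """Alarm levels named in the text; the numeric ordering is an assumption."""
    NONE = 0
    MINOR = 1
    MAJOR = 2


@dataclass(frozen=True)
class AlarmSummary:
    """Hypothetical per-EMS alarm summary record."""
    ems_name: str
    major_count: int = 0
    minor_count: int = 0
    reporting: bool = True  # False models "unavailable/not reporting"

    @property
    def level(self) -> Optional[AlarmLevel]:
        """Highest alarm level present, or None if the EMS is not reporting."""
        if not self.reporting:
            return None
        if self.major_count > 0:
            return AlarmLevel.MAJOR
        if self.minor_count > 0:
            return AlarmLevel.MINOR
        return AlarmLevel.NONE

    def label(self) -> str:
        """Text that a display such as that of FIG. 1 might show (assumed)."""
        level = self.level
        if level is None:
            return "unavailable/not reporting"
        return level.name.lower()
```

Under these assumptions, an EMS with one major alarm and three minor alarms would be labeled "major", while an EMS that has stopped reporting would be labeled "unavailable/not reporting" regardless of its last known counts.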
As shown in FIG. 2, alarm status information may be communicated between each EMS server 20 and an NMS 22 using a hierarchical approach. According to one implementation, one or more computers at the NMS may be configured as one or more servers (e.g., a single server or redundant servers) that receive information from EMS servers 20. The NMS may then display the alarm summary information for every EMS in the network (e.g., as shown in FIG. 1).
According to another possible implementation, an NMS may be formed without a physical NMS server or layer by distributing the NMS functionality to the EMS servers (i.e., a mini-NMS feature built into each EMS). With a distributed NMS that does not have an NMS layer, however, it is still desirable to provide a summary view of the status of the complete network. To accomplish this, each EMS may communicate with a single “master” server by presenting the highest level alarm status to the “master” server. In turn, the “master” server may provide to each EMS server a consolidated view of the alarm status for all of the EMS servers throughout the network. The alarm summary information of every EMS in the network (e.g., as shown in FIG. 1) may then be displayed on the EMS workstations. Thus, this distributed NMS approach is also hierarchical, i.e., with a master EMS server instead of an NMS server.
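The master-server exchange described above may be sketched, under stated assumptions, as follows. The class `MasterServer`, the function `highest_level`, and the use of plain strings for alarm levels are hypothetical choices made for illustration; network transport is omitted, and a direct method call stands in for the message sent by each EMS server.

```python
from typing import Dict, Iterable


def highest_level(major_count: int, minor_count: int) -> str:
    """Highest alarm level an EMS would present to the master (assumed rule)."""
    if major_count > 0:
        return "major"
    if minor_count > 0:
        return "minor"
    return "none"


class MasterServer:
    """Hypothetical master EMS server holding the consolidated alarm view."""

    def __init__(self, known_ems: Iterable[str]) -> None:
        # Every known EMS starts as not reporting until it presents a status.
        self._view: Dict[str, str] = {
            name: "unavailable/not reporting" for name in known_ems
        }

    def report(self, ems_name: str, level: str) -> Dict[str, str]:
        """Record one EMS's highest alarm level and return the consolidated view.

        A copy is returned so that callers cannot alter the master's state.
        """
        self._view[ems_name] = level
        return dict(self._view)
```

In this sketch, an EMS server named "ems-a" with two major alarms would call `master.report("ems-a", highest_level(2, 0))` and receive a view in which "ems-a" is "major" and every EMS that has not yet reported remains "unavailable/not reporting". Because all state resides in the single `MasterServer` object, the sketch also makes the dependence on the master server apparent.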
System operation in a hierarchical approach is heavily dependent upon the NMS server or the master server, which bears the brunt of the processing and may be a single point of failure. If the NMS server or the master server fails, or if there is a network fiber break, the alarm and status sharing feature may fail. Also, the simple TCP/IP client/server-based communication model available for distributed NMS systems may be inefficient and may require significant processing and transmission resources.