Network management may be conducted at different levels in various types of networks to avoid network failures and to assure network performance. In a communication network, an element management system (EMS) may be used to supervise and manage network elements within a network. A communication network may also include a network management system (NMS) to manage the overall network by communicating with several EMSs, which manage smaller domains of the network.
In an optical communication system, for example, terminal or cable stations may be interconnected by cable segments to form a network. The network elements in an optical communication system may include equipment located at a cable station (e.g., terminal equipment and power feed equipment) as well as equipment connected to the cable station (e.g., repeaters and equalizers). In such a system, an EMS may be located at a cable station (or at a separate location) and used to manage the network elements associated with this cable station. The EMS may include one or more servers for performing the management functions and one or more workstations for providing a user interface (e.g., to display the information associated with the network elements managed by the EMS). An NMS may be located at one of the cable stations or at a separate location for managing the overall optical communication system or network.
The management of a network may include configuration management, fault management and performance management. An EMS can provide fault management by retrieving, storing and/or displaying alarm, event and system messages forwarded by the network elements managed by the EMS. An EMS can provide performance management by retrieving, storing, displaying and/or measuring transmission quality data. An NMS can provide fault management and performance management for the entire network by managing all of the alarm, event and system messages and the transmission quality data forwarded by each EMS. The NMS may display fault and performance information received from each EMS on a network topological map.
One type of information that may be displayed by an NMS is the network alarm status as managed by the underlying EMSs, as shown in FIG. 1. A user (e.g., a network administrator or operator) may monitor the displayed information to determine if the network alarms indicate failures in a network, which may cause network outages. Alarm summary information may indicate the level of alarm (e.g., major, minor, none, unavailable/not reporting), and the alarm count of major and minor alarms.
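By way of illustration only, the alarm summary information described above might be represented as in the following Python sketch. The names AlarmLevel, AlarmSummary and summarize, and the representation of active alarms as a list of severity strings, are assumptions made for this example and are not taken from any particular EMS or NMS product.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class AlarmLevel(Enum):
    """Alarm levels named in the text (hypothetical encoding)."""
    UNAVAILABLE = "unavailable/not reporting"
    NONE = "none"
    MINOR = "minor"
    MAJOR = "major"


@dataclass(frozen=True)
class AlarmSummary:
    """Summary for one EMS: overall level plus major and minor counts."""
    ems_id: str
    level: AlarmLevel
    major_count: int = 0
    minor_count: int = 0


def summarize(ems_id: str, active_alarms: Optional[Iterable[str]]) -> AlarmSummary:
    """Build a summary from severity strings ("major" or "minor").

    Passing None models an EMS that is not reporting (an assumption of
    this sketch).
    """
    if active_alarms is None:
        return AlarmSummary(ems_id, AlarmLevel.UNAVAILABLE)
    alarms = list(active_alarms)
    major = alarms.count("major")
    minor = alarms.count("minor")
    if major:
        level = AlarmLevel.MAJOR
    elif minor:
        level = AlarmLevel.MINOR
    else:
        level = AlarmLevel.NONE
    return AlarmSummary(ems_id, level, major, minor)
```

In this sketch the overall level is the most severe level present among the active alarms, and the counts are reported alongside it so that a display can show both.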
As shown in FIG. 2, alarm status information may be communicated between each EMS server 10 and an NMS 12 using a hierarchical approach. According to one implementation, one or more computers at the NMS may be configured as one or more servers (e.g., a single server or redundant servers) that receive information from EMS servers 10. The NMS may then display the alarm summary information for every EMS in the network (e.g., as shown in FIG. 1).
According to another possible implementation, an NMS may be formed without a physical NMS server or layer by distributing the NMS functionality to the EMS servers (i.e., a mini-NMS feature built into each EMS). With a distributed NMS that does not have an NMS layer, however, it is still desirable to provide a summary view of the status of the complete network. To accomplish this, each EMS may communicate with a single “master” server by presenting the highest level alarm status to the “master” server. In turn, the “master” server provides to each EMS server a consolidated view of the alarm status for all of the EMS servers throughout the network. The alarm summary information of every EMS in the network (e.g., as shown in FIG. 1) may then be displayed on the EMS workstations. Thus, this distributed NMS approach also uses a hierarchical approach, i.e., with a master EMS server instead of an NMS server.
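For illustration only, the exchange between the EMS servers and the “master” server described above might be sketched in Python as follows. The class names MasterServer and EmsServer, their methods, and the use of in-process method calls in place of network communication are all assumptions made for this example.

```python
from typing import Dict


class MasterServer:
    """Hypothetical master server holding the latest status per EMS."""

    def __init__(self) -> None:
        self._status: Dict[str, str] = {}

    def report(self, ems_id: str, highest_level: str) -> None:
        # Each EMS presents only its highest-level alarm status.
        self._status[ems_id] = highest_level

    def consolidated_view(self) -> Dict[str, str]:
        # Return a copy so that callers cannot alter the master's state.
        return dict(self._status)


class EmsServer:
    """Hypothetical EMS server that reports to, and reads from, the master."""

    def __init__(self, ems_id: str, master: MasterServer) -> None:
        self.ems_id = ems_id
        self._master = master

    def publish(self, highest_level: str) -> None:
        self._master.report(self.ems_id, highest_level)

    def network_view(self) -> Dict[str, str]:
        # Every EMS obtains the same network-wide view from the master.
        return self._master.consolidated_view()


master = MasterServer()
ems_a = EmsServer("ems-a", master)
ems_b = EmsServer("ems-b", master)
ems_a.publish("major")
ems_b.publish("none")
view = ems_b.network_view()
```

In this sketch every report and every view request passes through the single MasterServer object, so each EMS server depends on that one object for its view of the network.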
Although the hierarchical approach to communicating alarm status data may work for small systems with simple data communication networks (i.e., small numbers of EMS servers), performance and reliability may be compromised in larger systems, for example, when the number of EMS servers approaches that found in undersea optical communication systems. The simple TCP/IP client/server based communication model available for distributed NMS systems can be inefficient and may require significant processing and transmission resources. System operation is also heavily dependent upon the NMS server or the master server, which bears the brunt of processing and may be a single point of failure. If the NMS server or the master server fails, the alarm and status sharing feature may fail.
Accordingly, there is a need for a distributed messaging system and method that enables sharing of network status data between servers, such as EMS servers, in a manner that is relatively simple and reliable.