1. Field of the Invention
The present invention generally relates to a technology for sending a notification to a network management device when a failure is detected in a network. More particularly, the present invention relates to preventing overloading of the network management device with regard to failure monitoring.
2. Description of the Related Art
Existing failure monitoring systems employ a network management device for managing a failure that occurs in any information processing device in a network. In such a failure monitoring system, the network management device receives a failure notification from an information processing device in which a failure is detected, and outputs the failure notification to a monitor, etc. to inform a network administrator of the failure.
Generally, a failure in one information processing device leads to failures in other information processing devices in the same network. Therefore, several failure notifications may be output to the monitor due to the same failure, which makes it difficult to pinpoint the information processing device in which the primary failure occurred.
A conventional technology to solve this problem is disclosed in, for example, Japanese Patent Laid-Open Publication No. 2003-152722. According to the conventional technology, a master-slave relationship is established among information processing devices. When receiving failure notifications from both master and slave information processing devices due to the same failure, the network management device does not allow the failure notification from the slave information processing device to be output to the monitor so that only the failure notification from the master information processing device is displayed.
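The suppression rule described above can be illustrated with a minimal sketch. It is not taken from the cited publication. The function name `filter_notifications`, the `master_of` table, and the device labels are hypothetical, and the sketch assumes that all notifications in one batch stem from the same failure.

```python
def filter_notifications(notifications, master_of):
    """Return the notifications to display on the monitor.

    A notification from a slave device is hidden when its master device
    has also reported in the same batch (assumed: same underlying failure).
    """
    reporting = {n["device"] for n in notifications}
    shown = []
    for n in notifications:
        master = master_of.get(n["device"])  # None for a device with no master
        if master is not None and master in reporting:
            continue  # the master also reported, so suppress the slave
        shown.append(n)
    return shown


# Hypothetical master-slave relationship: A is the master of B and C.
master_of = {"B": "A", "C": "A"}
batch = [{"device": "A"}, {"device": "B"}, {"device": "C"}]
displayed = filter_notifications(batch, master_of)
```

In this sketch a slave notification is still displayed when its master has not reported, so a failure confined to a slave device is not lost.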
Similarly, in, for example, a Wavelength Division Multiplexing (WDM) device used in an optical network, a plurality of information processing devices (hereinafter, “agent device”), each with a central processing unit (CPU), are interconnected by an internal Local Area Network (LAN). The agent devices send a failure notification (hereinafter, “alarm notification”) to an internal LAN managing device (hereinafter, “manager device”) connected to the internal LAN.
The manager device determines which of the alarm notifications needs to be reported as a failure notification based on an alarm mask condition maintained beforehand (this process is hereinafter referred to as “alarm masking”), and sends the failure notification to the monitor. The term “alarm mask condition” as used herein refers to a prerequisite for identifying the source of failure based on correlation of the alarm notifications.
FIG. 12 is a schematic for explaining failure monitoring in a conventional WDM device. A WDM device 10 is connected to a monitor 20 via an external LAN 30, to other optical transmission devices by an optical network 40, and to other WDM devices by a WDM network 50. The WDM device 10 includes a manager device 11, and agent devices 12 to 14 connected via an internal LAN 15.
The agent devices 12 to 14 are connected by an optical fiber cable 16. Each of the agent devices 12 to 14 includes a CPU and is capable of operating autonomously. Each of the agent devices 12 and 13 has installed thereon an optical amplification/dispersion compensation package, a DEMUX/MUX package for demultiplexing/multiplexing optical signals, and an optical switch package. The agent device 14 has installed thereon a transponder package that performs wavelength conversion of optical signals input to and output from the WDM device 10. The WDM device 10 functions as a single optical transmission device due to the autonomous execution of the various program packages by the agent devices 12 to 14.
Each of the agent devices 12 to 14 periodically monitors itself to check for any failure, and if a failure is detected, sends an alarm notification to the manager device 11. The manager device 11 collects alarm notifications sent from the agent devices 12 to 14, and sends a failure notification to the monitor 20 after alarm masking. For example, as shown in FIG. 12, when a failure occurs in the agent device 14 (primary alarm) and this leads to failure in the agent devices 12 and 13 (secondary alarm), the manager device 11 receives an alarm notification from each of the agent devices 12 to 14. After performing alarm masking based on the alarm mask condition maintained beforehand, the manager device 11 sends a failure notification (in this example, a notification of the failure of the source of failure, agent device 14) to the monitor 20.
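The alarm masking in this example can be sketched as follows. The sketch is only an illustration under stated assumptions. It represents the alarm mask condition as a table that maps a primary alarm source to the secondary sources it masks, and the names `MASK_CONDITION` and `apply_alarm_masking` are hypothetical. The actual form of the alarm mask condition is not specified here.

```python
# Hypothetical alarm mask condition: an alarm from agent device 14
# masks the alarms it induces in agent devices 12 and 13.
MASK_CONDITION = {14: {12, 13}}


def apply_alarm_masking(alarm_sources, mask_condition):
    """Return the agent devices whose alarms remain after masking."""
    masked = set()
    for source in alarm_sources:
        # Each reporting source masks the secondary sources listed for it.
        masked |= mask_condition.get(source, set())
    return sorted(set(alarm_sources) - masked)


# The manager device has collected alarm notifications from all three agents.
reported = apply_alarm_masking([12, 13, 14], MASK_CONDITION)
```

With alarms from all three agent devices, only agent device 14 remains in `reported`. An alarm from agent device 12 alone is not masked, because agent device 14 has not reported.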
However, in the conventional failure monitoring system described above, the manager device (network management device) collects all alarm notifications (failure notifications) issued by the agent devices (information processing devices) in the network, and performs alarm masking. Consequently, the load of failure monitoring is concentrated on the manager device.
Further, when the packages installed on the agent devices are upgraded, or when a Label Switched Path (LSP) is dynamically switched by a Label Switch Router (LSR), as in Multi-Protocol Label Switching (MPLS) or Generalized Multi-Protocol Label Switching (GMPLS), it is necessary to change the alarm mask condition maintained beforehand by the manager device. In addition, packages on the manager device need to be upgraded. During the upgrading process, the manager device is disconnected from the monitor, and the monitoring of the network is temporarily interrupted.