1. Field of the Invention
The present invention relates to a network management system, and in particular to a network management system performing a fault management process in a hierarchical network.
2. Description of the Related Art
In a prior art hierarchical (layered) network management system, the fault restoration process is in many cases performed within each layer in isolation. In particular, in Internet Protocol (IP) networks, which have come into increasingly wide use in recent years, independent fault management in each layer is the generally known method, owing to the historical background that the management entities, or managers, for the IP layer and for the layer below it are different.
FIG. 11 shows a schematic diagram of the prior art network management system. ATM switches SW1-SW3 composing a lower layer L2 of a hierarchical network NW, which is an IP network, are respectively connected to routers RT1-RT3 composing an upper layer L3.
The ATM switches SW1-SW3 are mutually connected with information lines for passing user data. Apart from these information lines, the ATM switches SW1-SW3 are connected to an L2 fault manager 200 with control lines (indicated by dotted lines) for notifying fault information.
Also, a computer 10 is connected to the router RT1, computers 21 and 22 are connected to the router RT2, and a computer 30 and an L3 fault manager 100 are connected to the router RT3.
In FIG. 11, since there is no cooperative function between the L3 fault manager 100 and the L2 fault manager 200, fault management is performed independently in each layer.
Generally, in a fault management method of an IP layer (upper layer), connection confirmation data packets are periodically exchanged between packet switching nodes over the network. If the connection confirmation data packets from another node are not received within a fixed number of trials, it is determined that there is a fault at the other node or on the link toward the other node. The fault is dealt with by selecting another route (next node) for transmitting the data packets.
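The detection rule described above can be sketched as follows. This is a minimal illustration only, not the method of any particular router: the class name `NeighborMonitor`, the function `select_next_node`, and the threshold of three missed packets are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class NeighborMonitor:
    """Tracks connection confirmation packets expected from one neighbor node."""

    neighbor: str
    max_missed: int = 3  # hypothetical "fixed number of trials"
    missed: int = 0
    faulty: bool = False

    def on_interval(self, packet_received: bool) -> bool:
        """Call once per confirmation interval; returns True while a fault is declared."""
        if packet_received:
            self.missed = 0
            self.faulty = False
        else:
            self.missed += 1
            if self.missed >= self.max_missed:
                self.faulty = True
        return self.faulty


def select_next_node(candidates: List[str],
                     monitors: Dict[str, NeighborMonitor]) -> Optional[str]:
    """Return the first candidate next node that is not considered faulty."""
    for node in candidates:
        if not monitors[node].faulty:
            return node
    return None


# Router RT1's view: RT2 stops answering, RT3 keeps answering.
monitors = {"RT2": NeighborMonitor("RT2"), "RT3": NeighborMonitor("RT3")}
for _ in range(3):
    monitors["RT2"].on_interval(False)
    monitors["RT3"].on_interval(True)

next_node = select_next_node(["RT2", "RT3"], monitors)
```

Because a fault is declared only after several consecutive intervals pass without a packet, detection by this method necessarily lags the actual occurrence of the fault.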
Moreover, when the packet switching nodes transmit packets according to a quality policy such as a priority, the fault manager of the IP layer, upon receiving fault information from a certain packet switching node, extracts the packet switching nodes on an alternate route (hereinafter referred to as alternate nodes), and sets the priority again in each alternate node to realize an end-to-end priority process.
This operation will be specifically described referring to FIG. 12.
FIG. 12 shows fault management in the upper layer L3 as an example of the above-mentioned independent layer fault management. In FIG. 12, the routers RT1-RT3 are the packet switching nodes composing the upper layer L3 of the network NW. The L3 fault manager 100, which performs the fault management of this upper layer L3, is composed of a fault detector 101, a node setting portion 102, a fault process determining portion 103, an L3 connecting information storage portion 104, and a priority information storage portion 105.
The routers RT1-RT3 periodically exchange connection confirmation data packets with one another. For example, if a fault occurs between the routers RT1 and RT2 as shown in FIG. 12, the following processes (1)-(3) are performed:
(1) Since the exchange of the connection confirmation data packets between the routers RT1 and RT2 is disabled, the router RT1 or RT2 detects the fault (it is assumed in the following description that the router RT1 detects the occurrence of the fault). Simultaneously with the fault detection, the router RT1 starts routing the data over the alternate route to the router RT3.
(2) The router RT1 sends a trap (i.e. fault information) to the L3 fault manager 100.
(3) In the L3 fault manager 100, the following processes are performed:
(3-1) The fault notification is received from the router RT1 at the fault detector 101. Based on the contents of the fault notification, the fault process determining portion 103 refers to the L3 connecting information storage portion 104 to extract the node RT3 on the alternate route.
(3-2) The fault process determining portion 103 compares the settings of the routers RT1 and RT3, and determines that a quality policy is unset in the router RT3.
(3-3) The fault process determining portion 103 extracts priority information required to be set in the router RT3 from the priority information storage portion 105, and instructs the node setting portion 102 to set the priority information in the router RT3.
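Processes (3-1) to (3-3) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the actual implementation of the L3 fault manager 100: the dictionaries standing in for the storage portions 104 and 105, the pre-resolved alternate node table, and the function `handle_trap` are hypothetical, and the priority entries follow the example of the computers 10, 21 and 22 used in this description.

```python
from typing import Dict, List, Tuple

Flow = Tuple[str, str]  # (source, destination)

# Hypothetical stand-in for the L3 connecting information storage portion 104,
# simplified to a table already resolved to alternate nodes per failed port.
ALTERNATE_NODES: Dict[Tuple[str, str], List[str]] = {
    ("RT1", "port1"): ["RT3"],
}

# Hypothetical stand-in for the priority information storage portion 105.
PRIORITY_INFO: Dict[Flow, str] = {
    ("computer 10", "computer 21"): "high",
    ("computer 10", "computer 22"): "low",
}

# Settings currently held by each router; RT3 has no quality policy yet.
node_settings: Dict[str, Dict[Flow, str]] = {
    "RT1": dict(PRIORITY_INFO),
    "RT2": dict(PRIORITY_INFO),
    "RT3": {},
}


def handle_trap(node: str, port: str) -> List[str]:
    """Process a trap from `node` reporting a fault at `port`.

    Returns the alternate nodes whose settings had to be updated.
    """
    updated = []
    # (3-1) Extract the nodes on the alternate route.
    for alt in ALTERNATE_NODES.get((node, port), []):
        # (3-2) Compare the settings of the reporting node and the alternate node.
        if node_settings[alt] != node_settings[node]:
            # (3-3) Set the stored priority information in the alternate node.
            node_settings[alt] = dict(PRIORITY_INFO)
            updated.append(alt)
    return updated


updated = handle_trap("RT1", "port1")
```

In this sketch the setting of the router RT3 happens only after the trap arrives, which mirrors the ordering of processes (1) to (3).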
In order to describe the contents of the L3 connecting information storage portion 104, a connection state of the upper layer L3 in the network NW of FIG. 11 is shown in FIG. 13, in which the routers RT1-RT3 are mutually connected with virtual links VL1-VL3.
In the L3 connecting information storage portion 104, the connection (virtual link) between the routers RT1 and RT2 is stored, in the form of data, as the virtual link VL1 from a port 1 of the router RT1 (RT1/port1) to a port 1 of the router RT2 (RT2/port1).
In the same way, the virtual links VL2 and VL3 are respectively stored as the virtual link from a port 2 of the router RT2 (RT2/port2) to a port 1 of the router RT3 (RT3/port1), and the virtual link from a port 2 of the router RT1 (RT1/port2) to a port 2 of the router RT3 (RT3/port2).
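The stored form of the virtual links VL1-VL3 can be pictured as a small table of records. The sketch below is an assumption made for illustration: the record type `VirtualLink`, the helper names, and the rule that an alternate node is any node adjacent to both ends of the failed virtual link are not taken from the L3 connecting information storage portion 104 itself.

```python
from typing import List, NamedTuple, Optional, Set


class VirtualLink(NamedTuple):
    name: str
    a_node: str
    a_port: str
    b_node: str
    b_port: str


# Contents as described for FIG. 13.
L3_CONNECTING_INFO = [
    VirtualLink("VL1", "RT1", "port1", "RT2", "port1"),
    VirtualLink("VL2", "RT2", "port2", "RT3", "port1"),
    VirtualLink("VL3", "RT1", "port2", "RT3", "port2"),
]


def find_link(node: str, port: str) -> Optional[VirtualLink]:
    """Look up the virtual link terminating at node/port."""
    for link in L3_CONNECTING_INFO:
        if (node, port) in ((link.a_node, link.a_port), (link.b_node, link.b_port)):
            return link
    return None


def neighbors(node: str, excluded: VirtualLink) -> Set[str]:
    """Nodes reachable from `node` over any virtual link except `excluded`."""
    result = set()
    for link in L3_CONNECTING_INFO:
        if link.name == excluded.name:
            continue
        if link.a_node == node:
            result.add(link.b_node)
        elif link.b_node == node:
            result.add(link.a_node)
    return result


def alternate_nodes(failed: VirtualLink) -> List[str]:
    """Assumed rule: nodes adjacent to both ends of the failed link."""
    return sorted(neighbors(failed.a_node, failed) & neighbors(failed.b_node, failed))


failed_link = find_link("RT1", "port1")
alternates = alternate_nodes(failed_link)
```

With the three virtual links above, a fault reported at RT1/port1 resolves to the virtual link VL1, and the router RT3 is the only node adjacent to both the routers RT1 and RT2.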
It is to be noted that the L2 fault manager 200 also performs management similar to that of the L3 fault manager 100, and includes an L2 connecting information storage portion 204 corresponding to the L3 connecting information storage portion 104. However, since the priority setting is performed only in the upper layer L3, the L2 fault manager 200 is not provided with a portion corresponding to the priority information storage portion 105.
This will be described referring to FIG. 14 which shows only the lower layer L2 in the network NW of FIG. 11. It is to be noted that in FIG. 14, the ATM switches SW1-SW3 are mutually connected with links LK1-LK3.
The L2 connecting information storage portion 204 stores the connection (link) between the ATM switches SW1 and SW2 as the link LK1 from a port 1 of the ATM switch SW1 (SW1/port1) to a port 1 of the ATM switch SW2 (SW2/port1).
In the same way, the links LK2 and LK3 are respectively stored as the link from a port 2 of the ATM switch SW2 (SW2/port2) to a port 1 of the ATM switch SW3 (SW3/port1), and the link from a port 2 of the ATM switch SW1 (SW1/port2) to a port 2 of the ATM switch SW3 (SW3/port2).
It is to be noted that the L2 fault manager 200 differs from the L3 fault manager 100 in the manner of connection. The L3 fault manager 100 is connected to the routers RT1-RT3 with the information lines for passing the user data, whereas the L2 fault manager 200 is connected to the ATM switches SW1-SW3 with the control lines (indicated by dotted lines), which are separate from the information lines. Therefore, a fault of an ATM switch itself and a link fault can be managed separately.
FIG. 15 shows the upper layer L3 of the network of FIG. 11 in the state before the occurrence of a fault. In this case, it is assumed that the priority information as the quality policy is set in the routers RT1 and RT2 so that data from the computer 10 addressed to the computer 21 are transmitted with a high priority while data from the computer 10 addressed to the computer 22 are transmitted with a low priority. However, this quality policy (priority information) is not set in the router RT3.
A case where a fault occurs on the link LK1 in FIGS. 11 and 14 will now be considered. The link LK1 provides the physical link between the ATM switches SW1 and SW2, which are respectively connected to the routers RT1 and RT2. The occurrence of a fault on this link LK1 therefore leads to a fault of the virtual link VL1 shown in FIG. 15.
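The dependency of the virtual link VL1 on the link LK1 can be made explicit with a small lookup. The sketch below is purely illustrative: neither fault manager of FIG. 11 is described as holding such a cross-layer table, and the code assumes that each virtual link rides directly on the single physical link between the switches attached to its two routers. The table and function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Switch-to-router attachment as in FIG. 11.
ATTACHED_ROUTER: Dict[str, str] = {"SW1": "RT1", "SW2": "RT2", "SW3": "RT3"}

# Lower layer L2 links (FIG. 14) and upper layer L3 virtual links (FIG. 13).
L2_LINKS: Dict[str, Tuple[str, str]] = {
    "LK1": ("SW1", "SW2"),
    "LK2": ("SW2", "SW3"),
    "LK3": ("SW1", "SW3"),
}
L3_VIRTUAL_LINKS: Dict[str, Tuple[str, str]] = {
    "VL1": ("RT1", "RT2"),
    "VL2": ("RT2", "RT3"),
    "VL3": ("RT1", "RT3"),
}


def affected_virtual_links(failed_link: str) -> List[str]:
    """Virtual links whose end routers sit on the two switches of the failed link."""
    routers = frozenset(ATTACHED_ROUTER[sw] for sw in L2_LINKS[failed_link])
    return [
        name
        for name, ends in L3_VIRTUAL_LINKS.items()
        if frozenset(ends) == routers
    ]


affected = affected_virtual_links("LK1")
```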
The fault of the link LK1 is immediately notified to the L2 fault manager 200. However, since there is no cooperative function between the L2 fault manager 200 and the L3 fault manager 100, the L3 fault manager 100 cannot detect the fault of the virtual link VL1 until the fault notification is received from the router RT1 or RT2 by the above-mentioned general fault management method in the IP network. Therefore, it takes time from the occurrence of the fault on the link LK1 to the fault detection by the L3 fault manager 100.
This will be described referring to FIG. 16.
FIG. 16 shows the network state when a fault occurs after the state of FIG. 15. The router RT1 starts alternate routing to the router RT3 simultaneously with the fault detection (see FIG. 16(1)), and notifies the L3 fault manager 100 that a fault has occurred at RT1/port1 (see FIG. 16(2)).
However, since the quality policy is not set in the router RT3, it is not possible to provide the service according to the determined quality policy for the data packets passing through the router RT3 until the L3 fault manager 100 re-sets the quality policy in the router RT3 (see FIG. 16(3)).
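The gap in end-to-end priority handling can be illustrated with a short check over the routers on a path. This is a sketch under assumptions: the function `policy_honored`, the path lists, and the rule that a policy is honored only when every router on the path holds the matching priority entry are introduced here for illustration.

```python
from typing import Dict, List, Tuple

Flow = Tuple[str, str]  # (source, destination)

# Quality policy of FIG. 15.
POLICY: Dict[Flow, str] = {
    ("computer 10", "computer 21"): "high",
    ("computer 10", "computer 22"): "low",
}

# The policy is set in RT1 and RT2 but not in RT3.
router_policy: Dict[str, Dict[Flow, str]] = {
    "RT1": dict(POLICY),
    "RT2": dict(POLICY),
    "RT3": {},
}


def policy_honored(path: List[str], flow: Flow) -> bool:
    """True only if every router on the path holds the flow's priority entry."""
    return all(router_policy[r].get(flow) == POLICY[flow] for r in path)


flow = ("computer 10", "computer 21")

before_fault = policy_honored(["RT1", "RT2"], flow)          # direct route over VL1
during_detour = policy_honored(["RT1", "RT3", "RT2"], flow)  # alternate route via RT3

# The L3 fault manager 100 re-sets the quality policy in RT3 (FIG. 16(3)).
router_policy["RT3"] = dict(POLICY)
after_reset = policy_honored(["RT1", "RT3", "RT2"], flow)
```

Between the start of alternate routing and the re-setting of the router RT3, the check fails for the detour path, which corresponds to the interval in which the determined quality policy cannot be provided.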
As a measure to observe the quality policy, the router RT1 could be made to store (buffer) the data packets until the quality policy setting in the router RT3 is completed. However, since the disconnected time is prolonged in this case, deterioration of the communication quality over the entire network cannot be avoided.