1. Field of the Invention
The present invention relates to fault detection devices, and more particularly, to a fault detection device for detecting faults on a network.
2. Description of the Related Art
In recent years, telecommunications carriers have been providing wide area Ethernet (“Ethernet” is a registered trademark) as one of their carrier services using LANs (Local Area Networks), and the service is becoming increasingly widespread. Wide area Ethernet is a service whereby a plurality of Ethernet LAN environments are interconnected by Layer 2 switches to be integrated into a single network.
Wide area Ethernet does not require an expensive router (Layer 3 switch) but uses an inexpensive switching hub (Layer 2 switch), and accordingly, the costs involved in the configuration and operation management of networks can be cut down. It is possible, for example, to connect corporate LANs so as to cover the entire area of a city.
If, in such carrier networks, a network fault occurs and recovery therefrom is delayed, serious damage is caused. Thus, carrier networks adopt a redundant configuration at various levels such as duplication of various packages in devices, duplication of devices per se, duplication of links, and duplication of end-to-end paths. When a fault is detected, switchover to a redundant system is promptly effected, thereby enhancing fault tolerance.
A redundant configuration is, however, meaningless unless a device itself can detect faults quickly and accurately. When a silent fault occurs (a fault that triggers neither automatic switchover to the redundant system nor an alarm to the operator, so that the operational anomaly is difficult to recognize), the fault persists for a long time, and in the case of an Ethernet network, a loop may form, possibly resulting in frame congestion.
FIGS. 11 and 12 illustrate a network fault caused by looping. A network 50 is constituted by nodes 51 to 54 connected together in the form of a ring, and the nodes 51 and 54 are connected to each other (in the figure, thin solid lines indicate physical links).
In Ethernet networks, paths are configured to have a tree structure so that no looped path exists. This is because, if a looped path exists in the network, frame congestion known as a broadcast storm occurs when a frame is broadcast.
Specifically, when a broadcast frame is sent out from a certain node, all ports of each node except the receive port are flooded with the broadcast frame. Thus, if a loop exists in the network, broadcast frames endlessly circulate through the same looped path.
If this occurs, the broadcast frames instantly consume the available bandwidth, making normal communications unavailable. For example, the network 50 shown in FIG. 11 has a loop R, and thus the broadcast frames endlessly circulate through the loop R along the path node 51→node 52→node 53→node 54→node 51→ . . . , causing congestion.
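The flooding behavior described above can be illustrated with a minimal simulation sketch. It assumes a plain four-node ring (nodes 51 to 54) rather than the exact topology of FIG. 11, and the function name `flood` and the `max_hops` cutoff are hypothetical: Layer 2 frames carry no hop limit, so the cutoff exists only to stop the simulation.

```python
from collections import deque

def flood(links, source, max_hops):
    """Count frame transmissions when `source` broadcasts one frame.

    Each node forwards a received frame on every link except the one it
    arrived on. `max_hops` is only a simulation cutoff (an assumption of
    this sketch); a real Layer 2 network has no such limit.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    transmissions = 0
    # Each queue entry is one frame in flight: (sender, receiver, hop count).
    queue = deque((source, nxt, 1) for nxt in neighbors[source])
    while queue:
        sender, receiver, hop = queue.popleft()
        transmissions += 1
        if hop == max_hops:
            continue  # stop simulating this copy of the frame
        for nxt in neighbors[receiver]:
            if nxt != sender:  # never send back out of the receive port
                queue.append((receiver, nxt, hop + 1))
    return transmissions

# Assumed topologies: a four-node ring, and the same ring with one link removed.
RING = [(51, 52), (52, 53), (53, 54), (54, 51)]
TREE = [(51, 52), (52, 53), (53, 54)]

ring_10 = flood(RING, 51, max_hops=10)
ring_100 = flood(RING, 51, max_hops=100)
tree_100 = flood(TREE, 51, max_hops=100)
```

In the ring, the transmission count keeps growing as the cutoff is raised, because the two copies of the frame never stop circulating. In the tree, the count stays fixed however large the cutoff is, because the frame dies out at the leaf node.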
When configuring an Ethernet network, therefore, it is necessary to logically block loops of the Layer 2 network. For example, if links L1 and L2 are logically blocked as shown in FIG. 12, all loops are eliminated, forming a tree structure T (in the figure, indicated by the thick solid line).
To form such a tree (called a spanning tree), control information called BPDUs (Bridge Protocol Data Units) is exchanged among the nodes under STP (Spanning Tree Protocol) control, for example, so that traffic paths are altered dynamically (in the case of static operation, the individual nodes are configured in advance so as to form tree paths). This prevents frames from circulating endlessly through a loop, even if the network has a physical loop.
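The effect of logically blocking links can be sketched as follows. This is not STP itself: there is no BPDU exchange, bridge ID comparison, or port cost, only a centralized breadth-first search that keeps one loop-free set of links and reports the rest as blocked. The topology (a four-node ring plus one extra link between nodes 51 and 53) and the function name `spanning_tree` are assumptions chosen so that two links end up blocked; they are not taken from FIG. 12.

```python
from collections import deque

def spanning_tree(links, root):
    """Return (active, blocked) link lists for a loop-free tree rooted at `root`."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    visited = {root}
    tree_links = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in visited:
                # First time this node is reached: keep the link that reached it.
                visited.add(nxt)
                tree_links.add(frozenset((node, nxt)))
                queue.append(nxt)
    active = [link for link in links if frozenset(link) in tree_links]
    blocked = [link for link in links if frozenset(link) not in tree_links]
    return active, blocked

# Assumed topology: ring 51-52-53-54-51 plus a hypothetical link 51-53.
LINKS = [(51, 52), (52, 53), (53, 54), (54, 51), (51, 53)]
active, blocked = spanning_tree(LINKS, root=51)
```

With four nodes, the tree keeps three links and the remaining two are blocked, which leaves every node reachable over exactly one path.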
However, even with the spanning tree formed as shown in FIG. 12, if the fault detection function fails to effectively work due to a silent fault occurring in the network 50, consistency of routing information among the nodes is lost, destroying the tree structure and possibly creating a loop. To cope with a silent fault, therefore, it is important not only to employ redundant configuration but to detect faults with high accuracy.
As conventional techniques for Ethernet fault detection, a technique of conducting a loopback test by means of a device within a LAN has been proposed (e.g., Japanese Unexamined Patent Publication No. 2003-304264 (paragraph nos. [0020] to [0027], FIG. 1)).
In cases where a silent fault as mentioned above has occurred, switchover from the operational system to the redundant system is not effected until customers' complaints about the service are received. Thus, since the service is disrupted for a long period of time, vendors have been making attempts to create their own vendor-specific protocols for detecting network faults.
FIG. 13 illustrates an exemplary procedure for detecting ordinary network faults by means of a vendor-specific protocol. Nodes A and B are connected to each other by links L3 and L4.
[S21] The node A transmits a frame Fa to the node B through the link L3. The frame Fa includes a local node identification (ID) field and a remote node identification field. When transmitting the frame Fa, the node A inserts “A”, which is indicative of itself, into the local node identification field and inserts “B”, which is described in the local node identification field of a frame Fb received via the link L4, into the remote node identification field as redirected information.
[S22] The node B transmits a frame Fb to the node A through the link L4. The frame Fb also includes a local node identification field and a remote node identification field. When transmitting the frame Fb, the node B inserts “B”, which is indicative of itself, into the local node identification field and inserts “A”, which is described in the local node identification field of the frame Fa received via the link L3, into the remote node identification field as redirected information.
[S23] On receiving the frame Fb, the node A stores, in its memory, the information “B” described in the local node identification field of the frame Fb (i.e., the information indicating that the remote node is the node B). It is assumed here that the information in the memory ages (the information in the memory is cleared and updated) in one minute and that the frame Fb is transmitted at intervals of 20 seconds.
[S24] On receiving the frame Fa, the node B stores, in its memory, the information “A” described in the local node identification field of the frame Fa (i.e., the information indicating that the remote node is the node A). Also in this case, the information in the memory ages in one minute and the frame Fa is transmitted at intervals of 20 seconds.
[S25] If the frame Fb fails to reach the node A three times consecutively, then the memory is cleared. In this case, the node A generates a frame Fa-1 having “0” inserted in the remote node identification field and transmits the generated frame to the node B. Also, since the frame Fb did not arrive three times consecutively, the node A judges that a fault has occurred in the link L4 (or the associated node B).
[S26] The node B receives the frame Fa-1 and recognizes that “0” has been inserted in the remote node identification field. Namely, the node B recognizes that its own node identifier (B) has not been conveyed to the node A and that the frame Fb which it transmitted is not being received normally by the node A, and thus concludes that a fault has occurred in the link L4 (or the associated node A).
In this manner, each transmitting/receiving node can detect a fault by detecting the “non-reception of the control frame from the associated device over a fixed period of time” and the “discrepancy between the information transmitted from the local device and the redirected information in the control frame received from the associated device.”
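Steps S21 to S26 can be sketched as a small round-based simulation. The 20-second interval and one-minute aging time come from the text, which together give the three-consecutive-miss limit. Everything else is an assumption made for illustration: the names `Frame`, `Node`, and `simulate`, the round-based timing, and the `was_echoed` flag, which keeps a node from reporting a fault during start-up before its peer has learned its identifier.

```python
from dataclasses import dataclass

SEND_INTERVAL_S = 20   # from the text: frames are sent every 20 seconds
AGING_TIME_S = 60      # from the text: stored information ages in one minute
MISS_LIMIT = AGING_TIME_S // SEND_INTERVAL_S  # three consecutive misses

@dataclass
class Frame:
    local_id: str    # sender's own identifier
    remote_id: str   # identifier learned from the peer, or "0" if none

class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.learned_peer = None   # memory holding the peer's identifier
        self.missed = 0            # consecutive intervals with no frame
        self.was_echoed = False    # assumption: start-up guard, see lead-in
        self.fault = None          # reason for the detected fault, if any

    def build_frame(self):
        return Frame(self.node_id, self.learned_peer or "0")

    def on_receive(self, frame):
        self.learned_peer = frame.local_id   # S23 / S24: store the peer ID
        self.missed = 0
        if frame.remote_id == self.node_id:
            self.was_echoed = True
        elif self.was_echoed:                # S26: redirected ID mismatch
            self.fault = "peer no longer echoes our identifier"

    def on_interval_without_frame(self):
        self.missed += 1
        if self.missed >= MISS_LIMIT:        # S25: memory is cleared
            self.learned_peer = None
            self.fault = "no frame received from peer"

def simulate(rounds, l4_fails_at):
    """Run `rounds` send intervals; link L4 (B to A) drops frames from
    round `l4_fails_at` onward, while link L3 (A to B) keeps working."""
    a, b = Node("A"), Node("B")
    for r in range(rounds):
        fa, fb = a.build_frame(), b.build_frame()
        b.on_receive(fa)
        if r < l4_fails_at:
            a.on_receive(fb)
        else:
            a.on_interval_without_frame()
    return a, b

node_a, node_b = simulate(rounds=10, l4_fails_at=3)
```

In this run, node A detects the fault through non-reception after three missed frames, and node B detects it one interval later, when the frame from A carries “0” in the remote node identification field.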
However, existing network fault detection techniques based on vendor-specific protocols presuppose that the local device is aware of what the associated device monitors, as in the above example, and therefore have the problem that they can be applied only to links whose two ends are connected to devices from the same vendor.
Thus, the existing techniques are not convenient enough, given that more and more carrier networks are configured in multi-vendor environments and that standardization of network fault detection protocols has yet to be achieved.
Accordingly, there has been a strong demand for highly fault-tolerant techniques which permit network faults to be detected in multi-vendor environments, without the need for interoperation between a local device and its associated device according to an identical protocol and without making the associated device aware of fault monitoring.