Networks provide increased computing power, sharing of resources, and communications between users. A network may include a number of computer devices within a room, building, or site that are interconnected by a high-speed local data link to form a local area network (LAN), such as a token ring network, Ethernet network, or the like. LANs in the same or different locations may be interconnected by different media and protocols, such as packet switching, microwave links, and satellite links, to form a wide area network (WAN). There may be several hundred or more interconnected devices in a network.
As a network becomes larger and more complex, issues arise as to the amount of traffic on the network, utilization of resources, security and the isolation of network faults. In U.S. Pat. No. 5,436,909, which issued to Roger Dev et al. on Jul. 25, 1995, and which is herein incorporated by reference in its entirety, a system for isolating network faults is disclosed. In the '909 patent, a network management system models network devices and relations between network devices. A contact status of each device is contained in a corresponding model. Each model receives status updates from and/or regularly polls the corresponding network device.
The '909 patent uses a technique known as "status suppression" in order to isolate network faults. When a first model has lost contact with its corresponding first network device, the models which correspond to network devices adjacent to the first network device are polled to see if they have also lost contact with their corresponding network devices. If the adjacent models cannot contact their corresponding network devices, then presumably the first network device is not the cause of the fault, and a fault status in the first model is suppressed or overridden. If it is determined that none of the adjacent network devices is communicating, then the network fault can more readily be attributed to something common to all of these devices.
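By way of illustration only, the status suppression described above may be sketched as follows. The sketch is not taken from the '909 patent; the names `DeviceModel`, `suppress_status`, and the `poll` callback are hypothetical, and the sketch assumes that the fault status is suppressed only when every adjacent model has also lost contact with its device.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DeviceModel:
    """Hypothetical model of one network device (names are illustrative)."""
    name: str
    neighbors: List[str] = field(default_factory=list)  # adjacent device names
    contact_lost: bool = False      # contact status held in the model
    fault_suppressed: bool = False  # True when the fault status is overridden


def suppress_status(models: Dict[str, DeviceModel],
                    first: str,
                    poll: Callable[[str], bool]) -> bool:
    """Return True if the first device remains the suspected cause of a fault.

    `poll(name)` is an assumed callback that returns True when the named
    device answers the management system.
    """
    model = models[first]
    model.contact_lost = not poll(first)
    model.fault_suppressed = False
    if not model.contact_lost:
        return False  # contact is intact, so there is no fault to isolate

    # Poll every adjacent device and record its contact status.
    for name in model.neighbors:
        models[name].contact_lost = not poll(name)

    # Assumption: suppress only when all adjacent models have lost contact.
    all_lost = bool(model.neighbors) and all(
        models[name].contact_lost for name in model.neighbors
    )
    model.fault_suppressed = all_lost
    return not all_lost
```

In this sketch, each call polls every neighbor of the first device, so the number of polls grows with the number of adjacent devices.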
It may be advantageous to focus the failure analysis on the first network device without polling all of the adjacent network devices. In some large networks, such polling could involve hundreds, possibly thousands, of network devices, thereby increasing the amount of traffic on the network and degrading network performance. In addition, there may be network devices that, although they have lost contact with the network management system, are still in contact with some other network device.
It is an object of the present invention to provide a method for facilitating fault management in a network, which method can be used alone or together with other fault management services to deduce the location and/or cause of a network failure.