(1) Field of the Invention
The present invention relates to a network monitoring program, a network monitoring method, and a network monitoring apparatus for monitoring the operating state of a network, and more particularly to a network monitoring program, a network monitoring method, and a network monitoring apparatus for detecting a fault that has occurred on a network.
(2) Description of the Related Art
As information technology develops, many business enterprises are making efforts to rely upon computer systems to perform their business activities efficiently. A computer system has a plurality of communication units such as computers, switches, etc. connected to each other by a network. Networks are becoming larger in scale year after year because of an increasing range of business activities that can be performed by computers.
As a result of efforts to open and standardize system architectures, it has become possible to construct networks from a combination of apparatus made by different manufacturers. Furthermore, efforts are being made to make apparatus on networks more intelligent, resulting in more complex network configurations.
If a large-scale, complex network suffers trouble, then the operating states of apparatus that make up the network are confirmed. However, there are many instances where a network fault cannot be judged based on the operating states of individual apparatus. Consequently, specifying the location and cause of a network failure is a highly difficult task to carry out. In addition, if the location and cause of a network failure cannot be found for a long period of time, then business activities of customers which rely on the network are suspended for a long time.
There has been proposed a technique for linking network design information and apparatus operation statistic information to each other, and also for linking different protocol layers such as an IP (Internet Protocol) layer and an ATM (Asynchronous Transfer Mode) layer to each other, to display a list of operation statistic information (see, for example, Japanese unexamined patent publication No. 2002-99469 (paragraphs [0043]-[0044])). According to the proposed technique, operation statistic information is periodically collected from apparatus on a network, and the collected operation statistic information is compared with an index value. If the operation statistic information is in excess of the index value, then it is judged that a fault symptom has occurred. When a fault symptom is detected, a list of operation statistic information with respect to apparatus that have produced the fault symptom is displayed to help specify a range in which the fault symptom has occurred.
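By way of illustration only, the comparison of collected operation statistic information with an index value may be sketched as follows. This is a minimal sketch and is not taken from the cited publication: the function name `detect_fault_symptoms`, the apparatus names, the use of a single error-rate figure per apparatus, and the index value of 0.10 are all assumptions introduced for the example.

```python
from typing import Dict, List


def detect_fault_symptoms(stats: Dict[str, float], index_value: float) -> List[str]:
    """Return the apparatus whose statistic is in excess of the index value.

    `stats` maps a (hypothetical) apparatus name to one collected operation
    statistic, for example an error rate. Names are returned in sorted order.
    """
    return sorted(name for name, value in stats.items() if value > index_value)


# Hypothetical statistics gathered in one collection period.
collected = {"switch-1": 0.02, "server-a": 0.31, "router-2": 0.07}

# Apparatus judged to have produced a fault symptom in this period.
flagged = detect_fault_symptoms(collected, index_value=0.10)
print(flagged)
```

Such a comparison only lists the apparatus that exceeded the index value; it does not indicate where on a communication path the fault lies.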
However, though the technique disclosed in Japanese unexamined patent publication No. 2002-99469 can automatically detect a fault symptom, the location and cause of the fault have to be determined by the system administrator. For example, if data transmitted from an apparatus to another apparatus does not reach the other apparatus, then the conventional monitoring system allows the apparatus which has transmitted the data to detect the error. However, the conventional monitoring system is unable to automatically determine where a fault has occurred on a communication path from the source apparatus to the destination apparatus.
Heretofore, as described above, though it is possible to automatically detect a fault symptom from operation statistic information of each of the apparatus on the network, it is the system administrator who identifies the actual fault location. Consequently, fault analysis has typically taken an excessive period of time. Since a fault location is more difficult to identify in larger-scale systems, the increased period of time required for a fault analysis has posed a problem.
Another element which has made it difficult to carry out a fault analysis is the complexity of functions in each apparatus. Generally, communication functions on a network are separated into different layers. It is important to specify which function is suffering a fault for the purpose of taking a countermeasure against the fault. However, the conventional monitoring system does not have a monitoring function at the transport layer level. Though the conventional monitoring system has a monitoring function based on the monitoring function (ICMP (Internet Control Message Protocol) function) of network apparatus, that monitoring function does not depend on actual communication statuses, and the monitoring system may make a wrong decision. It has thus been difficult to accurately detect a fault of these functions.