1. Technical Field
The present invention relates to distributed data processing systems and in particular to fault event management systems.
2. Description of Related Art
Computer networks allow increased computing power, sharing of resources, and communications between users. These networks have grown to represent large investments on the part of businesses, governments, and educational institutions, and these organizations spend large amounts of time and money maintaining their networks. According to industry research, an average 5000-user corporate network costs more than $6.4 million to support each year. Thus, to many network decision makers, the real concern as we head into the 21st century is not so much migrating to faster technologies such as asynchronous transfer mode (ATM), but reducing the costs associated with supporting and operating the networks they use today.
One of the principal costs associated with maintaining a network is the time spent on system management. Networks are not static systems. As companies and organizations grow and change, so do their networks. Thus, network devices are constantly being added or replaced to meet the changing needs of the people using the network. When new devices are added or old ones replaced, the new devices need to be integrated into the Fault Management System. A Fault Management System monitors the hardware portions and software applications of the network for failures. Currently, this integration involves reprogramming various aspects of the network to ensure that all aspects of the network function correctly.
However, networks have become much larger and more complex over the years, and changes to the network take place much more frequently. Many network administrators have limited knowledge of or interest in programming. Furthermore, even those administrators with the knowledge to reprogram the network lack the time to do so. Therefore, it is desirable to have an error-free, dynamic method to easily integrate monitoring changes into the Fault Management System in a matter of seconds.
The present invention provides a method for monitoring faults within a computer network. In a preferred embodiment, a triplet consisting of an event, a host, and a fault monitoring point is received from a monitored network device. A database of valid fault monitoring points is consulted to determine the validity of the received triplet. Responsive to a determination that the received triplet is valid, the appropriate party to notify and the appropriate message to send are determined. The appropriate party is then sent a message alerting that party to the network problem. Different parties may be notified depending on the nature of the event or on the location of the event. Furthermore, a new network device may be added without taking down the fault monitoring system by merely adding to the database of valid fault monitoring points a new fault monitoring point corresponding to the added network device.
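The flow summarized above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the class name `FaultMonitor`, its methods, the message-template format, and the sample event, host, and party names are all hypothetical. An in-memory dictionary stands in for the database of valid fault monitoring points, and "sending" a message is simulated by appending it to a list.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# (event, host, fault monitoring point); this layout is an assumption.
Triplet = Tuple[str, str, str]


@dataclass(frozen=True)
class Notification:
    party: str    # who should be alerted
    message: str  # text of the alert (or a template, when stored as a rule)


class FaultMonitor:
    """Hypothetical stand-in for the fault monitoring system."""

    def __init__(self) -> None:
        # Stand-in for the database of valid fault monitoring points.
        self._valid_points: Dict[Triplet, Notification] = {}
        # Stand-in for actually delivering messages.
        self.sent: List[Notification] = []

    def add_monitoring_point(self, event: str, host: str, point: str,
                             party: str, template: str) -> None:
        # Registering a device is a data update only; nothing is restarted.
        self._valid_points[(event, host, point)] = Notification(party, template)

    def handle(self, event: str, host: str, point: str) -> Optional[Notification]:
        # Consult the table to decide whether the received triplet is valid.
        rule = self._valid_points.get((event, host, point))
        if rule is None:
            return None  # invalid triplet: no one is notified
        # Determine the party and message, then "send" the alert.
        text = rule.message.format(event=event, host=host, point=point)
        note = Notification(rule.party, text)
        self.sent.append(note)
        return note


monitor = FaultMonitor()
monitor.add_monitoring_point("link_down", "router1", "eth0",
                             "network-team", "{event} on {host} at {point}")
alert = monitor.handle("link_down", "router1", "eth0")   # valid triplet
ignored = monitor.handle("disk_full", "db7", "/var")     # not yet registered

# A new device is added while the monitor keeps running.
monitor.add_monitoring_point("disk_full", "db7", "/var",
                             "storage-team", "{event} on {host} at {point}")
alert2 = monitor.handle("disk_full", "db7", "/var")      # now valid
```

Because the notification rule is keyed on the full triplet, different parties can be selected according to the nature of the event or the host on which it occurred, and integrating a new device amounts to one additional table entry rather than a code change.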