1. Field of the Invention
The present invention relates to network management systems, including systems used for systems monitoring, fault detection, troubleshooting, fault resolution, and systems maintenance.
2. Description of the Related Art
Existing network management and maintenance technology typically requires manual intervention when a network device encounters a failure. A network manager, upon isolating the network device having the problem, typically will need to swap hardware components or load updated software in an attempt to resolve the problem. Almost invariably, however, a network manager will need to obtain additional technical information for managing or troubleshooting the network device. For example, the network manager may need to access a technical support web page offered by the manufacturer of the network device in order to determine whether “help pages” provide adequate troubleshooting suggestions or failure solutions.
However, as networks and information technology in general become more complex to service and support, the use of a web page based source for troubleshooting information becomes less effective. In particular, a human-web page interface often is inefficient for the network manager, since the network manager often must manually supply the device type, serial number, software versions, and a brief statement of the problem in order to provide sufficient search parameters for the web page to generate a query to the back-end databases supported by the vendor. Unfortunately, such a query fails to take into consideration more complex interactions that may affect the performance of the network device, including network topology, device configurations, and dynamic parameters that affect the network device performance.
In particular, complex networks are composed of multiple network devices that interact with one another. Moreover, network managers of large networks encounter considerable difficulty in maintaining an accurate inventory of all devices in the network and their respective configurations, including features such as SNMP configuration. Existing network discovery resources, for example ping sweep resources used to locate IP addresses of SNMP-enabled network devices, are inadequate to provide a comprehensive inventory of the network, since even with existing intelligent device searches, it is almost impossible to maintain a reasonably current overview of the network.
Another fundamental problem associated with management of complex networks involves providing hardware or software updates to the network devices as the updates are developed by the device manufacturers. Although client side resources (i.e., resources executed on a client device) exist that enable a personal computer to search via the Internet for updated revisions for installed software applications, such an arrangement is impractical for network devices such as switches, routers, gateways, etc., that may be installed in a complex network.
Similar problems exist when a network manager wishes to monitor the operational status of a network device. Monitoring the operational status of a network device requires adequate configuration of the management features within the device, and of the management station collecting the management data. Highly skilled and trained staff are then required to analyze the feedback from the management system, and to maintain the management system. Network managers, upon obtaining data that describes the operational status of a network device, are left with the same problem of how to interpret the data describing the device status, in the context of the device operations within the network, and relative to other similar devices having the same configuration and networks having similar network topologies. Moreover, attempts to contact technical support representatives at a network device manufacturer also may provide limited results, since the technical support personnel do not know exactly what hardware or software features are installed in the customer's network, often causing technical support personnel to suspect that the problem may exist in other devices installed on the network.
Finally, existing network management technology is unable to preempt a network device failure. For example, if a network device runs low on system resources, typically the reduction in system resources will not be detected with sufficient time to take preventive action; rather, it probably will be detected only once the network device has failed. Hence, the users of the network having the failed device are unnecessarily burdened with loss of network service.