1. Field of the Invention
The present invention relates to a method and apparatus for managing and maintaining a data communications network. More particularly, the present invention relates to a method and apparatus for distributed data communications network management having the capability to remotely manage the recovery of service components experiencing errors or failures and the capability to incorporate manually started new service components into the data communications network.
2. The Background
Data communications networking capabilities are typically provided to personal users and the professional community by telephone companies (Telcos) or commercial Internet Service Providers (ISPs), which operate network access points along the information superhighway. Network access points, commonly referred to as Points of Presence or PoPs, are located within wide area networks (WANs) and house the network interfaces and service components necessary to provide routing, bridging and other essential networking functions. It is through these network access points that the user is able to connect with public domains, such as the Internet, and private domains, such as the intranet of the user's employer.
The ISPs and Telcos maintain control of the network interfaces and service components comprising the data communications network at locations commonly referred to as Network Operation Centers (NOCs). It is at the NOCs that the ISPs and Telcos employ system administrators whose task is to maintain and manage a finite sector of the overall data communications network. Managing and maintaining the interfaces and services that make up the network is complicated. The interfaces and services for which a system administrator has responsibility are not confined to the NOC, but rather are remotely dispersed throughout the PoPs. For example, the NOC may be located in San Jose, Calif. and the services and interfaces for which the system administrator has responsibility may be located at PoPs in San Francisco, Calif., Los Angeles, Calif. and Seattle, Wash. Part of the challenge of system administration is identifying problems and potential problems in a timely manner. With a system distributed world-wide, having an ever-growing number of hosts and servers, it becomes nearly impossible and economically undesirable to have system administrators providing constant surveillance of all the components.
It is common knowledge among those who have used computers in a network environment that problems related to the interfaces and services are the rule and not the exception. The vast majority of these problems are minor in nature and do not require the system administrator to take action. Networks have been configured in the past so that these minor errors are self-rectifying: either the interface or service is capable of correcting its own error, or other interfaces or services are capable of performing a rescuing function. In other situations the problems that are encountered within the network are major and require the system administrator to take action, i.e., physically rerouting data traffic by changing interfaces and services.
It is the desire of the service providers to have a maintenance and management system for a data communications network that allows the system administrator to manage and maintain the data communications network remotely. The move is towards hands-off system administration that affords the service providers the capability to manage data communications networks without the need to have system administrators physically located at the NOC's management operation host on a 24-hour basis. This type of remote system administration can only be achieved if the management system has self-rectifying capability and the capability to remotely notify the system administrator when severe errors or failures occur within the services. When major errors or service component failures occur, the system administrator must be notified in a prompt and efficient manner so that immediate action can be taken. The objective of the network management system should be to provide a mechanism whereby system administrators can be remotely notified on an around-the-clock basis whenever a specified severe error is likely to occur or has occurred, or whenever a service failure has occurred. Once the system administrator is notified remotely, the system administrator can adjust the data communications network accordingly via remote network management system access, a node interactive access application such as Telnet, or an equivalent mechanism.
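By way of illustration only, the distinction drawn above between minor, self-rectifying problems and severe errors or failures that warrant remote notification might be sketched as follows. This is a minimal sketch under stated assumptions and not a description of any particular implementation: the names Severity, ServiceEvent and handle_event, and the notify callback standing in for a pager, e-mail or similar channel, are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Severity(Enum):
    """Hypothetical severity levels mirroring the categories in the text."""
    MINOR = "minor"      # expected to be self-rectifying
    SEVERE = "severe"    # a specified severe error
    FAILURE = "failure"  # a service component has failed


@dataclass
class ServiceEvent:
    """Hypothetical event reported by a service at a PoP."""
    pop: str
    service: str
    severity: Severity
    detail: str


def handle_event(event: ServiceEvent, notify: Callable[[str], None]) -> bool:
    """Notify the administrator only for severe errors and failures.

    Returns True if a notification was sent. Minor events are left to
    the network's own self-rectifying behavior and are not escalated.
    """
    if event.severity is Severity.MINOR:
        return False
    notify(
        f"[{event.severity.value}] {event.service} at {event.pop}: {event.detail}"
    )
    return True
```

Passing the notification channel in as a callback keeps the escalation rule separate from the delivery mechanism, so the same rule could drive any remote notification means.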
Additionally, a comprehensive data communications network management system will benefit from being able to acknowledge, and acquire information at the operation center host from, network services and interfaces that are manually added to the network or manually started. Manually, in this sense, refers to services that are started or added at one of the numerous PoPs in the distributed data communications network without a command to do so being issued from the network management operation center. It would be highly beneficial for the service provider to have such a service added to the management system automatically, without having to physically acquire data related to that service and manually input the data into the network management system. When services can be added to the distributed data communications network management system in a seamless manner, it furthers the objective of limiting system administrator interaction with the network management system. In this manner the service provider is able to maintain and manage the data communications network without needing more personnel than necessary to monitor and manipulate the network on an ongoing basis.
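By way of illustration only, the seamless incorporation of a manually started service might be sketched as a registry at the operation center host that adds any service it has not previously recorded when that service first announces itself. This is a minimal sketch under stated assumptions: the names ServiceRecord, ServiceRegistry, register_from_noc and on_announcement are hypothetical, and the manner in which an announcement reaches the host is not addressed.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ServiceRecord:
    """Hypothetical record kept at the operation center host."""
    pop: str
    name: str
    host: str
    started_by_noc: bool  # False for manually started services


class ServiceRegistry:
    """Hypothetical registry of services known to the management system."""

    def __init__(self) -> None:
        self._records: Dict[Tuple[str, str], ServiceRecord] = {}

    def register_from_noc(self, pop: str, name: str, host: str) -> None:
        """Record a service started by command from the operation center."""
        self._records[(pop, name)] = ServiceRecord(pop, name, host, True)

    def on_announcement(self, pop: str, name: str, host: str) -> bool:
        """Handle a service announcing itself.

        Returns True if the service was previously unknown and has been
        added automatically, i.e. it was started manually at a PoP.
        """
        key = (pop, name)
        if key in self._records:
            return False
        self._records[key] = ServiceRecord(pop, name, host, False)
        return True

    def known(self) -> List[Tuple[str, str]]:
        """Return the (PoP, service name) pairs currently recorded."""
        return sorted(self._records)
```

Under this sketch no administrator input is needed to bring a manually started service under management; the data carried in the announcement populates the record.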