1. Field of the Invention
The present invention relates to a method and apparatus for managing and maintaining a data communications network. More particularly, the present invention relates to a method and apparatus for distributed data communications network management having the capability to remotely manage the recovery of service components experiencing errors or failures and the capability to incorporate manually started new service components into the data communications network.
2. The Background
Data communications networking capabilities are typically provided to personal users and the professional community by telephone companies (Telcos) or commercial Internet Service Providers (ISPs) who operate network access points along the information superhighway. Network access points, commonly referred to as Points of Presence or PoPs, are located within wide area networks (WANs) and house the network interfaces and service components necessary to provide routing, bridging and other essential networking functions. It is through these network access points that the user is able to connect with public domains, such as the Internet, and private domains, such as the user's employer's intranet.
The ISPs and Telcos maintain control of the network interfaces and service components comprising the data communications network at locations commonly referred to as Network Operation Centers (NOCs). It is here, at the NOCs, where the ISPs and Telcos employ system administrators whose task is to maintain and manage a finite sector of the overall data communications network. Managing and maintaining the interfaces and services that encompass the network is complicated. The interfaces and services for which a system administrator has responsibility are not confined to the NOC, but rather are remotely dispersed throughout the PoPs. For example, the NOC may be located in San Jose, Calif. and the services and interfaces for which the system administrator has responsibility may be located at PoPs in San Francisco, Calif., Los Angeles, Calif. and Seattle, Wash. Part of the challenge with system administration is the ability to identify problems and potential problems in a timely manner. With a system distributed world-wide, having an ever-growing number of hosts and servers, it becomes nearly impossible and economically undesirable to have system administrators providing constant surveillance for all the components.
It is the common knowledge of anyone who has used computers in a network environment that problems related to the interfaces and services are the rule and not the exception. The vast majority of these problems are minor in nature and do not require the system administrator to take action. Networks have been configured in the past so that these minor errors are self-rectifying; either the interface or service is capable of correcting its own error or other interfaces or services are capable of performing a rescuing function. In other situations the problems that are encountered within the network are major and require the system administrator to take action; i.e., physically rerouting data traffic by changing interfaces and services.
It is the desire of the service providers to have a maintenance and management system for a data communications network that allows the system administrator to manage and maintain the data communications network remotely. The move is towards hands-off system administration that affords the service providers the capability to manage data communications networks without the need to have system administrators physically located at the NOC's management operation host on a 24-hour basis. This type of remote system administration can only be achieved if the management system has self-rectifying capability and the know-how to remotely notify the system administrator when severe errors or failures occur within the services. When major errors or service component failures occur, the system administrator must be notified in a prompt and efficient manner so that immediate action can be taken. The objective of the network management system should be to provide for a mechanism whereby system administrators can be remotely notified on an around-the-clock basis whenever a specified severe error may occur, has occurred or when a service failure has occurred. Once the system administrator is notified remotely then the system administrator can adjust the data communications network accordingly via remote network management system access, use of a node interactive access application such as Telnet, or an equivalent mechanism.
Additionally, a comprehensive data communications network management system will benefit from being able to acknowledge and acquire information at the operation center host from network services and interfaces that are manually added to the network or manually started. Manually, in this sense, refers to services that are started or added at one of the numerous PoPs in the distributed data communications network without a command to do so being issued from the network management operation center. It would be highly beneficial for the service provider to automatically add this service to the management system without having to physically acquire data related to that service and manually input the data into the network management system. When services can be added to the distributed data communications network management system in a seamless manner it furthers the objective of limiting system administrator interface with the network management system. In this manner the service provider is able to maintain and manage the data communications network without the need for having more personnel than necessary to monitor and manipulate the network on an ongoing basis.
In one aspect of the invention, a method provides remote management and maintenance of a node or service within a data communications network, initiated by the data communications network management system's failure to receive operational status signals from a node or service. A control adapter running on a node within a Point of Presence is started. The control adapter is capable of starting all service adapters associated with all services running on the node. Operational status signals are transmitted from the control adapter and service adapters on to an information bus. If a network management control host fails to receive operational status signals, notification is sent to a remote system administrator that no signals are being received from a node or service. The system administrator can take appropriate remote action to rectify the problem.
In another aspect of the invention, remote management and maintenance of a node or service within a data communications network is initiated by the data communications network management system's receipt of abnormal condition signals from a node or service. A control adapter running on a node within a Point of Presence is started. The control adapter is capable of starting all service adapters associated with all services running on the node. Abnormal condition signals are transmitted from the control adapter and service adapters on to an information bus when warnings and errors are encountered. If a network management control host receives abnormal condition signals that dictate remote system administrator notification, then notification is sent to a remote system administrator that abnormal conditions exist at the node or the service. The system administrator can take appropriate remote action to rectify the problem.
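One way to picture how a control host might decide which abnormal condition signals "dictate" notification is a severity threshold. The sketch below is illustrative only: the `Severity` levels, the `AbnormalCondition` record, the `handle_condition` function, and the threshold policy are all assumptions, since the specification does not state how the decision is made.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable, List


class Severity(IntEnum):
    # Hypothetical severity scale; the specification names only
    # "warnings and errors" and service failures.
    WARNING = 1
    ERROR = 2
    FAILURE = 3


@dataclass(frozen=True)
class AbnormalCondition:
    source: str          # node or service that raised the condition
    severity: Severity
    detail: str


def handle_condition(
    event: AbnormalCondition,
    notify_threshold: Severity,
    notify: Callable[[str], None],
    log: List[AbnormalCondition],
) -> bool:
    """Record every condition; notify only at or above the threshold."""
    log.append(event)
    if event.severity >= notify_threshold:
        notify(f"{event.severity.name} at {event.source}: {event.detail}")
        return True
    return False


# Illustrative run with made-up events.
log: List[AbnormalCondition] = []
pages: List[str] = []
minor = AbnormalCondition("pop-la/service-b", Severity.WARNING, "retrying connection")
major = AbnormalCondition("pop-la/service-b", Severity.FAILURE, "process exited")
minor_notified = handle_condition(minor, Severity.ERROR, pages.append, log)
major_notified = handle_condition(major, Severity.ERROR, pages.append, log)
```

Under this assumed policy the warning is recorded but not forwarded, while the failure reaches the administrator, mirroring the distinction the background draws between minor and major problems.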
In another aspect of the invention, integration of a manually started node or service into a data communications network is achieved. A node or service is manually started at a Point of Presence within a data communications network. The node or service has an adapter running on it and is in communication with an information bus. The node or service begins signalling operational status upon implementation. These signals are received by a network management control host that fails to recognize the identity of the signals. The network management control host transmits signals asking the newly started node or service for identifying information. The node or service receives the identity request and transmits signals back to the network management control host with specific requested identification information. The network management control host stores this information for identification purposes and later performance analysis.
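The exchange described above (an unrecognized status signal, an identity request, an identity reply, and storage of the result) can be sketched in Python as follows. The message dictionaries, the `ServiceAdapter` and `ControlHost` classes, and the example identification fields are hypothetical; the specification does not define a message format, and the information bus is replaced here by direct method calls.

```python
from typing import Dict, Optional

Message = Dict[str, object]


class ServiceAdapter:
    """Hypothetical adapter for a manually started service."""

    def __init__(self, service_id: str, info: Dict[str, str]) -> None:
        self.service_id = service_id
        self.info = info

    def heartbeat(self) -> Message:
        # Operational status signal emitted once the service is running.
        return {"type": "heartbeat", "source": self.service_id}

    def handle(self, message: Message) -> Optional[Message]:
        # Answer identity requests addressed to this service.
        if message["type"] == "identify_request" and message["target"] == self.service_id:
            return {"type": "identify_reply", "source": self.service_id, "info": dict(self.info)}
        return None


class ControlHost:
    """Hypothetical network management control host."""

    def __init__(self) -> None:
        self.registry: Dict[str, Dict[str, str]] = {}   # stand-in for the database

    def handle(self, message: Message) -> Optional[Message]:
        if message["type"] == "heartbeat" and message["source"] not in self.registry:
            # Unknown sender: ask it to identify itself.
            return {"type": "identify_request", "target": message["source"]}
        if message["type"] == "identify_reply":
            # Store the identification information for later use.
            self.registry[str(message["source"])] = dict(message["info"])  # type: ignore[arg-type]
        return None


# Illustrative run with made-up identifiers.
adapter = ServiceAdapter("pop-la/dns-1", {"kind": "dns", "host": "node-7"})
host = ControlHost()
request = host.handle(adapter.heartbeat())    # host does not know the sender yet
reply = adapter.handle(request)               # adapter supplies its identity
host.handle(reply)                            # host stores the information
followup = host.handle(adapter.heartbeat())   # sender is now known
```

After the reply is stored, later status signals from the same service produce no further identity requests, so the service is integrated without anyone entering its details by hand.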
In yet another aspect of the invention, a network management control host comprises a network management application started on the host at a network operation center. The network management application is in communication with a database adapter and a database. The database adapter is in communication with an information bus. A remote system administrator notifier is in communication with the network management application and the database adapter and provides for remote notification of the system administrator if signals are received related to an abnormal condition at a node or a service or if operational status signals from a node or service are not received.