1. Field of the Invention
The present invention relates to remote node maintenance and management in communication networks and more particularly, to a system and method for providing access to communication network nodes and for monitoring each type of resource within the network nodes in a fast, reliable and efficient manner.
2. Description of the Related Art
Network management is the art of managing communication networks. Almost everybody uses a communication network at one time or another without always being aware of it. Using an automatic teller machine is an example of using a communication network in daily life. A communication network provides means for transporting data, voice or video from one computer to another via a collection of devices, cables, circuits, etc. Linking computers through a network allows the sharing of computing power and information between users and thus increases efficiency and productivity. Significant and sometimes huge amounts of money are invested in communication networks by organizations and companies. The maintenance of large and complex networks is a considerable task and today, even on medium sized networks, the maintenance is computer assisted so as to be more cost effective. This automated assistance is known as "network management". This process collects data related to the network (manually or automatically), processes the data, and synthesizes it in a human readable form to facilitate network operations. More sophisticated systems analyze data and suggest or even implement solutions. Moreover, some systems are capable of generating reports and statistics.
Network management, as defined by the International Organization for Standardization (ISO), is divided into five functional areas:
1. fault management,
2. configuration management,
3. security management,
4. performance management,
5. accounting management.

1. discovering the problem,
2. isolating the problem, and
3. fixing the problem.

1. centralized,
2. distributed,
3. hierarchical.

1. pages of text in HTML (Hypertext Markup Language) format--the web browser is able to understand the HTML in order to present the text in a convenient way for a human user;
2. images or sounds in different kinds of format that the web browser is able to interpret.

1. the client part (PETC Client) provides the user interface with a command line accessible through the keyboard of the console;
2. the server part (PETC Server) receives the command from the client and forwards it to the destination resource in the node such as line adapter, switch, etc.

1. a graphical user interface such as the FEUI (Front End User Interface) in the IBM 2220 Nways Broadband Switch, and
2. a node agent residing in the node.

1. installing a web server inside each node to perform the specific actions requested by the web browser.
2. using a centralized web server to send standard SNMP or CMIP requests to the node to examine (just as traditional network management products would do).

1. The HTTP protocol is stateless: there is no notion of session, no persistence of objects. From one transaction to the next the web server forgets everything.
2. As the HTTP protocol works in a request/response manner, there is no way to receive unsolicited messages from the network. SNMP traps and CMIP alarms, which are essential to fault management, rely on unsolicited messages. Continuous polling of the network to track the network nodes' state changes is ruled out as a solution because it is a waste of bandwidth.

1. A specialized web server using the Common Gateway Interface (CGI) to execute applications. These applications build and send commands to the network nodes, analyze the responses and format them in HTML. Applications are triggered in response to a request from a web browser. The response in HTML form is sent back to the web browser. The web server will be referred to below as "Network Web Agent Proxy" or NWAP.
2. Multiprotocol agents which are applications running inside each network node. The multiprotocol agent, referred to below as "Network Web Agent Daemon" or NWAD, receives commands from an NWAP web server and routes them to the appropriate resource inside the node. It then retrieves the response from the node resource and forwards it to NWAP. The NWAD is also able to forward a command to another network node if the node on which it is running is not the final destination node of the command. The NWAD can also execute some commands by itself, such as a file transfer using FTP (File Transfer Protocol).
3. Messages exchanged between the web server NWAP and the multiprotocol agents (NWADs). These messages are exchanged using a TCP/IP connection. Each message contains a header to be routed by the NWADs towards the right resource inside the communication network.
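The routing behaviour described for the NWAD (deliver a command to a local resource, or forward it when the node is not the final destination) can be illustrated with a minimal Python sketch. The header fields `dest_node` and `dest_resource`, the `Nwad` class and the `next_hop` table are hypothetical names chosen for illustration only; the text does not specify the header layout, and the TCP/IP transport is replaced here by direct method calls.

```python
from dataclasses import dataclass


@dataclass
class Message:
    dest_node: str      # hypothetical header field: final destination node
    dest_resource: str  # hypothetical header field: resource inside that node
    command: str        # payload: command text for the resource


class Nwad:
    """Toy stand-in for a per-node agent; an illustration, not the actual design."""

    def __init__(self, node_id, resources, next_hop):
        self.node_id = node_id
        self.resources = resources  # resource name -> callable(command) -> response
        self.next_hop = next_hop    # destination node id -> neighbouring Nwad

    def handle(self, msg):
        if msg.dest_node != self.node_id:
            # This node is not the final destination: forward towards it.
            return self.next_hop[msg.dest_node].handle(msg)
        # Final destination: route to the local resource and return its response.
        return self.resources[msg.dest_resource](msg.command)


# Two nodes: A forwards to B anything addressed to B.
node_b = Nwad("B", {"adapter1": lambda cmd: "B/adapter1 ran " + cmd}, {})
node_a = Nwad("A", {"switch": lambda cmd: "A/switch ran " + cmd}, {"B": node_b})
```

In this sketch, `node_a.handle(Message("B", "adapter1", "status"))` is forwarded to node B, whose agent hands the command to `adapter1` and returns its response along the same path.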
Network management is in no way a monolithic process. It is a set of processes cooperating together.
A communication network is made of many pieces of hardware equipment and software programs. For any number of reasons, some of them may be working incorrectly or not working at all. The fault management function deals with these failures. It consists of three steps:
Discovering the problem may occur through a telephone call from a user or through the faulty device itself reporting the failure, if it is able to do so. When the faulty device is down, its neighbors within the network may report a loss of contact or a similar condition.
Isolating the problem consists of pinpointing the failing component (which may not be the component signalling the error). It is crucial to conduct investigations remotely (i.e. without resorting to travel).
Fixing the problem consists of executing the proper repair action. In present systems, a failing device is generally able to indicate the appropriate corrective action. The capability to download code remotely and to fix software problems constitutes a decisive advantage.
In networks, some devices have a critical function, nodes for instance. For upgrading the hardware and software of these devices, configuration information such as serial numbers, engineering change numbers, supported protocols, etc., must be available to the network engineer. This configuration management information should also be updated "on the fly". Managing access rights and committed quality of service are also part of configuration management.
Security management deals with detection of unauthorized attempts to access restricted parts of the network or protected computers. It also involves managing firewalls, enforcing security on dialup lines, etc.
Performance management involves congestion avoidance, capacity planning, and bandwidth statistics. More particularly, its role is to make sure that the network capacity is adequate with regard to the needs.
Accounting management consists of tracking the use of the network resources by the different users or groups of users. The purpose may be to bill each user according to resource utilization, to make sure that no one exceeds the resources reserved or paid for, or simply to ensure fairness of usage. The accounting management function provides the financial data for capacity planning.
Acquisition of information is essential for an efficient network management system. However, networks are generally made up of disparate devices and the information which can be obtained is most often heterogeneous. The purpose of network management protocols such as CMIP (Common Management Information Protocol) or SNMP (Simple Network Management Protocol) is to standardize sets of queries. For each query of the network manager a particular response is specified. If every node in the network implements the same protocol, then network management becomes hardware independent. This is not always realistic: in practice, when faults have to be isolated and fixed, these general purpose protocols are often not sufficient or adequate. For instance, both a plane and a car may have a flat tire, but the symptoms, consequences and repair actions are specific to each type of vehicle.
Building a network management system that incorporates all the functions previously described implies the development of a set of software applications or tools and their placement according to a convenient architecture in the communication network. The most commonly used architectures are the following:
A centralized architecture implies a network manager in a large central system running the majority of the network management applications.
A distributed architecture consists of multiple peer network managers operating simultaneously, each one being responsible for a part of the network such as a group of countries.
A hierarchical architecture is a mix of the two previous ones: the main central system of the centralized architecture is the root of the hierarchy, accumulating all essential information and allowing access to all parts of the network. It can then delegate some network management tasks to peer systems, set up as in the distributed architecture, which function as children in the hierarchy.
Each one of these approaches involves one or several management consoles at fixed location(s). The recent emergence of Web-based network managers makes mobile management consoles possible. From almost anywhere, with any type of computer, even from a dumb terminal, network engineers can have access to the network, obtain the data they want, and perform preventive and corrective actions.
The World Wide Web, called WWW or more simply the Web, is a set of servers interconnected via a protocol named IP (Internet Protocol). The underlying IP network is usually called the Internet. The web protocol, technically termed HTTP (Hypertext Transfer Protocol), marked a breakthrough: previously, good knowledge of the IP protocol suite was needed to enter the right commands to gain access to a server. With HTTP, the user works through a graphical interface called a web browser. By simply clicking with a mouse on one or multiple selection menus called forms, the user is able to get information from a web server. The web browser program performs by itself all the necessary operations, i.e., connection to the web server, sending of the request, decoding of the response and display on the graphical interface. On request, a web server may deliver different kinds of data, for example:
The web server is also called HTTPD (HTTP Daemon--in the world of computing a "Daemon" is a program which runs in the background rather than under the direct control of a user). A lesser-known HTTPD feature allows the user to request the execution of a program instead of the delivery of a document (a "web page"). In order to perform their job properly, the programs to be executed on the server side have to be written according to some specifications, in particular, according to the CGI (Common Gateway Interface) specifications. The programs written in accordance with these specifications are called CGI programs.
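As an illustration of what a CGI program looks like, the following minimal Python sketch follows the basic CGI conventions: the server passes the request parameters in the `QUERY_STRING` environment variable, and the program writes a header, an empty line, and then the document to its standard output. The parameter name `node` and the function name `render_cgi_response` are assumptions made for this example only.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def render_cgi_response(environ):
    """Build a complete CGI response: header, empty line, HTML body."""
    # CGI passes the part of the URL after '?' in the QUERY_STRING variable.
    query = parse_qs(environ.get("QUERY_STRING", ""))
    # 'node' is a hypothetical parameter name chosen for this sketch.
    node = query.get("node", ["unknown"])[0]
    body = "<html><body><p>Status requested for node: {}</p></body></html>".format(
        html.escape(node)  # escape user input before placing it in HTML
    )
    # A CGI program emits its headers, then an empty line, then the document.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # When launched by the web server, the request data is in the environment.
    sys.stdout.write(render_cgi_response(os.environ))
```

Keeping the response construction in a function that takes the environment as an argument makes the program easy to exercise without a web server, for example with `render_cgi_response({"QUERY_STRING": "node=701"})`.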
Additional information about network management can be found in the publication entitled "Network Management--a practical perspective" by Allan Leinwand & Karen Fang published by Addison-Wesley Publishing Company, 1993. Additional information about the Web can be found in the IBM publication "Accessing the Internet"--International Technical Support Centers, August 1995, SG24-2597-00. Other sources of information about the Web are online on Web sites. For instance the National Center for Supercomputing Applications (NCSA) Web site "http://hoohoo.ncsa.uiuc.edu" gives detailed information about HTML, HTTP and CGI.
Traditionally, in communication networks, maintenance and troubleshooting functions are performed by using specialized hardware and software platforms. These platforms reside in each node of the communication network and/or in dedicated equipment (the network manager).
In node based maintenance and troubleshooting, the maintenance and troubleshooting of the communication network are performed by operating on the faulty network node without a view of the complete communication network.
An example is the PETC (Product Engineering Tool written in C language). The PETC is used inside communication nodes such as the IBM 2220 Nways Broadband Switch from International Business Machines Corporation (IBM Corp.). The PETC works according to the client/server paradigm:
Both are located in a NAS (Node Administration Station) which is a personal computer component of each node. The PETC Server uses the same internal protocol as the NAS to communicate with the line adapters of the node. A remote PETC Client can gain access to a local PETC Server using the TCP/IP protocol.
Another way to manage a network node is to use a combination of:
In fact, in most network architectures, a node agent is present in each network node to handle the CMIP or SNMP protocol for network management purposes. This agent is also used for sending a limited set of commands to the line adapters of the node and receiving the corresponding responses. As previously with the PETC, the destination node agent may also be controlled by a remote FEUI residing in another node using the TCP/IP protocol.
In network based maintenance and troubleshooting, the communication network is managed by the network manager from one or several consoles. Commands are sent to the nodes using a standard protocol such as CMIP (Common Management Information Protocol) or SNMP (Simple Network Management Protocol). Special application programs are developed for each type of network node running under a common program like IBM's Netview network management program providing basic functions such as the user interface, the interface with the operating system, etc.
A recent alternative consists in providing the end user with a web browser like any of those currently used on the public Internet network. Two solutions can be considered:
A first solution, described in FIG. 7, uses a web server on each network node. Due to memory and processing power constraints inside the node, this web server cannot implement all the functions of a real Internet web server and is often located in a ROM. The product "WebManage" from Tribe Computer Works Corporation is an example of such an implementation. FIG. 7 shows three network nodes 701, 702 and 703 communicating together through the communication lines 704, 705 and 706 within the same communication network 716. A web server 707, 708 or 709 is located inside each node. An operator 714 using a web browser 710 has access by means of the public Internet or an Intranet network 715 to any of the web servers by using an HTTP communication 711, 712 or 713.
A second solution, described in FIG. 8, uses a centralized web server. Three network nodes 801, 802, and 803 communicate together through communication lines 804, 805 and 806 within the same communication network 812. A centralized web server 807 generates SNMP messages towards the network nodes as a classical network manager does. An operator 813 using a web browser 811 can have access to the web server 807. The web browser 811 sends HTTP messages 815 to the web server. These messages are converted to SNMP requests by the web server and forwarded 808, 809 or 810 to the destination node. Messages are then processed by the SNMP agent 816, 817 or 818 of the node.
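The conversion step performed by the centralized web server can be sketched as follows. This Python fragment only shows how an incoming URL might be mapped to a description of an SNMP GET request; it does not send anything on the network. The URL layout, the `node` and `oid` parameter names, the address table and the function name are all hypothetical and are not taken from the text.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical address table for the nodes 801-803 of FIG. 8.
NODE_ADDRESSES = {"801": "10.0.0.1", "802": "10.0.0.2", "803": "10.0.0.3"}


def http_to_snmp_request(url):
    """Translate a browser URL into a description of an SNMP GET request."""
    query = parse_qs(urlsplit(url).query)
    node = query["node"][0]  # which network node to examine
    oid = query["oid"][0]    # which managed object to read
    if node not in NODE_ADDRESSES:
        raise KeyError("unknown node: " + node)
    # A real server would hand this description to an SNMP library
    # and forward the resulting request to the node's SNMP agent.
    return {"operation": "get", "agent_address": NODE_ADDRESSES[node], "oid": oid}


# 1.3.6.1.2.1.1.3.0 is sysUpTime.0 in the standard MIB-II system group.
request = http_to_snmp_request(
    "http://manager.example/status?node=802&oid=1.3.6.1.2.1.1.3.0"
)
```

Here `request` describes a GET of `sysUpTime.0` addressed to the agent of node 802; the web server would then format the agent's reply as HTML for the browser.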
Network management protocols like SNMP are sufficient to isolate a faulty component such as a hardware card or a piece of software. However, to understand the reason for the failure or to perform repair actions, general purpose protocols are not always adequate. Some communication networks may have special requirements which are not easy to implement in a general purpose product. For instance, in large networks, the network manager is scattered across several computers, each computer managing a sub-network. There is no way to have access to every node from a single console. For troubleshooting, network management should be possible from any console connected to the network, regardless of its hardware and operating system. Even a dumb terminal should qualify.
Network equipment manufacturers resort to solutions more or less similar to PETC or FEUI. A specialized solution like the PETC is not user friendly since the user interface is in line mode. This implies that the user has to remember all the available commands with their associated parameters and to enter them via the keyboard. Moreover, the navigation from one network node to another is cumbersome. The navigation is platform dependent and requires proprietary software on the client side to be connected to the server. The server uses the same path as the node agent to communicate with the line adapters and the response time may be long in case of heavy traffic. Furthermore, the communication with the adapters is lost in case of a severe problem, at the very time when the tool is most needed.
A solution like the FEUI offers only a limited set of functions--the ones provided by the node agent--and the response time may be long when the node agent is heavily loaded, which often occurs in case of node failure. A proprietary software program must be installed on the end user side to remotely operate the user interface.
One could be tempted to install a web server inside each piece of target equipment. This means that the HTTP protocol would replace the specialized protocols like SNMP or CMIP. But HTTP is a protocol built to deliver documents (i.e. text files or images) with a pleasant and user friendly presentation. Having a program running inside a line adapter of a network node dealing with presentation and languages implies different versions of line adapters according to the supported languages. It also means an upgrade of all the network nodes' hardware if something is changed in the presentation of the results. Delivering a large amount of data such as an image could also seriously impair the overall performance of a line adapter. Moreover, communication network nodes operate in real time and are embedded systems: they have only minimal hardware resources available in terms of memory and computing power. A complete web server requires a large amount of those resources. A "light server" would be useful, but since it would not implement the complete protocol, it would not be possible to connect to this "light server" with a standard "off the shelf" web browser. It is probably not reasonable to use the HTTP protocol to perform every network management function for the following reasons: