The present invention relates to a distributed computing system, and more particularly to the remote management of network elements in distributed computing systems.
The resources and computation tasks in a computing system are frequently spread among a plurality of network nodes to form a distributed computing system. When centralized resources are shared by a plurality of users in a distributed system, their costs are distributed over a larger user base. In addition, the centralization of shared resources makes the administration and maintenance of these resources more efficient and also potentially more reliable due to the possibility of a centralized backup mechanism. Furthermore, the redundancy provided by most distributed computing environments improves the ability to recover from a failure by allowing processing tasks to continue on an alternate device upon a detected failure.
While the centralization of shared resources potentially makes the administration and maintenance of network elements more efficient and reliable, the increasing diversity of network elements in distributed computing systems provides additional challenges for network management systems that attempt to manage network resources in a uniform manner. Generally, a network management system monitors network activity, allocates network resources, detects and reports faults, and reconfigures the network topology. In order to control the diverse devices of different manufacturers using a uniform set of commands and data format, a standard network management protocol, referred to as the Simple Network Management Protocol (“SNMP”), has been developed. For a discussion of the Simple Network Management Protocol, see, for example, Simple Network Management Protocol, Request for Comments No. 1157 (May 1990), available from http://www.cis.ohio-state.edu/htbin/rfc/rfc1157.html.
The SNMP protocol allows network managers to address queries and commands to diverse network elements in a uniform manner. Generally, the SNMP protocol accomplishes network management tasks by using one or more network managers and at least one agent associated with each managed network element. In accordance with the SNMP protocol, an “agent” is a component associated with each managed network element, such as a server, host, or network gateway. Each agent stores management data in a management information base (“MIB”) and responds to queries from the network manager for such management data in accordance with the SNMP protocol. The MIB is a structured set of data variables, often referred to as objects, in which each variable represents a resource to be managed. The MIB contains information on the entities managed by the agent, such as the number of packets transferred and error information.
The SNMP protocol specifies a number of commands for communicating management information between the network manager and the agents. For example, the SNMP protocol specifies GetRequest, GetNextRequest, SetRequest, GetResponse and Trap commands. In response to a GetRequest or a GetNextRequest command, an agent will evaluate and retrieve the appropriate management data from the MIB. The agent thereafter returns the requested management data with a GetResponse command. A SetRequest command is used by the network manager to instruct one or more agents to set a value in the MIB. Finally, a Trap command is sent by an agent to the network manager to alert the network manager of the occurrence of a predefined condition.
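By way of illustration only, the request and response commands described above can be sketched as a toy in-memory agent. The sketch assumes that the MIB is a dictionary keyed by object identifier tuples; the class name ToyAgent and the tuple layout of the responses are hypothetical and are not part of the SNMP protocol or of any SNMP library. The Trap command is not modeled.

```python
from bisect import bisect_right


class ToyAgent:
    """Toy in-memory agent (hypothetical); not a real SNMP implementation."""

    def __init__(self, mib):
        # Assumption: the MIB maps OID tuples, e.g. (1, 3, 6, 1, 2, 1, 1, 5, 0),
        # to plain Python values.
        self.mib = dict(mib)

    def get_request(self, oid):
        # GetRequest: return the value bound to exactly this OID.
        if oid not in self.mib:
            return ("GetResponse", "noSuchName", oid, None)
        return ("GetResponse", "noError", oid, self.mib[oid])

    def get_next_request(self, oid):
        # GetNextRequest: return the first OID that sorts after `oid`.
        # Python compares tuples lexicographically, matching OID ordering.
        oids = sorted(self.mib)
        i = bisect_right(oids, oid)
        if i == len(oids):
            return ("GetResponse", "noSuchName", oid, None)
        return ("GetResponse", "noError", oids[i], self.mib[oids[i]])

    def set_request(self, oid, value):
        # SetRequest: overwrite an existing variable and echo the new value.
        if oid not in self.mib:
            return ("GetResponse", "noSuchName", oid, None)
        self.mib[oid] = value
        return ("GetResponse", "noError", oid, value)
```

In this sketch every command is answered with a GetResponse tuple, mirroring the manner in which an agent returns both requested data and error status to the network manager.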
In order to perform the required network management functions, the network manager must use the SNMP commands to obtain management data regarding the network itself, as well as the elements in the network. FIG. 1 illustrates a conventional master-agent environment, where a master SNMP agent 130 communicates management information to a network manager 120 on behalf of a number of network nodes. The distributed network environment 100 of FIG. 1 includes a number of network nodes 110-112, 160-162 and a network manager 120 interconnected by a network 105, such as a local area network (LAN) or a wide area network (WAN). The network nodes 110-112 may be embodied, for example, as workstations, servers and routers. As shown in FIG. 1, a master SNMP agent 130 residing on network node 110 communicates management information to the network manager 120 on behalf of the node 110, as well as a number of additional managed nodes 160-162 that are managed by the master agent 130. Each managed node 160-162 has an associated SNMP sub-agent 170-172, discussed below. In one illustrative implementation, the network node 110 where the master agent 130 resides can be embodied, for example, as a workstation, and the managed network nodes 160-162 may be embodied, for example, as a facsimile machine, printer or server. In addition, the network manager 120 may communicate directly with SNMP agents 150-151 associated with additional nodes 111-112, respectively, of the network.
Thus, to obtain information regarding the managed nodes 160-162, the network manager 120 communicates only with the master agent 130. The master agent 130, in turn, relays requests for management data to the managed nodes 160-162, collects the requested management data from the MIBs of each managed network node 160-162 and communicates the collected management data to the network manager 120. Thus, the master-agent environment is said to implement a distributed MIB.
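The relaying behavior described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each sub-agent registers an object identifier prefix with the master agent and that the master agent forwards a request to the sub-agent holding the longest matching prefix. The class names, the prefix-routing scheme and the object identifiers used below are hypothetical and are not taken from the SMUX or SNMP DPI specifications.

```python
class SubAgent:
    """Hypothetical sub-agent holding the local MIB of one managed node."""

    def __init__(self, mib):
        self.mib = dict(mib)

    def get(self, oid):
        # Return the local value, or None when the OID is unknown here.
        return self.mib.get(oid)


class MasterAgent:
    """Hypothetical master agent that relays requests by OID prefix."""

    def __init__(self):
        self.routes = {}  # OID prefix tuple -> SubAgent

    def register(self, prefix, sub_agent):
        self.routes[prefix] = sub_agent

    def get(self, oid):
        # Try the longest prefix first so the most specific sub-agent wins.
        for n in range(len(oid), 0, -1):
            sub_agent = self.routes.get(oid[:n])
            if sub_agent is not None:
                return sub_agent.get(oid)
        return None  # no sub-agent is responsible for this OID
```

From the point of view of the network manager, only the get method of the master agent is visible, which is the sense in which the collection of sub-agent MIBs behaves as a single distributed MIB.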
Communications between the master agent 130 and the SNMP sub-agents 170-172 associated with each managed network node 160-162 must conform to at least two SNMP protocols. First, the Simple Network Management Protocol Multiplexing (“SMUX”) protocol, often referred to as the “SMUX protocol,” specifies how each SNMP sub-agent 170-172 must register and deregister with the master SNMP agent 130. The SMUX protocol is described, for example, in Request for Comments No. 1227 (May 1991), available from http://www.cis.ohio-state.edu/htbin/rfc/rfc1227.html. Second, the Simple Network Management Protocol Distributed Programming Interface (“DPI”) protocol, often referred to as the “SNMP DPI protocol,” specifies how SNMP sub-agents 170-172 communicate with the master agent 130. The SNMP DPI protocol is described, for example, in Request for Comments No. 1592 (March 1994), available from http://www.cis.ohio-state.edu/htbin/rfc/rfc1592.html.
While the master agent configuration has further streamlined the network management process by allowing the network manager 120 to communicate with fewer entities to obtain necessary network management data, the attendant requirements of the SMUX and SNMP DPI protocols have increased the complexity of network management systems that support a distributed MIB. As is apparent from the above-described deficiencies of network management systems that utilize a distributed MIB, a need exists for a network management system that does not require compliance with the SMUX and SNMP DPI protocols. In addition, a further need exists for a network management system that provides uninterrupted SNMP agent service while dynamically modifying the distributed MIB. Finally, a need exists for a network management system that significantly reduces the memory requirements associated with conventional network management systems.
Generally, a method and apparatus are disclosed for remotely managing network elements in a distributed computing system. The disclosed network management system includes one or more network managers, at least one agent associated with each managed network element and a master agent. The master agent communicates management information to the network manager on behalf of the node on which it resides, as well as a number of additional managed nodes. According to one aspect of the invention, the distributed computing system utilizes a software entity, such as an operating system, to manage the location of, and communication with, distributed resources in the distributed computing system. In this manner, the network manager and SNMP agents can interact with distributed elements to obtain desired management information independent of the location of the distributed elements.
According to another aspect of the invention, SNMP MIBs in the distributed computing environment are implemented as hierarchical file systems, comprised of a tree of file-like objects, that may be accessed through a namespace. The SNMP MIB namespace of the present invention allows the network manager and master agent to access each resource, including SNMP MIBs, in a uniform, file-oriented manner. The hierarchical SNMP MIB namespace provides a mechanism for maintaining the relationship between names and entities, and permits the network manager and master agent to locate desired information by means of a pathname.
The SNMP MIB namespace of each network node managed by the master agent can be mounted into the namespace of the master agent to create a distributed MIB namespace. Thus, the master agent can obtain information from the distributed MIB namespace regarding the managed network nodes without regard to the location of the managed network nodes. The MIB of each managed node is mounted to the MIB namespace of the master agent, and appropriate connections through the distributed network are established by the operating system or another software entity.
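The mounting of per-node MIB namespaces into the namespace of the master agent can be sketched as a simple mount table. The sketch assumes that each managed node exposes its MIB as a tree of nested dictionaries and that a pathname is resolved against the deepest matching mount point. The class name MibNamespace, the mount points and the node names are hypothetical, and the network connections that an operating system would establish are not modeled.

```python
from pathlib import PurePosixPath


class MibNamespace:
    """Hypothetical mount table for a distributed MIB namespace."""

    def __init__(self):
        # mount point -> (node name, that node's local MIB tree)
        self.mounts = {}

    def mount(self, mount_point, node, tree):
        self.mounts[PurePosixPath(mount_point)] = (node, tree)

    def resolve(self, pathname):
        path = PurePosixPath(pathname)
        # Check the path itself, then each ancestor, deepest first.
        for candidate in (path, *path.parents):
            if candidate in self.mounts:
                node, tree = self.mounts[candidate]
                # Walk the remaining components inside the node's own tree.
                for part in path.relative_to(candidate).parts:
                    tree = tree[part]
                return node, tree
        raise KeyError(pathname)
```

A caller of resolve supplies only a pathname; which node actually holds the data is determined by the mount table, which is the location independence described above.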
The present invention is operative in a master-agent environment, where a master SNMP agent communicates management information to a network manager on behalf of a number of managed network nodes, each having an associated SNMP sub-agent. When the master agent receives a request from the network manager for management data regarding a managed network node, the master agent extracts the object identifier of the MIB from the request and maps the object identifier to a corresponding file in the MIB namespace. Thereafter, the master agent can open the file and write the received request to the file, thereby activating a process associated with the file. The activated process reads and executes the request and writes the result to the file. The master agent then transmits the result to the network manager.
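The request-handling sequence described above can be sketched as follows. The sketch assumes that an object identifier in dotted notation maps to a pathname by replacing each dot with a path separator under a hypothetical /mib root, and it stands in for the process associated with a file by a callable that runs when the file is written. The names ServedFile, oid_to_path and master_handle, and the byte-string request format, are illustrative assumptions only.

```python
class ServedFile:
    """Hypothetical file-like object backed by an associated process."""

    def __init__(self, handler):
        self._handler = handler  # stands in for the process behind the file
        self._result = b""

    def write(self, request):
        # Writing a request activates the handler, which executes the
        # request and leaves its result in the file for a later read.
        self._result = self._handler(request)
        return len(request)

    def read(self):
        return self._result


def oid_to_path(oid):
    # Assumed mapping: "1.3.6.1" -> "/mib/1/3/6/1".
    return "/mib/" + oid.replace(".", "/")


def master_handle(namespace, oid, request):
    # Map the OID to a file in the MIB namespace, write the request,
    # then read back the result for transmission to the network manager.
    served_file = namespace[oid_to_path(oid)]
    served_file.write(request)
    return served_file.read()
```

In this arrangement the master agent needs no knowledge of how a request is executed; it performs only ordinary open, write and read operations on a name in the namespace.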
A more complete understanding of the present invention, as well as further features and advantages of the present invention, will be obtained by reference to the following detailed description and drawings.