1. Field of the Invention
The present invention relates to network management systems and in particular relates to methods and associated systems that dynamically measure the ability of a computer system to support data collection for analysis by network management system applications.
2. Discussion of Related Art
It is common for computer systems to be linked via communication media for exchange of information and to distribute computational tasks among a plurality of such interconnected computer systems. Such collections of interconnected computer systems and/or the communication paths that interconnect them are often referred to as networks. Computing tasks may be distributed over the network of computer systems by application of distributed computing techniques known in the art. Data and computing resources such as storage devices, printers, and other devices may be shared over such networks so that a user at any of the computing systems of the network perceives that he/she is locally attached to the shared resources and data.
Use of networks has grown rapidly as computing applications have evolved. The Internet is an example of a network that has grown to worldwide proportions with millions of users and their computer systems interconnected at any given moment. Other networks are contained within a single organization (an enterprise). For example, a single corporation may use a network to connect all elements of its business. Such an enterprise-wide network would connect all systems in a building and may extend to other buildings and other remote sites even around the world. Though the Internet is not centrally managed by a single entity, enterprise computing environments are typically managed by a single central entity (a network manager) or at most a relatively small group of network managers cooperating to manage the computing resources of the enterprise through the network.
Small networks (i.e., a few computing systems connected within the same office area) are easily managed by a network manager by physically visiting each computing system to monitor performance, diagnose problems, configure systems, etc. However, large networks (i.e., a large number of computing systems physically dispersed with respect to one another) present logistical problems for network managers. It is difficult or impossible to manage such networks where physical presence of a network manager is required at each computing system of the enterprise.
Network managers in such large computing enterprises often use special purpose computer programs to monitor operation and performance of the enterprise network. Such programs are often referred to as network management systems (or NMS).
An NMS is a combination of hardware and software used to administer various aspects of network operation by controlling the configuration of equipment used in the network's infrastructure. Most NMSs are operable on a computing system (node) of the network to gather data regarding performance of the network and/or to diagnose operation of the network. Such NMSs are generally a collection of one or more computer application programs that exchange messages with other computing systems and communication devices (nodes) on the enterprise network to achieve the desired data acquisition and diagnostics. In all but the most trivial networks, an NMS also performs critical data consolidation and filtering functions that allow the network manager to focus on the information of the greatest importance: data that reveals existing or pending problems with the network's performance.
Because an NMS is critical to the operation of most networks, many NMSs consist of one or more software packages that execute on dedicated host computer systems, or workstations. In smaller networks, an NMS may consist of software executing on a workstation that is also used for other business application purposes. Due to the heavier load of managing larger networks, NMSs on larger networks tend to run on one or more workstations dedicated to the NMS operations.
NMSs usually include component applications that execute periodically or even continuously to monitor various aspects of network performance over time. Such NMS applications might do things like periodically check to ensure that all portions of the network are operable or collect network data that reveals which nodes utilize the network most heavily. If the data collected by these NMS applications is to accurately reflect the state of the network over time, each NMS application must be able to execute its functions completely and without restriction.
Yet such NMS applications compete with one another, other network management software, and even completely unrelated software programs for the limited resources of the host computer system on which they execute.
Every workstation's physical memory and processing capacity are limited and therefore valuable resources. Further, the workstation's operating system restricts other resources needed to collect network management data such as virtual memory (secondary storage capacity used to extend the effective size of main memory), file descriptors (data structures/objects used by application programs to manipulate files stored on persistent storage of the computing system), and network input/output (the system's capacity to handle network traffic).
Generally speaking, assuming a given NMS package, the amount of each resource needed to effectively monitor a network increases roughly in proportion with the size of the network. The inevitable outcome faced by a network manager with a growing network is that the network outstrips the monitoring capabilities of their NMS as operable on a particular computing system. This problem is increasingly common as networks evolve from simple topologies (i.e., a single shared communication path such as a single Ethernet segment) to more complex ones such as micro-segmented (desktop-switched) topologies.
In such a situation the network manager must invest more money to maintain an NMS that provides adequate control of the network. One option is to purchase a new NMS package that can better handle large topologies; this can require the network manager to conduct a time-consuming survey and evaluation of available products, expend several thousand dollars, and endure a steep learning curve to realize the benefits of the new software. Another option is for the network manager to keep their current network management software and simply upgrade the workstation on which the software executes; this may be less expensive than buying new software but is generally a stopgap solution. Eventually, the enterprise network will grow to a point where the NMS capability will require more resources than presently available.
Some present NMS applications utilize configuration parameters and models of the systems on which they run to limit their demands for resources. Some NMS applications, for example, inspect certain statically configured aspects of their host system and adjust their behavior accordingly. These inspections usually involve some combination of system parameters such as the type and configuration of network media and protocols, the type and speed of network interface adapters, the central processor type and speed, and the system memory. Often these parameters are inspected upon installation or initial configuration of the NMS application or perhaps inspected once when the NMS application is first started. However, it is impractical and imprecise to inspect all of the applicable settings on a host at a given time in order to project the system resources that will be available to an individual network management application during its subsequent execution. Even if all of the statically configured aspects of the system that affect NMS application performance were examined and their cumulative effect accounted for, it is impossible to predict the impact that other applications executing on the same host will have on the network management application's interaction with the system.
In view of the above discussion, it is clear that a need exists for improved methods and associated systems for adapting an NMS to the computing environment in which it is operable.
The present invention solves the above and other problems, thereby advancing the state of the useful arts, by providing methods and associated systems for dynamically measuring parameters of the computing environment in which it is operating to determine the computing resources available at any desired time. More specifically, the present invention periodically gathers data regarding the computing system on which it operates and dynamically adapts its functioning to the presently available computing resources.
In particular, the present invention enables NMS applications to offer a third option to network managers in this situation: the existing NMS adjusts its behavior to continue collecting data from the network as the network grows and, with it, the resource requirements of the NMS on its host system. The invention allows NMS component applications to determine the maximum rate at which they can transmit and receive network traffic on a given host computer system (the system on which the NMS application is operating). This rate will be limited to some maximum value that is reached when such an NMS application consumes all of one or more system resources that affect its operation. The nature and number of resources exhausted by the NMS application are unimportant; the key is that one or more resources become oversubscribed (overutilized) and the performance of the NMS application is bounded as a result. Once the maximum rate of information exchange has been established, an NMS application can use the rate so determined for a number of purposes. The most important use is to optimize and adapt its behavior for the particular environment of its NMS.
Still more specifically, this invention dynamically determines the level of performance at which the NMS application overutilizes some resource of its host system. Determining this threshold level is accomplished by the exchange of protocol data units (PDUs) with other nodes in the network. The other nodes in the network involved in this determination must support the exchange of the PDUs. The NMS application is supplied with a list of the nodes from which network management data could potentially be collected; this list should preferably contain multiple nodes. The nodes identified must be capable of responding to directed request PDUs from the NMS component application, either with a corresponding single directed response or a stream of directed PDUs as appropriate.
Each PDU includes, at a minimum, a sequence number identifying sequential instances of the PDU. In accordance with the invention, the NMS application generates a stream of such PDUs to all nodes from which management data may be collected. The nodes essentially echo each PDU so received back to the generating host system of the NMS application. The NMS application therefore receives the echoed PDU responses from all nodes to which the PDUs were directed. If the system resources of the NMS application's host system are not oversubscribed, all PDUs will be received in proper sequence by the NMS application as determined by the sequence numbers in each PDU. In order to gauge its maximum rate of traffic exchange with other nodes of the network, referred to herein as the PDU saturation point, the NMS application initiates a test that consists of a simultaneous exchange of traffic with as many nodes from its list as possible. The purpose of this test is to generate a relatively steady PDU flow between the NMS application and other nodes, and to increase the frequency of that flow until the NMS application's capacity to process network traffic is exceeded. The NMS application will view this information flow as an aggregated number of inbound and outbound PDUs that eventually reaches the PDU saturation point, which will be computed as an average number of PDUs processed per unit time during the test period.
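The bookkeeping described above can be illustrated with a minimal Python sketch. The function names (find_lost, in_sequence, average_pdu_rate) and the convention that sequence numbers start at zero are assumptions made for illustration only; they are not taken from the invention, which does not prescribe any particular data layout.

```python
from typing import Iterable, List


def find_lost(sent_count: int, echoed: Iterable[int]) -> List[int]:
    """Return the sequence numbers 0..sent_count-1 that were never echoed back.

    Assumption: sequence numbers are consecutive integers starting at 0.
    """
    seen = set(echoed)
    return [seq for seq in range(sent_count) if seq not in seen]


def in_sequence(echoed: Iterable[int]) -> bool:
    """True if the echoes from one node arrived in strictly increasing order."""
    echoes = list(echoed)
    return all(a < b for a, b in zip(echoes, echoes[1:]))


def average_pdu_rate(outbound: int, inbound: int, seconds: float) -> float:
    """Aggregate inbound plus outbound PDUs processed per second of test time."""
    if seconds <= 0:
        raise ValueError("test period must be positive")
    return (outbound + inbound) / seconds
```

For example, if five PDUs were sent to a node and the echoes carried sequence numbers 0, 1, 3 and 4, find_lost reports that PDU 2 was lost, which under the scheme described above would indicate that some host resource may be oversubscribed.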
As mentioned previously, though many present NMS applications exist in today's market, none empirically measure the capabilities of the host computer system that they execute on for any purpose. Embedding such an algorithm into NMS packages provides NMS applications with information that they can use in a variety of ways. The resulting measure will reduce the limitations placed on the network management software by all of the host's various static attributes (such as memory and disk configuration, CPU capability, and networking stack) and the system's dynamic attributes (primarily the amount of each physical resource available to the network management software given the other applications running on the host) to a single dynamically ascertained number that represents the most important metric of all to most network-focused software modules: the maximum rate at which PDUs may be exchanged between the NMS application and nodes on the network.
The most obvious and direct application of the maximum rate at which an NMS package may exchange PDUs with other nodes in the network is to optimize the NMS software's performance given the resources that are available to it on a particular host system by governing the number of PDUs exchanged by the NMS application and other nodes. This dynamic method of tailoring software behavior to produce a specific level of resource consumption for a given host system contrasts sharply with the approach employed by prior NMS applications. Most current NMS applications are incognizant of whether they are running on a dedicated workstation or one shared with other enterprise applications. They therefore attempt to execute their algorithms using the "brute force" method without regard to the capabilities of the host system or the impact that their actions may have on the other applications that share the host. Those NMS applications that do inspect certain statically configured aspects of their host system and adjust their behavior accordingly all suffer from the same shortcoming: it is impractical and imprecise to inspect all of the applicable settings on a host in order to project the system resources that will be available to an individual NMS application during its subsequent execution. By contrast, the present invention provides a measure that all NMS applications could use to behave as "good citizen" applications on every host where they execute, by using only those system resources that are not required by other applications residing on the same host.
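One way an NMS application might govern its PDU exchange once a saturation point is known is to stretch its polling interval so that the aggregate rate stays below that point. The Python sketch below is a hypothetical illustration: the function name min_poll_interval, its parameters, and the headroom fraction reserved for other applications are all assumptions and are not specified by the invention.

```python
def min_poll_interval(nodes: int, pdus_per_node: int,
                      saturation_rate: float, headroom: float = 0.8) -> float:
    """Shortest polling interval (seconds) that keeps traffic under the budget.

    nodes           -- number of nodes polled in each cycle
    pdus_per_node   -- PDUs exchanged with each node per cycle
    saturation_rate -- measured PDU saturation point, in PDUs per second
    headroom        -- assumed fraction of the saturation point the NMS may use
    """
    if saturation_rate <= 0:
        raise ValueError("saturation_rate must be positive")
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in (0, 1]")
    budget = saturation_rate * headroom      # PDUs per second the NMS allows itself
    return (nodes * pdus_per_node) / budget  # seconds needed for one full cycle
```

With 200 nodes, 10 PDUs per node and a measured saturation point of 500 PDUs per second, the default headroom yields an interval of about 5 seconds per polling cycle. Re-running the measurement periodically and recomputing the interval lets the application adapt as other software on the host comes and goes.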
In a first aspect of the invention, the invention provides for a method for adapting operation of a network management system application to the capacity of the host system on which it operates by determining a protocol data unit exchange saturation point of the host system and adapting operation of the NMS application so as to not exchange protocol data units with other nodes of a network in excess of the determined saturation point. The saturation point is determined by periodically exchanging protocol data units with all of the other nodes substantially simultaneously. The saturation point is then determined by detecting loss of protocol data units returned to the host system from at least one of the other nodes. The frequency of the exchange is used to determine the saturation point. The method therefore periodically increases the frequency of the protocol data unit exchange until a loss of returned data is detected.
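The ramp-until-loss procedure of this aspect can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the exchange callable, the fixed step size, the upper bound max_rate, and the choice to report the last loss-free rate as the saturation point are all hypothetical, and simulated_host merely stands in for a real network exchange.

```python
from typing import Callable, Optional


def find_saturation_rate(exchange: Callable[[int], int],
                         start_rate: int, step: int,
                         max_rate: int) -> Optional[int]:
    """Raise the exchange rate until echoed PDUs are lost.

    exchange(rate) is assumed to send `rate` PDUs in one test interval and
    return how many echoes came back. Returns the highest rate tried with
    no loss, or None if even the starting rate lost PDUs.
    """
    last_clean: Optional[int] = None
    rate = start_rate
    while rate <= max_rate:
        returned = exchange(rate)
        if returned < rate:      # loss detected: the host is saturated
            break
        last_clean = rate        # this rate was handled without loss
        rate += step             # increase the frequency and try again
    return last_clean


def simulated_host(rate: int) -> int:
    """Stand-in for a real exchange: a host that can echo at most 350 PDUs."""
    return min(rate, 350)
```

Calling find_saturation_rate(simulated_host, 100, 100, 1000) tries 100, 200 and 300 PDUs per interval without loss, detects loss at 400, and reports 300. A finer step size would narrow the estimate at the cost of a longer test.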
Another aspect of the present invention provides for an adaptable network management application system operable on a host system of a network which includes a protocol data unit exchange monitor that detects a saturation point in protocol data unit exchange between the NMS application and other nodes of the network and an operation adapter that adapts the NMS application so as to not exchange protocol data units with other nodes of the network in excess of the saturation point. The system includes a transceiver element to exchange protocol data units with all other nodes substantially simultaneously and a protocol data unit loss detector to detect loss of protocol data units returned to the host system from at least one other node. The loss detector then determines the saturation point from the frequency of the periodic exchange. The monitor increases the frequency of the periodic exchange until loss of protocol data units is detected.
The above, and other features, aspects and advantages of the present invention will become apparent from the following descriptions and attached drawings.