This application claims priority under 35 U.S.C. §§ 119 and/or 365 to 19995974 filed in Norway on Dec. 3, 1999; the entire content of which is hereby incorporated by reference.
The present invention relates to performance analysis of data networks.
In order to follow up on network performance, a network manager needs to monitor the network's vital parameters. He needs to know if any part of the network is congested at any time, or if the traffic growth in a part of the network will necessitate action in the immediate future to increase the capacity, restructure the network or modify the routing in order to avoid future perturbances in the network operation. If parts of the network are over-provisioned, he might want to reduce the capacity of certain links in order to reduce network operating costs.
Performance data can be used to detect problems that arise during network operation, or to detect trends in the network so that corrective action can be taken before a problem develops.
The data collected from the network can be exploited either manually by an operator, or automatically by report generators, correlation tools and even systems that respond to anomalies in the network by actively reconfiguring it to solve a detected problem.
Network elements maintain counters that can be used to get a picture of traffic, error rates etc. at that specific point in the network. The values of such counters can be retrieved in a variety of manners, depending on the capabilities of the network element, and the kind of management functions (protocols) that it supports.
The most common datacom management protocol is SNMP (Simple Network Management Protocol). This protocol supports retrieval of specific data objects from a network element in a query/reply fashion. Alternative protocols include FTP and Telnet.
It is often desirable to apply mathematical functions to data objects retrieved from network elements, or somehow compare the values of these data objects. This makes sense only if the values that participate in the calculation (or the comparison) are sampled at the same time.
As an example of such a calculation, assume that we retrieve the objects ifOutOctets (number of octets sent on the interface), ifInOctets (number of octets received on the interface) and ifSpeed (interface transmission rate in bits per second) from a network element, for a specific half-duplex interface, and that we wish to calculate Bandwidth Utilisation (BU) as the ratio between the traffic (ifOutOctets+ifInOctets) and the available bandwidth (ifSpeed).
BU=(ifOutOctets+ifInOctets)*8/ifSpeed
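The expression above can be illustrated with a minimal Python sketch. It rests on assumptions that the text does not state: the octet values entering the formula are taken to be per-second rates derived from two successive readings of a cumulative counter, and the counter is taken to be 32 bits wide and to wrap at 2**32. The function names octet_rate and bandwidth_utilisation are hypothetical and chosen for illustration only.

```python
# Assumption: counters are cumulative 32-bit values that wrap at 2**32.
COUNTER32_MODULUS = 2 ** 32


def octet_rate(count_start, count_end, seconds, modulus=COUNTER32_MODULUS):
    """Octets per second between two readings of a cumulative counter.

    The modulo handles at most one counter wrap between the readings.
    """
    if seconds <= 0:
        raise ValueError("the sampling interval must be positive")
    return ((count_end - count_start) % modulus) / seconds


def bandwidth_utilisation(out_rate, in_rate, if_speed):
    """BU = (ifOutOctets + ifInOctets) * 8 / ifSpeed.

    out_rate and in_rate are in octets per second, if_speed in bits per
    second, so the result is a dimensionless ratio.
    """
    if if_speed <= 0:
        raise ValueError("ifSpeed must be positive")
    return (out_rate + in_rate) * 8 / if_speed


if __name__ == "__main__":
    # Hypothetical readings taken 60 seconds apart on a 10 Mbit/s interface.
    out_rate = octet_rate(1_000_000, 16_000_000, 60)
    in_rate = octet_rate(2_000_000, 24_500_000, 60)
    print(bandwidth_utilisation(out_rate, in_rate, 10_000_000))
```

The result is meaningful only if both rates refer to the same time interval, which is the requirement the surrounding text places on the samples.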
Especially since a lot of the data retrieval is done using the query/reply paradigm, we cannot assume that samples for several data objects can be retrieved from network elements simultaneously. If the values used for ifOutOctets and ifInOctets in the above expression are not sampled simultaneously, the result might turn out to be significantly wrong.
There are several reasons why we cannot assume that multiple objects can be retrieved simultaneously. First of all, this would result in clusters of data retrievals around specific times. The computer system may not have the capacity to process that amount of data with an acceptable delay, and these bursts of network traffic may result in traffic delays and even network congestion. At the other end, the network element would need to respond virtually instantaneously to all the queries, which requires processing power that should instead be devoted to the network element's primary task, i.e. forwarding data.
What we really might want is to define measurements starting at a specific time, with a specific sampling frequency. However, by doing so we will get sampling clusters since a large number of the measurements will inevitably be defined to start at the same minute past the hour, with the same sampling frequency. It would therefore be better to randomise the start of the measurements within some reasonable interval.
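A minimal Python sketch of such a randomised start follows. The width of the randomisation interval (spread), the uniform distribution, and the function names are illustrative assumptions; the text only calls for randomisation "within some reasonable interval".

```python
import random


def randomised_start(nominal_start, spread, rng):
    """Shift the nominal start time by a uniform offset in [0, spread).

    Times are in seconds. rng is a random.Random instance, passed in so
    that the offsets can be made reproducible.
    """
    return nominal_start + rng.random() * spread


def sample_times(start, period, count):
    """Nominal sampling instants: start, start + period, and so on."""
    return [start + k * period for k in range(count)]


if __name__ == "__main__":
    rng = random.Random()
    # Hypothetical case: five measurements, all requested for the top of
    # the hour with a 300-second period, spread over the first 60 seconds.
    for measurement in range(5):
        start = randomised_start(3600.0, 60.0, rng)
        print(measurement, sample_times(start, 300.0, 3))
```

Each measurement keeps its own sampling period, but the sampling instants of different measurements no longer coincide, so the retrievals are spread out instead of arriving in a burst.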
Because a timer cannot be read and reset simultaneously (except when this function is supported by the timer hardware), and because multiple timers are emulated in software, the actual measurement intervals will skew by some fraction of a second in each period.
Another problem that has to be addressed is that one might want to compare two data objects sampled at different intervals.
HP OpenView Network Node Manager is a network management tool, providing in-depth views of the network in a graphical format. The tool discovers network devices and provides a map of the network. The map indicates which devices and network segments are healthy and which areas need attention, e.g. if a device fails the Network Node Manager evaluates the event stream and pinpoints the cause of the failure. The Network Node Manager also includes an SNMP data collector that can be configured to retrieve data from network elements at specified time intervals, and provides graphing utilities for browsing both old data as well as incoming data in real time.
However, to our knowledge, HP OpenView snmpcollect/xnmgraph does not perform time normalisation as described in this document.
The fundamental idea presented in this document is to use linear interpolation to calculate measurement values for an arbitrary time, independently of the time the measurement was started, and the sampling frequency. This process is what we will refer to as time normalisation.
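The idea can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: each measurement is assumed to be held as a list of (time, value) pairs sorted by strictly increasing time, and the names interpolate_at and normalise are hypothetical.

```python
from bisect import bisect_right


def interpolate_at(samples, t):
    """Linearly interpolate a measurement value at time t.

    samples is a list of (time, value) pairs with strictly increasing
    times. No extrapolation is done: t must lie within the sampled range.
    """
    times = [time for time, _ in samples]
    if not times or t < times[0] or t > times[-1]:
        raise ValueError("t lies outside the sampled interval")
    i = bisect_right(times, t)
    if i == len(times):
        # t coincides with the last sample.
        return samples[-1][1]
    (t0, v0), (t1, v1) = samples[i - 1], samples[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def normalise(samples, grid):
    """Evaluate one measurement on a common time grid."""
    return [interpolate_at(samples, t) for t in grid]


if __name__ == "__main__":
    # Two hypothetical measurements started at different times, with
    # different sampling periods (60 s and 40 s).
    out_octets = [(3, 0), (63, 600), (123, 1200)]
    in_octets = [(17, 100), (57, 500), (97, 900), (137, 1300)]
    grid = [60, 120]
    print(normalise(out_octets, grid))
    print(normalise(in_octets, grid))
```

Once both measurements are expressed on the same grid, their values at each grid time can be combined or compared as if they had been sampled simultaneously.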
We present a set of computation stages and techniques that provide a performance analysis tool with data suitable for subsequent analysis: either performance data collected from the network, or data computationally derived from the collected data.
The exact scope of the invention is as defined in the appended patent claims.