A data communications network may be considered to comprise a number of network nodes interconnected by data paths.
In packet-switched data networks, monitoring a data path, in other words estimating a condition such as the available bandwidth (also known as available capacity) end-to-end over a data path of the network, is useful in several contexts, including Service Level Agreement (SLA) verification, network monitoring and server selection.
There are two main ways to estimate link and/or path characteristics such as available bandwidth: passive monitoring and active monitoring.
A number of passive methods are known. In NetFlow and sFlow, for example, each network node samples the packets passing through it and then forwards the samples, or potentially only a subset of them, to a management workstation running a NetFlow or sFlow collector, which aggregates the results to present a network-wide view.
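The sample-and-collect principle described above can be illustrated with a minimal sketch. It is not the NetFlow or sFlow protocol: packets are reduced to a flow identifier, sampling is a deterministic 1-in-N selection chosen for reproducibility, and the names `sample_packets` and `collector_estimate` are hypothetical.

```python
from collections import Counter


def sample_packets(packets, sampling_interval):
    """Yield every sampling_interval-th packet (deterministic 1-in-N sampling).

    Assumption: each packet is represented only by its flow identifier.
    """
    for index, packet in enumerate(packets, start=1):
        if index % sampling_interval == 0:
            yield packet


def collector_estimate(samples, sampling_interval):
    """Scale the sampled per-flow counts back up to estimated packet counts."""
    counts = Counter(samples)
    return {flow: n * sampling_interval for flow, n in counts.items()}


# Usage: one node sees 600 packets of flow "a" followed by 400 of flow "b".
packets = ["a"] * 600 + ["b"] * 400
samples = list(sample_packets(packets, sampling_interval=10))
print(collector_estimate(samples, sampling_interval=10))
```

The collector only ever sees the samples, so its per-flow figures are estimates. Counts derived this way say nothing about latency or round-trip time, which is the limitation of passive monitoring discussed further on.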
Other measuring methods are known from the documents Prieto, A. G., Stadler, R.: “A-GAP: An Adaptive Protocol for Continuous Network Monitoring with Accuracy Objectives.” IEEE Transactions on Network and Service Management, Vol. 4, No. 1, pp. 2-12, 2007; and Prieto, A. G., Stadler, R.: “Monitoring Flow Aggregates with Controllable Accuracy.” Lecture Notes in Computer Science, Vol. 4787, pp. 64-75, 2007.
In the above documents, Stadler et al. propose a series of algorithms that allow network nodes to exchange device and port-related MIB (Management Information Base) counter values in an efficient manner while calculating a network-wide aggregate value, such as the top 5 flows in the network. The network nodes, in Stadler et al., are organized in a tree topology overlay on top of the physical links. The root node launches a request for a particular type of network-wide aggregate, and then at each level of the tree the nodes perform the required calculation and send the aggregated results from the leaves towards the root node.
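The idea of aggregating locally known counter values along a tree, from the leaves towards the root, can be sketched as follows. This is a simplified illustration only and not the algorithms of Stadler et al.: the `Node` class, its fields and `top_flows` are hypothetical, and each node forwards its complete partial aggregate, without any of the accuracy or overhead controls those documents describe.

```python
from collections import Counter


class Node:
    """A node in the aggregation tree holding locally known flow counters."""

    def __init__(self, name, local_flows, children=()):
        self.name = name
        self.local_flows = Counter(local_flows)  # flow id -> local counter value
        self.children = list(children)

    def aggregate(self):
        """Return the sum of this node's counters and those of its subtree."""
        total = Counter(self.local_flows)
        for child in self.children:
            total.update(child.aggregate())  # Counter.update adds the counts
        return total


def top_flows(root, k=5):
    """Network-wide aggregate requested at the root: the k largest flows."""
    return root.aggregate().most_common(k)


# Usage: a root node with two leaf nodes.
leaf_a = Node("a", {"f1": 10, "f2": 3})
leaf_b = Node("b", {"f1": 5, "f3": 8})
root = Node("root", {"f2": 4}, [leaf_a, leaf_b])
print(top_flows(root, k=2))
```

Every value combined here is a counter a node already holds, which is why a scheme of this kind remains a passive method.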
The solution proposed by Stadler et al. relies on values that are already known to the network nodes; it does not take active measurement methods into account and is therefore not suitable for performing active measurements. Further, passive monitoring cannot measure, for example, latency or round-trip time. In addition, centralized collection of performance counters obtained through passive monitoring at the distributed nodes does not scale well in large, dynamic networks.
Passive measurements need to be complemented by active measurements in order to evaluate the latency, round-trip time and packet loss between nodes. Measuring such values hop-by-hop between nodes that are placed further than the network edge requires adding dedicated measurement workstations or support for technology-specific protocols such as Label Switched Path (LSP)-ping.
In contrast to passive measurements, one can monitor the network using active monitoring, also called active measurement, techniques. Active measurements have been used for evaluating network performance for a long time. Recent advancements include how to measure parameters such as path available capacity. A method for measuring path available capacity utilizing two measurement nodes, one located at each end of a path, is known. The first node injects traffic into the network while the second node evaluates the effects of the network traffic on the packets it receives. The estimation of parameters such as round-trip time, latency or packet loss also requires active measurement methods.
Measuring available end-to-end bandwidth is typically done by active probing of the data path. The available bandwidth can be estimated by transmitting probe traffic, such as a train of User Datagram Protocol (UDP) probe packets, into the data path, and then analyzing the observed effects of other data packet communications, here denoted cross traffic, on the probe packets. Typically, large UDP probe packets having a specified inter-packet separation are transmitted. This kind of active measurement requires access to both sender and receiver hosts, referred to as sender and receiver nodes, but does not require access to any intermediate node(s) in the path between the sender and receiver nodes. Conventional approaches to active probing require the transmission of probe packet traffic into the data path of interest at a rate that is sufficient transiently to use all available bandwidth and cause congestion. The desired measure of the available bandwidth is determined based on the increase in delay due to the probe packets having experienced congestion between the sender and receiver nodes. The probe packet rate at which the data path delay begins increasing corresponds to the point of congestion, and thus is indicative of the available bandwidth.
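The relationship between probe rate and the onset of congestion can be sketched as below. This is an illustrative simplification under stated assumptions and not a specific published method: congestion is detected from the relative growth of the inter-packet separation between sender and receiver (which grows when delay increases from packet to packet), the 5% threshold is arbitrary, and `synthetic_train` is a hypothetical generator that models a path with a fixed available bandwidth and no noise.

```python
def mean_strain(send_times, recv_times):
    """Average relative growth of the inter-packet gap over one probe train."""
    send_gaps = [b - a for a, b in zip(send_times, send_times[1:])]
    recv_gaps = [b - a for a, b in zip(recv_times, recv_times[1:])]
    strains = [(r - s) / s for s, r in zip(send_gaps, recv_gaps)]
    return sum(strains) / len(strains)


def estimate_available_bandwidth(trains, threshold=0.05):
    """Return the highest probed rate below the first congested rate.

    trains: iterable of (rate_bps, send_times, recv_times).
    threshold: assumed strain level above which a train counts as congested.
    Returns None if even the lowest probed rate appears congested.
    """
    estimate = None
    for rate, send_times, recv_times in sorted(trains, key=lambda t: t[0]):
        if mean_strain(send_times, recv_times) > threshold:
            break  # delay has started to increase: point of congestion
        estimate = rate
    return estimate


def synthetic_train(rate_bps, available_bps, packets=10, size_bits=12000,
                    base_delay=0.010):
    """Hypothetical test data: gaps stretch once the rate exceeds capacity."""
    send_gap = size_bits / rate_bps
    recv_gap = max(send_gap, size_bits / available_bps)
    send_times = [i * send_gap for i in range(packets)]
    recv_times = [base_delay + i * recv_gap for i in range(packets)]
    return rate_bps, send_times, recv_times


# Usage: probe at 10 to 60 Mbit/s over a path modelled with 40 Mbit/s available.
rates = [10e6, 20e6, 30e6, 40e6, 50e6, 60e6]
trains = [synthetic_train(r, available_bps=40e6) for r in rates]
print(estimate_available_bandwidth(trains))
```

The estimate can only be as fine as the spacing between probed rates, and obtaining it requires sending trains at rates above the available bandwidth, which illustrates why such probing transiently congests the path.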
Active measurement techniques such as BART (Bandwidth Available in Real Time) are good at providing an end-to-end view of the network, but they cannot help to pinpoint the actual location of a performance problem without installing additional measurement infrastructure into the network.
One problem with the known active measurement methods for estimating network performance is that they inject a substantial amount of probe traffic, and this overhead may degrade the performance of the network being measured.