Bandwidth measurements in computer networks include measurements of available bandwidth, bottleneck bandwidth, and link bandwidth. In the lexicon of such measurements, “peak bandwidth” usually refers to the maximum throughput theoretically achievable along any path at any time, while “available bandwidth” refers to the maximum throughput possible along a path under current network conditions. “Link bandwidth” measurement refers to measurements of bandwidth along each link in a given communication path to a destination.
Tools such as “pathchar” (see V. Jacobson, “pathchar—a tool to infer characteristics of Internet paths”, presented at the Mathematical Sciences Research Institute (MSRI), April 1997), “pchar” (see B. A. Mah, “pchar”, available at http://www.employees.org/~bmah/Software/pchar/, June 2001), “clink” (see A. B. Downey, “Using pathchar to estimate Internet link characteristics”, ACM Sigcomm, August 1999), and “nettimer” (see K. Lai and M. Baker, “Measuring link bandwidths using a deterministic model of packet delay”, ACM Sigcomm 2000, August 2000) use the variation of the observed one-way delay with increasing packet size to measure the link bandwidth. After measuring the link-specific bandwidth(s), it is easy to find the bottleneck bandwidth. However, this method is not attractive for measuring only the bottleneck bandwidth or the available bandwidth, because the probing consumes a substantial amount of bandwidth and most of the information obtained would be redundant. Also, the efficacy of such methods for estimating link bandwidths decreases with increasing path length.
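The relationship these tools exploit can be illustrated with a minimal sketch for a single link. It assumes that the minimum delay observed for each packet size is free of queuing, so that delay grows linearly with size and the slope is the reciprocal of the link bandwidth. The function name and the input layout are hypothetical and are not taken from any of the cited tools, which additionally difference the per-hop results along the path.

```python
def fit_link_bandwidth_bps(samples):
    """Estimate bandwidth from (packet_size_bytes, delay_seconds) samples.

    Assumption: the smallest delay seen for a given size is queue-free,
    so min_delay(size) = latency + size * 8 / bandwidth.
    """
    # Keep only the minimum delay observed for each packet size.
    min_delay = {}
    for size, delay in samples:
        if size not in min_delay or delay < min_delay[size]:
            min_delay[size] = delay

    sizes = sorted(min_delay)
    if len(sizes) < 2:
        raise ValueError("need at least two distinct packet sizes")

    # Ordinary least-squares slope of delay versus size (seconds per byte).
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(min_delay[s] for s in sizes) / n
    sxx = sum((s - mean_x) ** 2 for s in sizes)
    sxy = sum((s - mean_x) * (min_delay[s] - mean_y) for s in sizes)
    slope = sxy / sxx
    if slope <= 0:
        raise ValueError("delay does not increase with packet size")

    return 8.0 / slope  # bits per second
```

Taking the per-size minimum is one simple way to discard samples inflated by queuing; it also shows why the approach is costly, since many probes per size are needed before the minimum is trustworthy.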
The classic packet-pair technique (see V. Jacobson, “Congestion avoidance and control”, ACM Sigcomm, August 1988) is mainly used in different forms for measuring bottleneck bandwidth. Examples of such use may be found in V. Paxson, “End-to-end Internet packet dynamics”, IEEE/ACM Transactions on Networking, 1993; R. L. Carter and M. Crovella, “Dynamic server selection using bandwidth probing in wide area networks”, BU-CS-96-007, March 1996; C. Dovrolis et al., “What do packet dispersion techniques measure?”, IEEE Infocom, April 2001; and J. C. Bolot, “Characterizing end-to-end packet delay and loss in the Internet”, Journal of High Speed Networks, 1993. The fundamental idea behind the packet-pair technique is that two packets sent between a sender node and a receiver node at a rate higher than the bottleneck bandwidth will be spread out in time at the bottleneck by the transmission delay of the first packet, and the spacing will remain unaltered after the bottleneck. If the time spacing between the arrival of the last bit of the first packet and the last bit of the second packet is tb at the receiver, then the bottleneck bandwidth will be b/tb, where b is the size of the second packet. Variation of the value of tb can lead to an estimate of the available bandwidth, but if the goal of the experiment is to estimate the bottleneck bandwidth, the effect of noise has to be removed.
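The b/tb relationship can be written out as a short sketch. The function and parameter names are hypothetical, and the size is assumed to be given in bytes and converted to bits so that the result is in bits per second.

```python
def packet_pair_bandwidth_bps(second_packet_bytes, first_arrival_s, second_arrival_s):
    """Return b / tb for one packet pair.

    first_arrival_s and second_arrival_s are the receiver timestamps of the
    last bit of the first and second packet (assumed to share one clock).
    """
    tb = second_arrival_s - first_arrival_s
    if tb <= 0:
        # Reordered packets or insufficient clock resolution: no estimate.
        raise ValueError("non-positive spacing between the two packets")
    return second_packet_bytes * 8 / tb
```

For example, a 1500-byte second packet arriving 1.2 ms after the first corresponds to an estimate of about 10 Mbps. A single pair yields one noisy sample, so in practice many samples would be collected and filtered.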
Several issues have to be addressed for practical implementation of the packet-pair technique. As pointed out by Dovrolis et al., devising a correct method to detect the bottleneck bandwidth has become challenging over the years, primarily because the bottleneck bandwidth is no longer a standard value (e.g., 56 kbps for modems, ISDN links or T1 links) and can take on any value up to the physical capacity of the links in the communication path.
If the sending rate is smaller than the bottleneck bandwidth, then the packets may not get queued up at the bottleneck link and the spacing of the consecutive packets will reflect the sending rate, rather than the bottleneck bandwidth. In such a case, the size of the packets can be made larger to force the packets to queue up at the bottleneck. The disadvantage of this scheme is that the probability of cross traffic arriving at the bottleneck link during the transmission of the first probe packet increases as the packet size gets larger and, hence, the spacing between consecutive probe packets will be longer, indicating (falsely) a lower estimated bandwidth.
If packets are dropped or reordered, no result can be obtained, and there is no exact method to remove these effects. Increasing the number of samples or varying the size of the probe packets can only reduce the effect.
One basic assumption behind using the packet pair technique for correct estimation of the bottleneck bandwidth is that the probe packets should get queued one after another at the bottleneck. However, if competing traffic is present, then this may not always be true. So, statistical measures need to be devised, as provided by Carter and Crovella and Dovrolis et al., to remove the effects of noise due to competing traffic and estimate the correct value.
Paxson has noticed that consecutive probe packets may not follow the same path in ISDN links due to the use of multiple channels, leading to incorrect estimates. This can also happen due to load balancing or route changes, where the assumption that the path followed by consecutive probe packets is the same becomes invalid.
By sending a set of packets whose size is one greater than the number of channels in the multi-channel link, the effects of multi-channel links can be addressed. This technique, called Packet Bunch Mode (PBM), was proposed by Paxson. Because route changes may be infrequent, the outliers they create may not persist, and good statistical measures can therefore remove their effect. The effect of load balancing can also be addressed using the PBM technique.
There is an unfortunate side effect of using PBM or packet trains. Dovrolis et al. have found that increasing the length of packet trains can lead to under-estimation of capacity. This is because packet trains undergo more dispersion (spacing) than packet pairs, due to the presence of cross traffic.
When ΔTb (the time interval between consecutive probe packets at the bottleneck) is altered in the links after the bottleneck, erroneous estimates result. This can happen for the following reasons:
1. Asymmetric paths/links: When probe packets are sent by a source and echoed by the receiver, the spacing between the packets received back at the sender may not always reflect the spacing on the forward path. The bottleneck bandwidth in the reverse path can differ from that in the forward path; this can happen due to an asymmetric path or asymmetric links such as ADSL and satellite links. Acknowledgement (ACK) compression (see L. Zhang et al., “Observations on the dynamics of a congestion control algorithm: The effects of two way traffic”, ACM Sigcomm, September 1991) and processing delay at the receiver before echoing probe packets can further distort the spacing. Therefore, a receiver-based scheme, in which the spacing between probe packets is measured at the receiver, would tend to be more accurate.
2. Congestion in downstream nodes: This can lead to under-estimation of the bottleneck bandwidth when packets are delayed further, or to over-estimation due to timing compression (see Paxson). Dovrolis et al. have indicated that the latter effect becomes more significant when the size of the probe packets is small.
Statistical methodologies combined with the packet train approach can effectively minimize these effects.
If the bottleneck bandwidth is too large to be measured using the system clock, then the estimate will not be correct for high bandwidth values. A solution is to send a bunch of packets so that the total time spacing will be greater than the clock resolution. This effect can also be addressed using probe packets of larger size. Both solutions will incur noise due to interfering traffic at the bottleneck link.
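The size of the bunch needed for a given clock can be estimated with a small calculation. The sketch below assumes equal-sized packets that leave the bottleneck back to back, so that a train of N packets spans N - 1 transmission times; the function name and the optional safety margin are illustrative assumptions.

```python
import math

def min_train_length(capacity_bps, packet_bytes, clock_resolution_s, margin=1.0):
    """Smallest number of packets whose total spacing reaches
    margin * clock_resolution_s at the given bottleneck capacity."""
    if capacity_bps <= 0 or packet_bytes <= 0 or clock_resolution_s <= 0:
        raise ValueError("all inputs must be positive")
    # Spacing contributed by one packet at the bottleneck, in seconds.
    gap_s = packet_bytes * 8 / capacity_bps
    gaps_needed = math.ceil(margin * clock_resolution_s / gap_s)
    # A train has one more packet than gaps; a pair is the minimum.
    return max(2, gaps_needed + 1)
```

With a 10 ms clock and 1500-byte packets, a 100 Mbps bottleneck would need a train of 85 packets, whereas a 1 Mbps bottleneck can be measured with a single pair. The same formula shows the alternative of enlarging the packets, since a larger packet_bytes increases gap_s.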
Due to changes in infrastructure or routing table changes, bottleneck bandwidth may change. However, as these effects will not be persistent, statistical measures can easily detect this.
User-level time stamping can produce over-estimations when the kernel delivers packets back to back to the application layer.
Dovrolis et al. and Carter and Crovella have shown that the histogram of bottleneck bandwidth estimates can have multiple modes, and that some local modes, which depend more on the cross traffic, are stronger than the mode for the bottleneck link. This is true even under the assumption that a single channel exists between the sender and the receiver. Paxson has attributed the multiple modes present in the observations to a change of the bottleneck link speed during the probing period or to the presence of multiple channels. However, interpreting the data based on this assumption alone will be erroneous, given that multiple modes occur due to the presence of cross traffic. Dovrolis et al. and Carter and Crovella have both used filtering techniques to eliminate wrong modes.
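How several modes show up in a set of estimates can be illustrated with a simple histogram. The sketch below is not the filtering used by either group; it assumes fixed-width bins and reports every bin whose count is strictly greater than that of both neighbouring bins, with hypothetical function and parameter names.

```python
from collections import Counter

def local_modes(estimates_bps, bin_width_bps):
    """Return (bin_center_bps, count) for each local mode, strongest first."""
    if bin_width_bps <= 0:
        raise ValueError("bin width must be positive")
    # Assign each estimate to a fixed-width bin.
    counts = Counter(int(e // bin_width_bps) for e in estimates_bps)
    modes = []
    for b, n in counts.items():
        # A local mode is a bin taller than both of its neighbours.
        if n > counts.get(b - 1, 0) and n > counts.get(b + 1, 0):
            modes.append(((b + 0.5) * bin_width_bps, n))
    return sorted(modes, key=lambda m: -m[1])
```

For the estimates 4.2, 4.5, 4.7, 4.8, 5.1, 9.9, 10.1, 10.2 and 10.4 Mbps with 1 Mbps bins, the sketch reports a mode near 4.5 Mbps with four samples ahead of a mode near 10.5 Mbps with three. Picking the strongest mode would therefore be unsafe whenever the stronger mode is caused by cross traffic.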
Bprobe (see Bolot, supra, and B. Carter, “bprobe and cprobe”, available at http://cs-people.bu.edu/carter/tools/Tools.html) estimates the maximum possible bandwidth along a given path, and cprobe estimates the current congestion along a path. Currently these tools rely on two features of the IRIX operating system for SGI hardware:
A high-precision timer, which provides finer-granularity timing of the probe packets. Specifically, where the usual timer resolution of a system clock is tens of milliseconds, these tools are based on an SGI memory-mapped device having a resolution of 40 nanoseconds.
The ability to change the priority of the process to facilitate accurate timing, such that the measurement process is not context-switched out while measuring.
Bprobe uses filtering to take care of underestimated and overestimated values. Their approach is based on simple union or intersection of different estimates obtained in the simulations. The union and intersection are done with different sets of measurements, with each set consisting of varying sized probe packets. The intersection filtering tries to find the intersection of the sets i.e., the estimate that occurs in all sets. The union filtering method combines overlapping intervals and selects an interval as the final one if enough sets contribute to it.
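The two filters can be approximated with a simplified sketch. It is not bprobe's implementation: each estimate is assumed to be widened into an interval of plus or minus a relative tolerance, the candidate values are the estimates themselves, and a candidate is "supported" by a set when any interval of that set covers it. All names and the default tolerance are hypothetical.

```python
def _best_supported(sets_of_estimates, tolerance):
    """Return (candidate, number_of_supporting_sets) with the most support."""
    intervals = [
        [(e * (1 - tolerance), e * (1 + tolerance)) for e in s]
        for s in sets_of_estimates
    ]
    best = None
    for c in sorted(e for s in sets_of_estimates for e in s):
        support = sum(
            1 for set_intervals in intervals
            if any(lo <= c <= hi for lo, hi in set_intervals)
        )
        if best is None or support > best[1]:
            best = (c, support)
    return best

def intersection_filter(sets_of_estimates, tolerance=0.05):
    """Accept a value only if every measurement set supports it."""
    best = _best_supported(sets_of_estimates, tolerance)
    if best is not None and best[1] == len(sets_of_estimates):
        return best[0]
    return None

def union_filter(sets_of_estimates, min_sets, tolerance=0.05):
    """Accept the best-supported value if at least min_sets sets support it."""
    best = _best_supported(sets_of_estimates, tolerance)
    if best is not None and best[1] >= min_sets:
        return best[0]
    return None
```

In this sketch the intersection filter is the strict case of the union filter with min_sets equal to the number of sets, which is why it can fail to return any value when one set is dominated by noise.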
Pathrate (see Dovrolis et al., supra, and C. Dovrolis, “pathrate: A measurement tool for the capacity of network paths”, available at http://www.cis.udel.edu/~dovrolis/bwmeter.html (July 2001)) is a receiver-based tool that tries to find the capacity mode of the path (i.e., the mode corresponding to the bottleneck bandwidth value). Pathrate uses UDP packets for probing the path's bandwidth, and it also establishes a TCP connection between the two hosts for control purposes. The following features have been utilized by Dovrolis et al. to eliminate unwanted modes:
With small packet trains (a length of 2 is a packet pair), some modes higher than the capacity mode appear.
When longer packet trains are sent for estimation, modes lower than the capacity mode appear; these local modes are termed the Sub-Capacity Dispersion Range (SCDR). (Because longer packet trains experience more cross traffic, under-estimation occurs.)
When the packet trains are very long, the distribution becomes unimodal, the corresponding mode lies in the SCDR, and the mode does not change as the length of the packet train varies.
Dovrolis et al. have used these observations to determine the capacity mode and have implemented their technique in pathrate. Pathrate gives accurate results, and the level of accuracy depends on the resolution of the bandwidth measurements. This work is quite robust in the sense that it includes measures to handle cross traffic, unlike bprobe, whose solution does not utilize any property of the variation of the observed values due to the presence of cross traffic.
However, there are several issues in using pathrate for actual measurements, namely:
1. It is important to run pathrate from relatively idle hosts. It should not be run if CPU- or I/O-intensive processes are running, because they will interact with pathrate's user-level packet time-stamping and the results obtained will not be accurate. If pathrate runs on a machine devoted to significant processing, it will steal many CPU cycles from the other important processes, which are the most CPU- and I/O-intensive ones. This implies that a separate module on the same LAN would be the ideal choice for measurements, so as to offload the router.
2. For heavily loaded paths, pathrate can take a long time (about 30 minutes) until it reports a final estimate.
3. Pathrate is a receiver-based scheme, which implies that it cannot be used for measurements outside the network, where the senders and the receivers do not cooperate.
Bottleneck bandwidth gives the capacity of the path, i.e., the maximum bandwidth achievable in the absence of cross traffic, while the available bandwidth is the maximum throughput that can be obtained, given the current network conditions.
Assuming rate-allocating servers (RAS) (see, S. Keshav, “A control-theoretic approach to flow control”, ACM Sigcomm, September 1991), the packet pair technique can give an idea regarding the fair share of bandwidth or the available bandwidth. However, the queues in the Internet are mainly FCFS servers. Accordingly the packet pair technique will not be useful for measuring available bandwidth.
Some of the methods used in the prior art for determining the available bandwidth are the following:
1. cprobe, a tool developed by Carter and Crovella that calculates the time taken to transfer a packet train of eight packets and uses that value, along with the total number of bytes transferred, to determine the available bandwidth. For that, the bottleneck bandwidth needs to be determined first, so that the sending rate at the sender is greater than the bottleneck bandwidth.
2. The ssthresh variable in TCP's slow-start phase, which should ideally be set to the product of the connection's RTT and the available bandwidth, can be determined from the dispersion of the first three or four ACKs (see Dovrolis et al.).
3. Based on the idea that the variation of the end-to-end delay of a packet is due to the variation of queuing at the intermediate routers, Paxson has used the variation of one-way transit time (OTT) to estimate the available bandwidth.
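The calculation behind the first method can be sketched as bytes transferred divided by the time taken. The sketch below is not cprobe's code: the function name is hypothetical, and it assumes that only the bytes arriving after the first packet are counted, because the measured interval runs from the first arrival to the last.

```python
def train_throughput_bps(arrival_times_s, packet_sizes_bytes):
    """Bits received after the first packet divided by the train duration."""
    if len(arrival_times_s) != len(packet_sizes_bytes):
        raise ValueError("one arrival time is needed per packet")
    if len(arrival_times_s) < 2:
        raise ValueError("a train needs at least two packets")
    elapsed = arrival_times_s[-1] - arrival_times_s[0]
    if elapsed <= 0:
        raise ValueError("train duration must be positive")
    # The first packet only starts the clock, so its bytes are excluded.
    bits = 8 * sum(packet_sizes_bytes[1:])
    return bits / elapsed
```

For a train of eight 1000-byte packets arriving 1 ms apart, this gives about 8 Mbps.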
The basic assumption behind the first two methods is that the packet dispersion suffered by long packet trains is inversely proportional to the available bandwidth. Using a model of a single-link network, Dovrolis et al. have shown why the dispersion suffered by long packet trains is not inversely proportional to the available bandwidth. Through experiments, Dovrolis et al. have also found that any method similar to cprobe will over-estimate the available bandwidth, though no solution was proposed for correctly determining the available bandwidth.
The third method is difficult to implement, as this method will give rise to some important issues, one of which is measurement of OTT that requires a detailed clock synchronization mechanism between the source and the receiver.
NETBLT (Network Block Transfer Protocol) (see D. D. Clark, M. L. Lambert, L. Zhang, RFC 998, “NETBLT: A Bulk Data Transfer Protocol”) is a transport-level protocol intended for rapid transfer of large quantities of data between two end points of the Internet. The two end points negotiate the transmission parameters (burst size, burst interval and number of outstanding buffers) and deliver data on a buffer-by-buffer basis rather than via a window-based scheme. However, NETBLT lacks a method for dynamic selection and control of transmission parameters so as to modify the transmission scheme based on congestion in the network. Moreover, buffer-based schemes as used in NETBLT cannot be adopted for real-time data transfer (where estimated available bandwidth information would be necessary).