1. Field of the Invention
The present invention relates to computer networks, and, more particularly, to a method and apparatus for estimating a network performance metric, such as delay and delay-jitter (jitter), between a collection of pairs of network elements (e.g., routers) in a network.
2. Description of the Related Art
As computer networks grow in terms of the number of network elements (e.g., routers) contained therein, the measurement of network performance metrics becomes increasingly important. By measuring such metrics, network parameters can be tuned in order to provide optimal performance. Moreover, the network's architecture can be adjusted and growth planned to allow the network to grow in a controllable fashion. One such metric is the delay experienced by data packets flowing between certain of a network's routers (i.e., the travel time between the routers). Another is the jitter, or deviation in delay, experienced by such data packets. Thus, there is a growing need to continuously monitor network delay and jitter between multiple pairs of routers in a network such as an enterprise or service-provider network. In service-provider networks in particular, such performance monitoring is needed in order to verify service-level agreements between a service provider and its customers.
Unfortunately, current methods of monitoring are not as useful as might be desired. For example, one current method for monitoring network delay and jitter requires the measurement of delay and jitter between every specified pair of routers by exchanging probe packets between those routers. As will be apparent to one of skill in the art, the number of pairs of routers that need to be monitored in such a scenario grows quadratically with N, where N is the number of network routers making up the network. Thus, such a measurement technique involves measurements on the order of N² (O(N²)).
Once generated, the measurement data is collected and processed. The measurement data can then be made available to other applications, such as for the tuning of network parameters and the like. As can be seen from the complexity of the technique (O(N²)), this measurement scheme does not scale well in large networks, because the number of specified pairs of routers to be monitored increases dramatically as routers are added. In such cases, the network traffic resulting from probe packets can be large, and the bandwidth it consumes can make the scheme unsustainable.
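The quadratic growth described above can be illustrated with a minimal sketch. It assumes the worst case in which every pair of routers is specified for monitoring and counts one measurement per unordered pair; the function name `full_mesh_pairs` is hypothetical and is used only for illustration.

```python
from math import comb


def full_mesh_pairs(num_routers: int) -> int:
    """Return the number of unordered router pairs, N * (N - 1) / 2.

    Assumption (not from the text): every pair of routers is monitored,
    and each pair is counted once regardless of probe direction.
    """
    return comb(num_routers, 2)


# The pair count grows roughly a hundredfold for each tenfold increase in N.
for n in (10, 100, 1000):
    print(f"N = {n:>4}: {full_mesh_pairs(n):>6} pairs")
```

If probes are counted separately in each direction, the figures double, but the growth remains O(N²).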
This problem of scalability may be further compounded by the fact that networks being deployed are diff-serv (DS) enabled. Diff-serv enabled networks offer a range of data transfer services that are differentiated on the basis of the performance experienced by packets belonging to a particular aggregated set of applications or flows. An application requests a specific level of performance on a packet-by-packet basis by marking the type-of-service (ToS) field in each IP packet with a specific value, called a DS-codepoint. This value effectively specifies how an enterprise network or a service-provider network processes and forwards each packet at each hop in the network. In such networks, delay and jitter characteristics must be monitored for every DS-codepoint in use in the network.
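The marking described above, and its effect on the measurement load, can be sketched as follows. The sketch relies on the standard layout in which the DS-codepoint occupies the upper six bits of the former IPv4 ToS octet; the function names are hypothetical, and the load estimate assumes that every unordered pair of routers is monitored once per codepoint.

```python
from math import comb


def dscp_to_tos(dscp: int) -> int:
    """Place a 6-bit DS-codepoint in the upper six bits of the ToS octet."""
    if not 0 <= dscp <= 63:
        raise ValueError("a DS-codepoint is a 6-bit value (0 to 63)")
    return dscp << 2


def tos_to_dscp(tos: int) -> int:
    """Recover the DS-codepoint from a ToS octet, ignoring the low two bits."""
    return (tos >> 2) & 0x3F


def probe_streams(num_routers: int, num_codepoints: int) -> int:
    """Measurements needed if every router pair is probed per codepoint.

    Assumption (not from the text): all unordered pairs are monitored.
    """
    return comb(num_routers, 2) * num_codepoints


# Codepoint 46 (commonly used for expedited forwarding) gives ToS octet 0xB8.
print(hex(dscp_to_tos(46)))
# 100 routers and 3 codepoints in use triple the full-mesh measurement count.
print(probe_streams(100, 3))
```

Under these assumptions, each additional DS-codepoint in use adds another full set of pairwise measurements, which is why diff-serv compounds the scalability problem.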
Moreover, it is often important which routers and links are involved in the measurements that are made. For example, it is usually preferable to make the necessary measurements from a smaller number of routers. This simplifies the administration of the network measurement tools because there are fewer installations and fewer instances of the tools to run. Additionally, it is often desirable to avoid sending measurement traffic over already-congested links. Preferably, such traffic is sent over links having at least a modicum of excess bandwidth, so as not to interfere with actual network traffic. Meeting these and other such objectives simplifies network administration by reducing both the need for user interaction and the network's administrative burden.
What is therefore needed is a method and apparatus for the measurement of delays encountered by network traffic in traversing a network, the complexity of which preferably grows at a rate less than O(N²). More preferably, the complexity of the measurement scheme should grow at most linearly with the number of network routers (designated herein as N) and the number of network links (designated herein as M). Moreover, such a technique should address the situation in which an operator desires some modicum of control over the nodes and links over which measurements are made.