This invention relates to the field of network analysis, and in particular to a system and method for determining a traffic flow between pairs of nodes of a network based on the amount of traffic on links (link loads) within the network.
The effective management of a network requires an understanding of traffic patterns within the network. Of particular significance is the traffic flow between pairs of nodes in the network. If it is known, for example, that two nodes exchange large amounts of data, it would be beneficial to provide a wide bandwidth channel between these nodes. Correspondingly, significant network resources need not be allocated to channels between nodes that rarely communicate with each other.
Network analysis tools, such as network simulators, often use the traffic flow between nodes to model the typical or expected behavior of the nodes, to facilitate such tasks as network planning, congestion analysis, performance diagnostics, ‘what-if’ analyses, and so on.
Generally, network devices that facilitate the transfer of messages across a network, such as routers and the like, include management/diagnostic reporting functions that are configured to report traffic statistics, such as the amount of data received and/or transmitted by the device. Special purpose devices, such as network ‘sniffers’ and the like, can be configured to collect details regarding the data being transferred, including, for example, the source and destination nodes of the data being transferred, but it is generally infeasible to provide such devices at each node of the network. Therefore, in a typical network environment, the amount of data communicated on links between nodes, herein termed link loads, can generally be obtained from the devices at each node, whereas the details regarding the origination and termination of the data being communicated on the links, herein termed traffic flow, are generally unknown, or only partially known.
A variety of techniques have been proposed for determining traffic flow based on link loads, commonly termed “loads-to-flow” processes. Given the amount of data originated and terminated at each node, the loads on the links between the nodes can generally be determined and/or estimated directly, based on factors such as the bandwidth between nodes, and so on. However, deducing the particular origination and termination of the traffic based on the amount of data flowing into and out of each node is not as straightforward, because it is difficult to distinguish data that merely passes through the node from data that is originated and/or terminated at the node.
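The asymmetry described above can be illustrated with a minimal sketch. The three-node chain, the route table `ROUTES`, the function `link_loads`, and all traffic amounts below are hypothetical and chosen only for illustration; the sketch assumes that the flow between each pair of nodes and the route it takes are known. Computing link loads from flows is then a simple summation, whereas two different sets of flows can produce identical link loads, so the loads alone do not identify the flows.

```python
# Hypothetical chain A -> B -> C; node names, routes, and amounts are
# illustrative assumptions, not taken from any measured network.
ROUTES = {
    ("A", "B"): [("A", "B")],
    ("B", "C"): [("B", "C")],
    ("A", "C"): [("A", "B"), ("B", "C")],  # this flow merely passes through B
}


def link_loads(flows):
    """Add each source-destination flow to every link along its route."""
    loads = {}
    for pair, amount in flows.items():
        for link in ROUTES[pair]:
            loads[link] = loads.get(link, 0) + amount
    return loads


# Two different sets of flows between the same nodes.
flows_1 = {("A", "B"): 5, ("B", "C"): 5, ("A", "C"): 5}
flows_2 = {("A", "B"): 10, ("B", "C"): 10, ("A", "C"): 0}

loads_1 = link_loads(flows_1)
loads_2 = link_loads(flows_2)

# Both sets of flows place 10 units on each link, so the measured link
# loads cannot tell them apart.
print(loads_1 == loads_2)
```

In this sketch, node B cannot tell from its link counters alone whether the traffic it forwards to C originated at B or merely passed through from A.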
Yin Zhang et al., for example, have proposed, in “FAST ACCURATE COMPUTATION OF LARGE-SCALE IP TRAFFIC MATRICES FROM LINK LOADS”, at SIGMETRICS '03, Jun. 10-14, 2003, in San Diego, Calif., the estimation of traffic flow between nodes based on “tomogravity”, which is a combination of tomographic and gravity-based estimation techniques. In Zhang's approach, a node's ‘gravity’ is based on the amount of traffic received at and/or transmitted from each ‘edge’ node, an edge node being defined as a node that is directly coupled to one or more devices that either originate or terminate traffic. That is, nodes that receive and/or transmit a significant amount of data are likely to originate or terminate traffic to and from each other. Zhang acknowledges that such a definition of ‘gravity’ leads to some inconsistencies (“outliers”), particularly at nodes that primarily serve to pass data from one link to another, such as a node used to provide access to a transoceanic channel. Zhang teaches techniques for identifying such outliers and eliminating them from the loads-to-flow determination based on the overall quantity of data received at, and/or transmitted from, each node. Thereafter, tomographic techniques are used to provide consistency among the flow estimates.
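The gravity-based portion of such an estimate can be sketched as follows. This is the commonly used proportional form of a simple gravity model, offered as an assumption for illustration and not necessarily the exact formulation of the cited paper: the estimated flow from one edge node to another is taken to be proportional to the product of the traffic sent by the first and the traffic received by the second. The function name `gravity_estimate`, the node names, and the traffic amounts are hypothetical.

```python
def gravity_estimate(sent, received):
    """Estimate each source-destination flow as
    sent[source] * received[destination] / total received traffic.

    Assumed simple gravity form: a node that transmits a lot and a node
    that receives a lot are presumed to exchange a lot with each other.
    """
    total = sum(received.values())
    return {
        (src, dst): sent[src] * received[dst] / total
        for src in sent
        for dst in received
    }


# Hypothetical per-node traffic totals at three edge nodes.
sent = {"X": 60, "Y": 30, "Z": 10}
received = {"X": 20, "Y": 50, "Z": 30}

estimate = gravity_estimate(sent, received)
print(estimate[("X", "Y")])  # 60 * 50 / 100 = 30.0
```

A node that mostly forwards traffic between links reports large totals without originating or terminating that traffic, so this kind of estimate overstates its flows; this is the ‘outlier’ case noted above.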
Goldschmidt has proposed, in “ISP BACKBONE TRAFFIC INFERENCE METHODS TO SUPPORT TRAFFIC ENGINEERING”, at ISMA 2000, the use of a linear programming model to determine the traffic flow between nodes based on the given constraints (measured parameters) and an objective function that is based on the number of hops between each pair of nodes in the network. The premise of this technique is that nodes that are closely linked (few hops) are more likely to communicate with each other than nodes that are distantly linked (many hops). This premise is generally true for ‘engineered networks’ that are designed to effect such close coupling between nodes that commonly communicate with each other, and for ‘geographic networks’ whose nodes are distributed to link geographic areas, because the amount of traffic between nodes is often correlated to the distance between them (e.g. a person is more likely to communicate with a person in the same country than in a distant country). It may not be true, however, for many other networks.
Often, the results of the above algorithms for determining traffic flow between nodes based on link loads are inconsistent with a user's expectations and/or assumptions. For example, a geographic/number-of-hops based algorithm fails to appreciate that the amount of commercial traffic flow between Chicago and New York is likely to be greater than the amount of traffic between Chicago and any of the locales at the nodes forming the links between Chicago and New York. In like manner, a node at a financial institution in New York with offices in Chicago and London may exhibit the same ‘outlier’ characteristics as a node in New York that provides a general-purpose link to London, even though the financial institution in New York may be generating and terminating most of the traffic. On the other hand, a network manager would generally be able to characterize each of the nodes of interest in a network relative to the likelihood of traffic being originated and/or terminated (sourced and/or sunk) at the node.
It would be advantageous to provide a determination of traffic flow between nodes of a network based on a ‘soft’ definition/assignment of gravity measures to nodes of a network, so as to facilitate traffic flow determinations using gravity measures that are based on ‘soft’ data, such as estimates based on demographics, informed guesses, past experiences, and so on, rather than, or in addition to, gravity measures that are algorithmically generated based on ‘hard’ data. It would also be advantageous to provide an interactive user interface that facilitates the input of such soft gravity measures, and the display of the resultant determined traffic flow, as well as facilitating the optional refinement of the gravity measures, based on the determined resultant flow.
These advantages, and others, can be realized by using the defined gravity measures to form ‘objectives’ that are to be optimized within a given set of constraints, rather than as one or more of the constraints that are to be imposed on the solution set. In a preferred embodiment, the determined traffic flow between nodes is constrained so as not to exceed the amount of measured traffic on each link between the nodes, while at the same time optimized to minimize a difference between the specified gravity at each node and the gravity resulting from the determined traffic flow. The specified gravities and measured link loads are used to form a set of linear equations that are processed to effect the optimization defined by the specified gravities, subject to the link load constraints. The determined traffic flows are presented to a user via a graphic user interface, using color and other graphic features to facilitate visualization of the traffic flows.
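One way to write such an optimization as a linear program is sketched below. All of the notation is introduced here for illustration and is an assumption, not the formulation of the preferred embodiment: x_{sd} is the determined flow from node s to node d, y_l is the measured load on link l, a_{l,sd} is 1 if the route from s to d uses link l and 0 otherwise, g_i is the gravity specified for node i, and e_i is an auxiliary variable that bounds the deviation at node i. The sketch further assumes that the gravity resulting from the determined flow is the total traffic sourced at the node, and that the difference is measured as an absolute deviation.

```latex
\begin{aligned}
\min_{x,\,e}\quad & \sum_{i} e_i \\
\text{subject to}\quad
& \sum_{(s,d)} a_{l,sd}\, x_{sd} \le y_l && \text{for every link } l, \\
& -e_i \le g_i - \sum_{d} x_{id} \le e_i && \text{for every node } i, \\
& x_{sd} \ge 0, \quad e_i \ge 0 .
\end{aligned}
```

The link loads appear only in the constraints, and the specified gravities appear only through the objective, so a ‘soft’ gravity that cannot be met exactly increases the objective value without making the problem infeasible.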
Throughout the drawings, the same reference numerals indicate similar or corresponding features or functions. The drawings are included for illustrative purposes and are not intended to limit the scope of the invention.