This invention relates to the field of data communication systems, and in particular to a method for determining the physical topology of a network of data communication devices.
Operators of many data communications networks are ignorant of the topology of their networks. However, the operators need to know the topology in order to properly manage the network. Accurate diagnosis and correction of many faults requires such knowledge. This is described in the article “Network Diagnosis By Reasoning in Uncertain Nested Evidence Spaces”, by N. W. Dawes, J. Altoft and B. Pagurek, IEEE Transactions on Communications, February 1995, Vol 43,2-4: pp 466-476.
Network management teams that do know the very recent topology of their network do so by one of three methods: an administrative method, an approximate AI method as described in U.S. Pat. No. 5,727,157 issued Mar. 10, 1998, invented by Orr et al, and PCT publication WO 95/06,989, both assigned to Cabletron, and the Loran traffic method as described in U.S. Pat. No. 5,926,462 issued Jul. 20, 1999 and entitled “Method of Determining Topology of A Network of Objects Which Compares the Similarity of the Traffic Sequences/Volumes of a Pair of Devices”. The data protocols the latter two use are described in the text “SNMP, SNMPv2 and CMIP: The Practical Guide to Network Management Standards”, W. Stallings, Addison-Wesley, 1993 and updates.
The administrative methods require an entirely up-to-date record of the installation, removal, change in location and connectivity of every network device. Every such change in topology must be logged. These updates are periodically applied to the database which the operators use to display or examine the network topology. However, in almost all such systems the topology actually available to the operators is that of the previous day or days, because of the time lag in entering the updates. This method has the advantage that a network device discovery program need not be run to find out what devices exist in the network, but has the disadvantage that it is almost impossible to keep the database from which the topology is derived both free of error and entirely current.
The Cabletron method theoretically provides only one of the necessary elements for a method of determining network topologies: the deduction of a possible direct or transitive connection. However, there are at least six problems with the Cabletron method even with this very limited goal.
1: problems with invalid source addresses and addresses of moved objects. This makes this method unusable under many conditions as it gives contradictory and incorrect results.
2: the requirement that network management reporting by devices be done by the device itself, not by a proxy agent which will reply using a different source address. Although proxy agents are not common, any network containing a device that reports through one is unmappable by this method.
3: the requirement that the source addresses of reporting devices appear commonly in network traffic, so that each reporting device has a reasonable chance of picking up the addresses of all the reporting devices it can. This is a major problem. In direct contrast, the only use the method of the present invention makes of the addresses of reporting data-relay devices directly available in tables is to define more up ports when it already knows some (see below).
4: a total inability to deal with the existence of unmanaged devices lying between managed devices.
5: computational complexity in very large networks means the Cabletron method takes so long to run that the network may well have changed before the calculations are complete.
6: the inability to deal with multiple connections between devices, for example between a switch and a segmented repeater.
The approximate AI methods use routing/bridging information available in various types of devices (e.g. data routers contain routing tables). This routing information carries a mixture of direct information about directly connected devices and indirect information. The AI methods attempt to combine the information from all the devices in the network. This requires that a network device discovery program be run to find out what devices exist in the network, or that such a list of devices be provided to the program. These approximate AI methods require massive amounts of detailed and very accurate knowledge about the internal tables and operations of all data communications devices in the network. These requirements make the AI methods complex, difficult to support and expensive. In addition, devices that do not provide connectivity information, such as ethernet or token ring concentrators, must still be configured into the network topology by the administrative method. Finally, the search of the AI methods has to be guided by expert humans for it to be successful, and even then there are many classes of topology it cannot determine. Consequently the approximate AI methods are not in general use.
The Loran traffic method exploits the fact that traffic flowing from one device to another device can be measured both as the output from the first and as the input to the second. Should the volume of traffic be counted periodically as it leaves the first and as it arrives at the second, the two sequences of measurements of the volumes will tend to be very similar. The sequences of measurements of traffic leaving or arriving at other devices will, in general, tend to be different because of the random (and fractal) nature of traffic. Therefore, the devices which have the most similar sequences will be most likely to be interconnected. Devices can be discovered to be connected in pairs, in broadcast networks or in other topologies. This method is therefore extremely general. However it depends on reasonably accurate measurements of traffic being made in both devices. In practice some devices do not report any information at all, let alone traffic. Other devices report incorrect values of traffic.
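By way of illustration only, the traffic comparison described above can be sketched as follows. The similarity measure actually used by the Loran method is not given in this text, so the sketch assumes a simple normalised sum of absolute differences; the function names, port labels and traffic counts are all hypothetical.

```python
def dissimilarity(a, b):
    """Normalised sum of absolute differences between two traffic sequences.

    0.0 means the sequences are identical. This measure is an assumption
    made for illustration, not the measure of the Loran method.
    """
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def most_similar_pairs(outputs, inputs):
    """For each sending port, pick the receiving port with the closest sequence."""
    result = {}
    for out_port, out_seq in outputs.items():
        result[out_port] = min(
            inputs, key=lambda p: dissimilarity(out_seq, inputs[p])
        )
    return result


# Hypothetical periodic traffic counts leaving two ports ...
outputs = {
    "A.1": [120, 40, 300, 15, 90],
    "B.2": [10, 500, 20, 480, 30],
}
# ... and arriving at two other ports, with small measurement errors.
inputs = {
    "C.1": [11, 498, 20, 481, 29],
    "D.3": [119, 41, 300, 15, 91],
}

pairs = most_similar_pairs(outputs, inputs)
print(pairs)
```

In this made-up data the output of A.1 tracks the input of D.3 and the output of B.2 tracks the input of C.1, so those ports are paired. A device that reports no traffic, or incorrect traffic, gives such a comparison nothing reliable to work with.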
A method described in U.S. Pat. No. 5,450,408 issued Sep. 12, 1995, invented by Phall et al and assigned to Hewlett Packard Company relies on monitoring the source and destination of packets on lines in the network. From the sets of to and fro addresses the topology is eventually deduced. This requires hardware packet detectors to be added to many of the lines in the network and has nothing in common with the present invention.
An embodiment of the invention uses any source address to port mapping information in a device, for example bridge table, ARP table, link training and source address capture data, to determine classes of network topologies never previously determinable. In particular, the classes of topologies where one or more non reporting devices exist between sets of reporting devices are correctly determined. It includes a novel concept of up and down ports. Up ports interconnect devices which report tables; down ports do not. This concept dramatically reduces the computational complexity and greatly increases the accuracy of connection determination. The methods for distinguishing up ports from down ports are novel.
An embodiment of the invention also includes the novel determination that if an up port sees a source address also seen in a down port table then the up port sees the source address of the data-relay device with the down port. This removes entirely the dependence on data-relay devices seeing the source addresses of other data-relay devices directly, which undesirable dependence is essential to the Cabletron methods.
An embodiment of the invention involves removal of invalid and moved source addresses from the table data. This removal is novel, as are all the methods for doing so. This makes the method approximately 100 fold less prone to error. It is more and more necessary as the use of portable computers becomes more widespread.
An embodiment of the invention provides for the explicit tradeoff of the accuracy of connections against the rapidity with which changes in the network are tracked, which is novel.
An embodiment of the invention determines whole families of topologies previously only handled by traffic patterns in the aforenoted Loran patent application, e.g. multiple connections between switches and segmented hubs. When the traffic data is unavailable or is misreported by devices, this embodiment fills the need.
The invention can operate entirely automatically and requires no operator intervention or manual assistance. This is quite unlike the Cabletron method which requires a human expert to help it by restricting its search.
In accordance with an embodiment of the invention, a source address table is compiled for each port of each data-relay device. Such ports are then classified as up or down. Up ports connect directly or indirectly to other data-relay devices which report source address tables, while down ports do not. Up ports can be recognised because their source address tables intersect the tables on two or more ports on a single other data-relay device. Moreover, source addresses in the tables of down ports are not duplicated in the table of any other down port but are duplicated in the tables of up ports that directly or indirectly connect to the data-relay device that contains that down port. After classification as down or up, each source address in each up port table is replaced by the source address of the data-relay device containing the down port whose table contains that source address. The up port tables now contain only data-relay addresses and the addresses of non table reporting devices indirectly connected to up ports. The tables of pairs of up ports are compared by intersection and the minimal intersection defines the most probable connection for each up port. The source addresses of devices in the table of a down port are defined as being directly or indirectly connected to that down port. The method can be applied repeatedly and the probabilities aggregated to provide arbitrary accuracy. A variety of methods are used to remove invalid source addresses and the addresses of devices that have moved during the collection of the source address tables.
A discovery program can be used to determine the list of devices in the network. A poller program extracts from data-relay devices any source address to port mapping information, such as bridge table, ARP table, link training data, source address capture and other table data (for example, Cisco Discovery Protocol or Cabletron Securefast table data), and produces for each port the set of source addresses perceived by that port over a given period of time.
The set of source addresses for any port over a given period of time can be created by one of two methods: by completely emptying it before filling it for that entire period of time, or by constructing it from a series of subsets, which represent portions of that period of time.
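As a minimal sketch of the second of these two methods, the set for each port can be formed as the union of the subsets gathered by successive polls. The function name, the (device, port) keys and the addresses below are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict


def merge_polls(polls):
    """Union per-port address subsets collected over successive polls.

    Each poll maps a (device, port) pair to the set of source addresses
    seen on that port during one portion of the period of time.
    """
    merged = defaultdict(set)
    for poll in polls:
        for port, addresses in poll.items():
            merged[port].update(addresses)
    return dict(merged)


# Two hypothetical polls covering consecutive portions of the period.
polls = [
    {("sw1", 1): {"00:00:00:00:00:0a"}, ("sw1", 2): {"00:00:00:00:00:0b"}},
    {("sw1", 1): {"00:00:00:00:00:0a", "00:00:00:00:00:0c"}},
]

tables = merge_polls(polls)
print(tables)
```

Port 1 of the hypothetical device sw1 ends up with both addresses seen across the two polls, while port 2 keeps the single address seen in the first poll.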
It is first determined which ports see frames transmitted through other devices with tables. These ports are termed up ports. The other ports with tables are termed down ports. The up ports see addresses seen by two or more ports on a single other device. The problem of determining the topology is divided into two: determining connections to down ports and determining connections between up ports. All objects seen off a down port are directly or indirectly connected to that down port. Any up port seeing an object connected to a down port on device B must be seeing frames passed through B, so it must see B. The sets of objects like B seen in this manner by up ports are now compared. The pairs of up ports for which the intersections of these sets are minimal are defined as connected. The existence of non-null intersections or multiple null intersections for a single port indicates the existence of a non table reporting object connected to that port, and that the connections of that port to other up ports lie through that object.
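The rule that an up port sees addresses seen by two or more ports on a single other device can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: the table layout, the function name and the two-switch example network are hypothetical.

```python
def classify_ports(tables):
    """Split ports into up and down ports.

    `tables` maps (device, port) to the set of source addresses seen on
    that port. A port is classed as up when its table intersects the
    tables of two or more ports on a single other device.
    """
    up, down = set(), set()
    for (dev, port), addrs in tables.items():
        hits = {}  # other device -> number of its ports sharing an address
        for (other_dev, _other_port), other_addrs in tables.items():
            if other_dev != dev and addrs & other_addrs:
                hits[other_dev] = hits.get(other_dev, 0) + 1
        if any(count >= 2 for count in hits.values()):
            up.add((dev, port))
        else:
            down.add((dev, port))
    return up, down


# Hypothetical network: hosts a and b on sw1, hosts c and d on sw2,
# with sw1 port 3 linked to sw2 port 1.
tables = {
    ("sw1", 1): {"a"},
    ("sw1", 2): {"b"},
    ("sw1", 3): {"c", "d"},
    ("sw2", 1): {"a", "b"},
    ("sw2", 2): {"c"},
    ("sw2", 3): {"d"},
}

up, down = classify_ports(tables)
print(sorted(up))
print(sorted(down))
```

Here port 3 of sw1 sees addresses c and d, which appear on two different ports of sw2, so it is classed as an up port; port 1 of sw2 is classed as up for the same reason, and the four host-facing ports are classed as down.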
The division of the problem into up and down and then exploiting the results from down connections to solve the problems for up connections is novel.
In accordance with an embodiment of the invention, a method of determining the topology of a data network comprised of network devices including data relay devices, comprises:
(a) obtaining source address to port mapping data from the data relay devices,
(b) producing for each port of each data relay device a set of source addresses perceived by each said port over a period of time,
(c) defining as up ports, those of said ports which have carried data transmitted through devices with said mapping data, which devices have other ports than a port under consideration, and defining remaining ports other than the up ports as down ports,
(d) defining connections to down ports from devices seen from a down port, and
(e) defining connections between up ports and between up and down ports from the source addresses.
In accordance with another embodiment, a method of determining topology of a data network comprised of data relay devices and node devices, each data relay device having one or more ports, comprises:
(a) compiling a source table for each port of each data relay device,
(b) classifying as up ports those ports which connect directly or indirectly to other data relay devices which report source address tables,
(c) classifying ports which connect directly or indirectly to other data relay devices which do not report source address tables, as down ports,
(d) replacing each source address in each up port table by a source address of data relay devices containing the down port whose table contains that source address, whereby the up port tables thereby contain only data relay addresses and addresses of non table reporting devices indirectly connected to up ports,
(e) comparing port tables of pairs of ports by intersection, and
(f) defining a most probable connection for each up port by locating a minimal intersection.
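Steps (d) to (f) above may be illustrated by the following sketch, which assumes that the ports have already been classified as up or down. The function names, the use of a device name in place of a data relay device's source address, and the three-switch example are assumptions made for illustration only. Ties between candidate ports, which the description treats as evidence of a non table reporting device, are simply broken by sort order here.

```python
def substitute(tables, up, down):
    """Step (d): replace each address in an up port table by the device
    that owns the down port whose table contains that address.

    Addresses found in no down port table are kept unchanged.
    """
    owner = {}
    for port in down:
        for addr in tables[port]:
            # If an address appears on several down ports, the last one wins.
            owner[addr] = port[0]
    return {p: {owner.get(a, a) for a in tables[p]} for p in up}


def most_probable_connections(relay_tables):
    """Steps (e) and (f): for each up port, pick the up port on another
    device whose substituted table has the smallest intersection with it."""
    result = {}
    for port, seen in relay_tables.items():
        candidates = [p for p in relay_tables if p[0] != port[0]]
        if candidates:
            result[port] = min(
                candidates, key=lambda p: (len(seen & relay_tables[p]), p)
            )
    return result


# Hypothetical chain sw1 - sw2 - sw3 with one host on each switch.
tables = {
    ("sw1", 1): {"a"}, ("sw1", 2): {"b", "c"},
    ("sw2", 1): {"a"}, ("sw2", 2): {"c"}, ("sw2", 3): {"b"},
    ("sw3", 1): {"a", "b"}, ("sw3", 2): {"c"},
}
down = {("sw1", 1), ("sw2", 3), ("sw3", 2)}
up = {("sw1", 2), ("sw2", 1), ("sw2", 2), ("sw3", 1)}

relay_tables = substitute(tables, up, down)
connections = most_probable_connections(relay_tables)
print(connections)
```

In this example the substituted tables of two directly connected up ports are disjoint, because each port sees only the devices on the far side of the link. The minimal intersection therefore pairs port 2 of sw1 with port 1 of sw2, and port 2 of sw2 with port 1 of sw3.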