The present invention relates to computer networks and, more particularly, to a system for managing such networks. A major objective of the invention is to provide for automated port disablement to eliminate loops and other problems that impair network performance.
Networks provide for communications among computer-related node devices, for example, computers, computer peripherals (e.g., printers), and various types of instrumentation (e.g., medical instrumentation). While there are peer-to-peer networking systems, most sophisticated networks require one or more network hubs that are external to the computers and help manage inter-device communications.
A typical hub has three or more ports, each of which can be coupled to a node device or another hub by a suitable cable. There are many types of hubs, including repeaters, switches, and routers. Routers and switches direct received data only out the port or ports required for the data to reach its destination. Repeaters broadcast whatever is received at one port out the remaining ports. Unlike switches and routers, repeaters do not analyze through-going data to determine where to send it. Accordingly, they tend to be simpler and faster. As a result, they are the most numerous type of hub. Although all types of hubs are addressed herein, repeaters are the primary focus.
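The difference in forwarding behavior described above can be sketched as follows. This is an illustration only, not a description of any particular hub: the function names, the integer port numbering, and the address table are hypothetical, and the sketch assumes that a switch with no table entry for a destination falls back to sending out every other port.

```python
from typing import Dict, List


def repeater_out_ports(in_port: int, num_ports: int) -> List[int]:
    """A repeater sends whatever it receives out every port except the receiving one."""
    return [p for p in range(num_ports) if p != in_port]


def switch_out_ports(in_port: int, dest: str,
                     table: Dict[str, int], num_ports: int) -> List[int]:
    """A switch consults a destination-to-port table (hypothetical structure)."""
    known_port = table.get(dest)
    if known_port is None:
        # Assumption: with no table entry, send out all other ports.
        return repeater_out_ports(in_port, num_ports)
    if known_port == in_port:
        # The destination is reached through the receiving port; nothing to send.
        return []
    return [known_port]
```

The repeater function needs no knowledge of the data, which is consistent with the observation that repeaters tend to be simpler than switches and routers.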
Just as individuals rely increasingly on computers for getting their work done, corporations rely increasingly on networks for getting their personnel to work cooperatively. When a network fails, group efforts grind to a halt, as do individual efforts relying on network resources. Accordingly, keeping a network working and performing at optimal levels is highly desirable, if not critical.
To aid in the detection and diagnosis of network problems, network hubs typically include a number of counters that detect and count events bearing on network performance. Some counters are dedicated to respective ports, while others count events regardless of the receiving port. For example, one type of port-dedicated counter detects and counts received packets that are longer than permitted by the network protocol. By sampling such a counter at two different times, a measure of the frequency of excessively long packets is derived. If this frequency is itself excessive, it indicates that a connected node has a problem. For example, either a network interface card (NIC) in the node or a software driver for the card may be defective.
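The two-sample frequency measurement described above can be sketched as follows. The names `CounterSample`, `event_rate`, and `is_excessive` are hypothetical, and the 32-bit counter width and the threshold are assumptions; the text does not specify either.

```python
from dataclasses import dataclass

COUNTER_BITS = 32  # assumption: the counter width is not stated in the text


@dataclass
class CounterSample:
    """One reading of a hub counter (hypothetical structure)."""
    timestamp_s: float  # time at which the counter was read, in seconds
    value: int          # raw counter value at that time


def event_rate(earlier: CounterSample, later: CounterSample,
               counter_bits: int = COUNTER_BITS) -> float:
    """Return events per second between two samples of the same counter."""
    elapsed = later.timestamp_s - earlier.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing times")
    # Modular subtraction tolerates a single wrap of a free-running counter.
    delta = (later.value - earlier.value) % (1 << counter_bits)
    return delta / elapsed


def is_excessive(rate: float, threshold_per_s: float) -> bool:
    """Compare a measured rate with a caller-chosen threshold (an assumption)."""
    return rate > threshold_per_s
```

For instance, samples of 100 and 150 taken ten seconds apart give a rate of 5 events per second; whether that rate is excessive depends on the threshold chosen for the counter in question.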
For a second example, a collision counter can be used to indicate a condition in which network bandwidth is being exceeded. Sophisticated network protocols can handle many concurrent transmissions and the assignment of priority according to data type. In general, data is transmitted one packet at a time. If two packets happen to be transmitted on the same cable concurrently, they are said to "collide". Sophisticated networks detect such collisions and require the transmitters to re-send the colliding packets at different times. An excessive frequency of collisions indicates that the network capacity is being exceeded. Other examples include the use of late-event counters to indicate the presence of excessively long cables or an excessive number of repeaters connected in series, and the use of counters that detect various types of packet defects to indicate a defective cable.
A network problem of particular interest herein is the network loop. Since a hub can include many ports, it can and does happen that a first port will be cabled directly to a second port of the same hub. In the case of a repeater loop, data received at the first port is transmitted out the second port; the same data is received again at the first port and is transmitted again out the second port and so on. This condition is known as a "local loop". Unless otherwise addressed, a local loop causes every packet to collide, and re-sending does not avoid the problem. This condition can result in complete failure of the network (or network segment). Local loops are readily detectable since every packet results in a collision.
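The recirculation described above can be illustrated with a deliberately simplified model. The function `simulate_repeater` and its parameters are hypothetical; the model ignores timing, so it does not represent collisions as such, and it only shows that a packet entering a repeater with two of its own ports cabled together keeps re-entering the repeater until an artificial event limit stops it.

```python
from collections import deque
from typing import Dict, List


def simulate_repeater(num_ports: int, cabling: Dict[int, int],
                      ingress_port: int, max_events: int = 10) -> List[int]:
    """Return the ports at which the packet is received, in order.

    `cabling` maps a port to the port at the far end of its cable when both
    ends are on this same repeater (a local loop). Ports absent from the map
    lead away from the repeater and never return the packet.
    """
    received: List[int] = []
    pending = deque([ingress_port])
    while pending and len(received) < max_events:
        in_port = pending.popleft()
        received.append(in_port)
        # A repeater sends the packet out every port except the receiving one.
        for out_port in range(num_ports):
            if out_port == in_port:
                continue
            far_end = cabling.get(out_port)
            if far_end is not None:
                pending.append(far_end)  # the packet comes straight back
    return received
```

With no entries in `cabling`, the packet is received once and the simulation ends. With ports 1 and 2 cabled together (`{1: 2, 2: 1}`), the list of receptions grows until `max_events` is reached, which mirrors the "and so on" of the description.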
Modern repeaters include hardware that automatically partitions ("autopartitions") ports to break local loops so that a network can sometimes continue operation. A packet received at an autopartitioned port is not transmitted out the remaining ports; however, the port is monitored for collisions. If this monitoring suggests the absence of problems, the autopartition can be automatically terminated (as specified in IEEE Std. 802.3). However, this automatic termination of partitions sometimes occurs when a loop is still present; thus, reliable network operation is not assured in the presence of loops.
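The partition and reconnection behavior described above can be sketched as a small state machine. The class `PortPartitionMonitor` is hypothetical, and its two thresholds are illustrative parameters supplied by the caller; they are not values quoted from IEEE Std. 802.3.

```python
class PortPartitionMonitor:
    """Tracks whether one repeater port is autopartitioned (hypothetical sketch)."""

    def __init__(self, partition_after: int, reconnect_after: int) -> None:
        # Both thresholds are assumptions chosen by the caller.
        self.partition_after = partition_after  # consecutive collisions before partitioning
        self.reconnect_after = reconnect_after  # clean events before the partition ends
        self.partitioned = False
        self._consecutive_collisions = 0
        self._clean_events = 0

    def observe(self, collided: bool) -> None:
        """Record one packet event seen at the port."""
        if collided:
            self._consecutive_collisions += 1
            self._clean_events = 0
            if self._consecutive_collisions >= self.partition_after:
                self.partitioned = True
        else:
            self._consecutive_collisions = 0
            if self.partitioned:
                # Monitoring continues while the port is partitioned.
                self._clean_events += 1
                if self._clean_events >= self.reconnect_after:
                    self.partitioned = False
                    self._clean_events = 0

    @property
    def forwards_received_packets(self) -> bool:
        """Packets received at a partitioned port are not sent out other ports."""
        return not self.partitioned
```

The sketch also shows the weakness noted above: a few collision-free events are enough to end the partition, whether or not the loop that caused it has actually been removed.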
There are also non-local "indirect" loops that involve a hub and at least one other device. For example, two hubs can be coupled to each other at two different ports. For another example, a cabling error can cause a group of hubs that are supposed to be daisy-chained to be arranged into a loop. There are innumerable other types of loops, including loops that extend through node devices in various ways.
Indirect loops can be hard to detect, in part because of their variety. While they tend to involve large numbers of collisions, there are other causes of collisions. Thus, while many hubs include counters that count collisions, the cause of the collisions is not readily determined. In part due to packet transit delays through other devices, it is more difficult to determine when a collision involves a packet colliding with itself. Thus, autopartitioning may not be activated in the event of an indirect loop. Accordingly, indirect loop detection has been the province of a network administrator using specialized network monitoring tools.
Network monitoring tools tend to be difficult to set up and difficult to use. In some cases, specialized network diagnostic hardware is required. In other cases, a general-purpose workstation is configured with special software as a network administration station. The network software tends to be arcane, requiring specially skilled network administrators to operate it.
Often, the required setup for the network monitoring tools is never performed or is not performed correctly. As a result, consultants must be brought in to fix an impaired network. Thus, correcting indirect loops can be expensive, both in the labor involved and in the loss of productivity for the users of the network. What is needed is an improved system for detecting and correcting loops and other network problems that impair operation and performance.