The present invention relates generally to computer networks, and more specifically, to a method and apparatus for improving and facilitating the identification and selection of loop-free topologies in computer networks.
A computer network typically comprises a plurality of interconnected entities. An entity may consist of any device, such as a computer or end station, that "sources" (i.e., transmits) or "sinks" (i.e., receives) data frames. A common type of computer network is a local area network ("LAN") which typically refers to a privately owned network within a single building or campus. LANs typically employ a data communication protocol (LAN standard), such as Ethernet, FDDI or token ring, that defines the functions performed by the data link and physical layers of a communications architecture (i.e., a protocol stack). In many instances, several LANs may be interconnected by point-to-point links, microwave transceivers, satellite hook-ups, etc. to form a wide area network ("WAN") or intranet that may span an entire country or continent.
One or more intermediate network devices are often used to couple LANs together and allow the corresponding entities to exchange information. For example, a bridge may be used to provide a "bridging" function between two or more LANs. Alternatively, a switch may be utilized to provide a "switching" function for transferring information between a plurality of LANs or end stations. Typically, the bridge or switch is a computer and includes a plurality of ports that couple the device to the LANs or end stations. Ports used to couple switches to each other are generally referred to as trunk ports, whereas ports used to couple a switch to LANs or end stations are generally referred to as access ports. The switching function includes receiving data from a sending entity at a source port and transferring that data to at least one destination port for forwarding to the receiving entity.
Switches and bridges typically learn which destination port to use in order to reach a particular entity by noting on which source port the last message originating from that entity was received. This information is then stored by the bridge in a block of memory referred to as a filtering database. Thereafter, when a message addressed to a given entity is received on a source port, the bridge looks up the entity in its filtering database and identifies the appropriate destination port to reach that entity. If no destination port is identified in the filtering database, the bridge floods the message out all ports, except the port on which the message was received. Messages addressed to broadcast or multicast addresses are also flooded.
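The learning and flooding behavior described above can be illustrated with a minimal sketch. The class name, method names and the use of simple strings as entity addresses are hypothetical and chosen only for illustration; the rule that a frame is not forwarded when its destination lies on the receiving port is an added assumption reflecting common bridge practice.

```python
class LearningBridge:
    """Toy model of a bridge's filtering database (hypothetical names)."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.filtering_db = {}  # entity address -> port on which it was last seen

    def receive(self, src, dst, in_port, broadcast=False):
        """Learn the source's location, then return the set of output ports."""
        # Note the source port of the last message originating from this entity.
        self.filtering_db[src] = in_port
        out_port = None if broadcast else self.filtering_db.get(dst)
        if out_port is None:
            # Unknown destination, broadcast or multicast: flood out all
            # ports except the port on which the message was received.
            return self.ports - {in_port}
        if out_port == in_port:
            # Assumption: the destination is on the receiving segment,
            # so there is nothing to forward.
            return set()
        return {out_port}
```

For example, a frame from entity A to an unknown entity B is flooded, and once B has been heard from, later frames to B use a single destination port.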
Additionally, most computer networks include redundant communications paths so that a failure of any given link or device does not isolate any portion of the network. The existence of redundant links, however, may cause the formation of circuitous paths or "loops" within the network. Loops are highly undesirable because data frames may traverse the loops indefinitely. Furthermore, because switches and bridges replicate (i.e., flood) frames whose destination port is unknown or which are directed to broadcast or multicast addresses, the existence of loops may cause a proliferation of data frames that effectively overwhelms the network.
Spanning Tree Algorithm
To avoid the formation of loops, most intermediate network devices execute a spanning tree algorithm which allows them to calculate an active network topology that is loop-free (i.e., a tree) and yet connects every pair of LANs within the network (i.e., the tree is spanning). The Institute of Electrical and Electronics Engineers (IEEE) has promulgated a standard (the 802.1D standard) that defines a spanning tree protocol to be executed by 802.1D compatible devices. In general, by executing the spanning tree algorithm, bridges elect a single bridge to be the "root" bridge. Since each bridge has a unique numerical identifier (bridge ID), the root is typically the bridge with the lowest bridge ID. In addition, for each LAN coupled to more than one bridge, only one (the "designated bridge") is elected to forward frames to and from the respective LAN. The designated bridge is typically the one closest to the root. Each bridge also selects one port (its "root port") which gives the lowest cost path to the root. The root ports and designated bridge ports are selected for inclusion in the active topology and are placed in a forwarding state so that data frames may be forwarded to and from these ports and thus onto the corresponding paths or links of the network. Ports not included within the active topology are placed in a blocking state. When a port is in the blocking state, data frames will not be forwarded to or received from the port. A network administrator may also exclude a port from the spanning tree by placing it in a disabled state.
To obtain the information necessary to run the spanning tree protocol, bridges exchange special messages called configuration bridge protocol data unit (BPDU) messages. FIG. 1 is a block diagram of a conventional BPDU message 100. The BPDU message 100 includes a message header 102 compatible with the Media Access Control (MAC) layer of the respective LAN standard. The message header 102 comprises a destination address (DA) field 104, a source address (SA) field 106, and a Service Access Point (SAP) field 108, among others. The DA field 104 carries a unique bridge multicast destination address assigned to the spanning tree protocol. Appended to header 102 is a BPDU message area 110 that also contains a number of fields, including a root identifier (ROOT ID) field 112, a root path cost field 114, a bridge identifier (BRIDGE ID) field 116, a port identifier (PORT ID) field 118, a message age (MSG AGE) field 120, a maximum age (MAX AGE) field 122, a hello time field 124, and a forward delay (FWD DELAY) field 126, among others. The root identifier field 112 typically contains the identifier of the bridge assumed to be the root and the bridge identifier field 116 contains the identifier of the bridge sending the BPDU. The root path cost field 114 contains a value representing the cost to reach the assumed root from the port on which the BPDU is sent and the port identifier field 118 contains the port number of the port on which the BPDU is sent.
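As an illustration only, the fields of BPDU message area 110 can be modeled as a simple record. The class and attribute names are hypothetical, the MAC header fields are omitted, and the default values are the ones cited elsewhere in this background (a two second hello time, a twenty second maximum age and a fifteen second forwarding delay); this is a sketch, not a wire-format definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfigBPDU:
    """Sketch of the fields in BPDU message area 110 (hypothetical names)."""

    root_id: int          # field 112: bridge assumed to be the root
    root_path_cost: int   # field 114: cost to reach the assumed root
    bridge_id: int        # field 116: bridge sending the BPDU
    port_id: int          # field 118: port on which the BPDU is sent
    message_age: int = 0  # field 120: seconds since the root originated the info
    max_age: int = 20     # field 122: seconds before stored info is stale
    hello_time: int = 2   # field 124: seconds between root transmissions
    forward_delay: int = 15  # field 126: seconds in each intermediate state


# A BPDU as the root itself would send it: it names itself as root at cost 0.
root_bpdu = ConfigBPDU(root_id=1, root_path_cost=0, bridge_id=1, port_id=3)
```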
Each bridge initially assumes itself to be the root and transmits BPDU messages accordingly. As a result, bridges continuously receive BPDU messages. Upon receipt of a BPDU message, its contents are examined and compared with similar information (e.g., assumed root and lowest root path cost) stored by the receiving bridge. If the information from the received BPDU is "better" than the stored information, the bridge adopts the better information and uses it in the BPDUs that it sends (adding the cost associated with the receiving port to the root path cost) from its ports, other than the port on which the "better" information was received. Although BPDU messages are not forwarded by bridges, the identifier of the root is eventually propagated to and adopted by all bridges as described above, allowing them to select their root port and any designated port(s).
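The comparison and adoption step can be sketched as follows. The names are hypothetical. The sketch assumes that "better" means a lower root identifier, then a lower root path cost, with the sending bridge identifier and port identifier used as tie-breakers; the text above names only the first two criteria, so the tie-breakers are an assumption.

```python
from typing import NamedTuple


class Bpdu(NamedTuple):
    root_id: int
    root_path_cost: int
    bridge_id: int
    port_id: int


def is_better(received: Bpdu, stored: Bpdu) -> bool:
    """Lower values win; tuples compare field by field, left to right."""
    return received < stored


class Bridge:
    def __init__(self, bridge_id: int):
        self.bridge_id = bridge_id
        # Each bridge initially assumes itself to be the root (cost 0).
        self.best = Bpdu(bridge_id, 0, bridge_id, 0)
        self.root_port = None

    def receive(self, bpdu: Bpdu, port: int, port_cost: int) -> bool:
        """Adopt the received information if it is better than what is stored."""
        # Add the cost associated with the receiving port to the root path cost.
        candidate = bpdu._replace(root_path_cost=bpdu.root_path_cost + port_cost)
        if is_better(candidate, self.best):
            self.best = candidate
            self.root_port = port
            return True
        return False

    def advertise(self, port_id: int) -> Bpdu:
        """Build the BPDU this bridge would send from one of its other ports."""
        return Bpdu(self.best.root_id, self.best.root_path_cost,
                    self.bridge_id, port_id)
```

In this sketch a bridge with identifier 5 ignores a claim that bridge 9 is the root, but adopts a claim that bridge 2 is the root and then advertises bridge 2 in its own BPDUs.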
In order to adapt the active topology to failures, the root periodically (e.g., every hello time) transmits BPDU messages. The hello time utilized by the root is also carried in the hello time field 124 of its BPDU messages. The default hello time is two seconds. In response to receiving BPDUs, bridges transmit their own BPDUs. Thus, every two seconds BPDUs are propagated throughout the bridged network, thereby confirming the active topology. As shown in FIG. 1, BPDU messages stored by the bridges also include a message age field 120 which corresponds to the time since the root instigated the generation of this BPDU information. That is, BPDU messages from the root have their message age field 120 set to "0". Thus, every hello time, BPDU messages with a message age of "0" are propagated to and stored by the bridges.
After storing these BPDU messages, bridges proceed to increment the message age value every second. When the next BPDU message is received, the bridge examines the contents of the message age field 120 to determine whether it is smaller than the message age of its stored BPDU message. Assuming the received BPDU message originated from the root and thus has a message age of "0", the received BPDU message is considered to be "better" than the stored BPDU information (whose message age has presumably been incremented to "2" seconds) and, in response, the bridge proceeds to re-calculate the root, root path cost and root port based upon the received BPDU information. The bridge also stores this received BPDU message and proceeds to increment its message age field 120. If the message age of a stored BPDU message reaches a maximum age value, the corresponding BPDU information is considered to be stale and is discarded by the bridge.
Normally, each bridge replaces its stored BPDU information every hello time, thereby preventing it from being discarded and maintaining the current active topology. If a bridge stops receiving BPDU messages on a given port (indicating a possible link or device failure), it will continue to increment the respective message age value until it reaches the maximum age threshold. The bridge will then discard the stored BPDU information and proceed to recalculate the root, root path cost and root port by transmitting BPDU messages utilizing the next best information it has. The maximum age value used within the bridged network is typically set by the root, which enters the appropriate value in the maximum age field 122 of its transmitted BPDU messages. Neighboring bridges similarly load this value in their BPDU messages, thereby propagating the selected value throughout the network. The maximum age value under the IEEE standard is twenty seconds.
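The aging behavior described above can be sketched for a single port. The class and method names are hypothetical, time is advanced manually rather than by a real timer, and only the message age is modeled (the root identifier and path cost are left out for brevity).

```python
class PortInfo:
    """BPDU information stored for one port (hypothetical structure)."""

    def __init__(self, max_age=20):
        self.max_age = max_age    # seconds; 20 is the value cited in the text
        self.message_age = None   # None means no BPDU information is stored

    def receive(self, message_age):
        """Store the received information if its message age is smaller."""
        if self.message_age is None or message_age < self.message_age:
            self.message_age = message_age
            return True   # the bridge would now re-calculate root, cost, root port
        return False

    def tick(self, seconds=1):
        """Age the stored information; return True if it went stale."""
        if self.message_age is None:
            return False
        self.message_age += seconds
        if self.message_age >= self.max_age:
            self.message_age = None   # stale information is discarded
            return True
        return False
```

With a two second hello time, the stored age normally oscillates between 0 and 2. Only when BPDUs stop arriving does `tick` eventually report that the information was discarded.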
As BPDU information is up-dated and/or timed-out and the active topology is re-calculated, ports may transition from the blocking state to the forwarding state and vice versa. That is, as a result of new BPDU information, a previously blocked port may learn that it should be in the forwarding state (e.g., it is now the root port or a designated port). Rather than transition directly from the blocking state to the forwarding state, ports transition through two intermediate states: a listening state and a learning state. In the listening state, a port waits for information indicating that it should return to the blocking state. If, by the end of a preset time, no such information is received, the port transitions to the learning state. In the learning state, a port still blocks the receiving and forwarding of frames, but received frames are examined and the corresponding location information is stored in the filtering database, as described above. At the end of a second preset time, the port transitions from the learning state to the forwarding state, thereby allowing frames to be forwarded to and from the port. The time spent in each of the listening and the learning states is referred to as the forwarding delay and is entered by the root in field 126.
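The port state transitions can be sketched as a small state machine. The names are hypothetical, and the timer is advanced by explicit calls instead of a real clock; the fifteen second default for the forwarding delay is taken from the example figures given in this background.

```python
from enum import Enum


class PortState(Enum):
    DISABLED = "disabled"
    BLOCKING = "blocking"
    LISTENING = "listening"
    LEARNING = "learning"
    FORWARDING = "forwarding"


class Port:
    def __init__(self, forward_delay=15):
        self.forward_delay = forward_delay  # seconds spent in each intermediate state
        self.state = PortState.BLOCKING
        self.timer = 0

    def select(self):
        """New BPDU information says this port should forward (root or designated)."""
        if self.state is PortState.BLOCKING:
            self.state = PortState.LISTENING
            self.timer = 0

    def deselect(self):
        """Information arrived indicating the port should return to blocking."""
        if self.state is not PortState.DISABLED:
            self.state = PortState.BLOCKING
            self.timer = 0

    def tick(self, seconds=1):
        """Advance time; move one step forward when the preset time expires."""
        if self.state not in (PortState.LISTENING, PortState.LEARNING):
            return
        self.timer += seconds
        if self.timer >= self.forward_delay:
            self.state = (PortState.LEARNING
                          if self.state is PortState.LISTENING
                          else PortState.FORWARDING)
            self.timer = 0
```

A port selected for the active topology therefore reaches the forwarding state only after two full forwarding delays have elapsed.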
Although the spanning tree protocol is able to maintain a loop-free topology despite network changes and failures, re-calculation of the active topology can be a time consuming and processor intensive task. For example, re-calculation of the spanning tree following a network change or failure can take approximately 50 seconds (e.g., 20 seconds for BPDU information to time out, 15 seconds in the listening state and another 15 seconds in the learning state). During this time, message delivery is often delayed as ports transition between states. That is, ports in the listening and learning states do not forward or receive messages. In addition, certain applications or processes may time-out and shut down while the active topology is re-calculated, resulting in even greater disruptions.
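The 50 second figure is simply the sum of the three timers mentioned above. A small helper (the function name is hypothetical) makes the dependence on the timer values explicit:

```python
def reconvergence_time(max_age=20, forward_delay=15):
    """Approximate seconds to re-calculate the spanning tree after a failure.

    The stored BPDU information must first time out (max_age), and the
    port then spends one forward_delay in listening and one in learning.
    """
    return max_age + 2 * forward_delay
```

With the values cited in the text, `reconvergence_time()` returns 50.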
The conventional spanning tree protocol also consumes significant processor resources, which may degrade network performance. More specifically, bridges recalculate the root and their root port and root path cost every time a "newer" BPDU message is received (e.g., every 2 seconds). As the active topology nears convergence and once it has converged, the root identifier and root path cost of these newer BPDU messages are identical to the stored BPDU information. That is, the processing of the received BPDU information will cause no change in the bridge's port states. Nevertheless, the bridge still proceeds to re-calculate the active topology, wasting valuable processor resources.
The prior art spanning tree protocol is also unable to identify and eliminate all possible loops. In particular, some network configurations result in messages being looped back to the port on which they were forwarded. For example, certain network cables or links loop back messages. Additionally, a port may be configured by a network administrator to return copies of messages forwarded to the port. If such a configuration exists and the port is forwarding, then an undetected loop may arise. For example, a broadcast message forwarded from the port will loop back (i.e., be returned) to the bridge. The bridge, moreover, will assume that this is a new broadcast message and proceed to forward it on all of its other forwarding ports. The resulting proliferation of messages can overwhelm the network.
The existence of such a loop may not be detected by the conventional spanning tree protocol. More specifically, BPDU messages that are forwarded on such loop-back configured ports will similarly be returned to the transmitting port and the information in these "received" BPDU messages will be compared against the information currently stored for that port. As the information from these received BPDU messages cannot be "better" than the stored information (i.e., it is the same), the BPDU message is simply ignored and the bridge transitions the port to the forwarding state. Accordingly, the loop is not discovered and subsequent message proliferation may occur.
The spanning tree protocol's ability to define an active topology also degrades significantly in the presence of network congestion. Congestion refers to the inability of intermediate network devices to keep up with an increase in network traffic. More specifically, each network device typically has one or more priority queues associated with each port or interface. As messages are received, they are placed in the appropriate queue for forwarding. If messages are added to a given queue faster than they can be forwarded, however, the queue will eventually be filled forcing the device to drop any additional messages (including BPDU messages) for that queue. This may cause a downstream switch to stop receiving BPDUs on a blocked port, even though no failure or network change has occurred. In response, the BPDU information stored at the downstream switch may time out and be discarded. The downstream bridge may then transition its port from blocking to forwarding. The transition of this port to forwarding creates a loop (because the upstream port, although congested, is still in the forwarding state) and only adds to the congestion problem.
It is an object of the present invention to provide a method and apparatus for enhancing the operation of the spanning tree protocol in computer networks.
It is a further object of the present invention to provide a method and apparatus for reducing the time necessary to transition certain ports to a forwarding state.
Another object of the present invention is to provide a method and apparatus for detecting and blocking loops caused by loop-back connections or configurations.
Briefly, the invention relates to a method and apparatus for enhancing the operation of the spanning tree protocol. An intermediate network device, such as a switch or bridge, includes an enhanced spanning tree entity that is configured to execute a spanning tree protocol. The enhanced spanning tree entity, which includes an extractor module and a state machine engine, performs a plurality of novel functions that improve the execution and performance of the spanning tree protocol. First, the enhanced entity identifies loop-back ports. More specifically, the enhanced spanning tree entity examines the configuration bridge protocol data unit (BPDU) messages that are received and determines, among other things, whether these received BPDUs are identical to the BPDUs forwarded on those ports. If so, the enhanced entity detects the presence of a loop-back cable or configuration and transitions the respective port to the blocking state to prevent message proliferation.
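The loop-back check can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the class and method names are hypothetical, a BPDU is reduced to a tuple of (root ID, root path cost, bridge ID, port ID), and only the identity comparison named above is modeled (the text notes that the entity examines other things as well).

```python
class EnhancedPort:
    """One port of a hypothetical enhanced spanning tree entity."""

    def __init__(self, port_id):
        self.port_id = port_id
        self.state = "forwarding"
        self.last_sent = None   # the BPDU most recently sent on this port

    def send(self, bpdu):
        """Remember what was transmitted so that an echo can be recognized."""
        self.last_sent = bpdu
        return bpdu

    def receive(self, bpdu):
        """Return True and block the port if the BPDU is our own, looped back."""
        if self.last_sent is not None and bpdu == self.last_sent:
            # Identical to what this port sent: a loop-back cable or
            # configuration is present, so block to prevent proliferation.
            self.state = "blocking"
            return True
        return False
```

A BPDU from a different bridge or port differs in at least the bridge or port identifier, so it does not trigger the check in this sketch.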
In another aspect of the invention, the enhanced spanning tree engine includes a method for transitioning certain ports directly to a forwarding state to prevent associated applications from timing out. More specifically, one or more ports of the device can be configured as access ports. Normally, an access port is only coupled to a specific entity (e.g., a server or end station) or a LAN and does not provide connectivity to other portions or segments of the computer network. Thus, BPDU messages are not received on access ports (unless there is a loop-back condition). In accordance with the invention, one or more access ports may also be configured as "rapid forwarding". Upon initialization, the enhanced spanning tree entity preferably examines the configuration of each port. If a port is configured as an access port with rapid forwarding, then the entity preferably causes that port to transition directly to the forwarding state. That is, the enhanced spanning tree entity by-passes the conventional blocking, listening and learning states and instead, places the port immediately in the forwarding state. Messages can thus be forwarded to and from the port right away. Since this function is only to be enabled on access ports (which would eventually become designated ports), loops are unlikely to result.
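The initialization decision can be sketched as a single function. The function and parameter names are hypothetical, and port states are represented as plain strings for brevity.

```python
def initial_port_state(is_access_port, rapid_forwarding):
    """Choose the state a port enters when the spanning tree entity initializes.

    Only an access port that is also configured for rapid forwarding
    by-passes the blocking, listening and learning states; every other
    port starts in the conventional blocking state.
    """
    if is_access_port and rapid_forwarding:
        return "forwarding"
    return "blocking"
```

In this sketch a trunk port starts in the blocking state even if rapid forwarding were set on it by mistake, reflecting the statement that the function is only to be enabled on access ports.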