1. Field of the Invention
The present invention relates to network protocols and network intermediate devices executing such protocols; and more particularly to algorithms for managing the tree of network devices for a data network according to a spanning tree protocol.
2. Description of Related Art
Local area networks (LANs) of all types specified according to the Institute of Electrical and Electronics Engineers (IEEE) Standards for Local and Metropolitan Area Networks under section 802.x may be connected together with media access control (MAC) bridges. Bridges interconnect LAN segments so that stations connected to the LANs operate as if they were attached to a single LAN for many purposes. Thus a bridged local area network provides for interconnection of stations attached to LAN segments of different MAC types, for an increase in the physical extent, the number of permissible attachments and the total performance of a LAN, and for the partitioning of physical LAN support for administrative or maintenance reasons. The MAC bridge is specified according to the IEEE standard 802.1D (IEEE Std 802.1D-1990, IEEE Standards for Local and Metropolitan Area Networks: Media Access Control (MAC) Bridges). The protocol has application for establishing interconnection of devices on network segments (whether the segments are characterized as LANs or as other network constructs) in any type of network.
When a bridged network is established, it is possible to create loops in the network by providing more than one path through bridges and LAN segments between two points. Thus, according to the 802.1D standard, an active topology for the bridged network is maintained according to the spanning tree protocol which is described in the standard. The spanning tree protocol automatically establishes a fully connected (spanning) and loop-free (tree) bridged network topology. It uses a distributed algorithm that selects a root bridge and the shortest path to that root from each LAN. Tie breakers are used to ensure that there is a unique shortest path to the root, while uniqueness of the root is guaranteed by using one of its MAC addresses as part of a priority identifier.
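The root selection described above can be illustrated with a minimal sketch. It assumes the usual 802.1D convention that the numerically lowest identifier wins; the names `BridgeId` and `elect_root` are hypothetical and are not taken from the standard or from this specification.

```python
from typing import NamedTuple


class BridgeId(NamedTuple):
    """Priority identifier of a bridge (illustrative layout)."""
    priority: int  # administratively assigned priority; lower is preferred
    mac: bytes     # one of the bridge's MAC addresses; makes the identifier unique


def elect_root(bridges):
    """Return the bridge identifier that would become the root.

    Tuples compare field by field, so the MAC address only acts as a
    tie breaker when two bridges carry the same priority.
    """
    return min(bridges)


candidates = [
    BridgeId(32768, bytes.fromhex("00a0c9000002")),
    BridgeId(32768, bytes.fromhex("00a0c9000001")),
    BridgeId(40960, bytes.fromhex("00a0c9000000")),
]
root = elect_root(candidates)
```

Because every MAC address is distinct, two bridges can never present equal identifiers, which is why the election always yields exactly one root.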
Every LAN in the network has one and only one "designated port" providing that LAN's shortest path to the root, through the bridge of which the designated port is a part. The bridge is known as the designated bridge for that LAN.
Thus, a bridge other than the root bridge at the root of the network can be termed a branch bridge. Every branch bridge has a "root port" which is the port providing that bridge's shortest path to the root. Ports other than the root port are designated ports or alternate ports according to the standard. An alternate port is connected to a LAN for which another bridge is the designated bridge, and is placed in a blocking state so that frames are not forwarded through that port.
The frame forwarding path through any bridge is thus between its root port and designated ports. When spanning tree information has been completely distributed and is stable, this connectivity will connect all of the LANs in a loop-free tree.
When a bridge first receives spanning tree information that dictates new connectivity through that bridge, it does not establish the new connectivity immediately. Ports that were connected previously as either the root port or a designated port, but are no longer connected, are immediately made blocking. However, the transition to a forwarding state of ports that were previously not connected in a forwarding role is delayed. The delay is needed because:
(a) Frames forwarded on the previous topology may still be buffered by bridges in the network. Thus an instantaneous change to the new topology can cause these to be forwarded back to their LAN of origin, causing duplication of the frame once.
(b) New spanning tree information in the network may not have been fully distributed yet. Thus an immediate change to a new topology may cause temporary loops. These loops could generate high traffic volumes, disrupting end stations, causing frame loss in bridges, and possibly delaying the propagation of spanning tree information further.
The first of these two reasons is far less important than it once was, because the protocols prevalent on LANs today tolerate immediately duplicated frames. Some old implementations of LLC type 2 will reset the connection under these circumstances, but they are no longer in widespread deployment. Thus the problem presented by reason (a) is of less significance than reason (b).
Reason (b) continues to be a fundamental problem to the spanning tree configuration.
According to the spanning tree protocol of the standard, each port on a bridge can assume a blocking state in which frames are not forwarded through the port, or a forwarding state in which frames are forwarded through the port. For a transition from the blocking state to the forwarding state, the protocol requires the port to proceed through transitional states referred to as the listening state and the learning state. In the listening state, the port is preparing to participate in frame relay; however, frame relay is temporarily disabled to prevent temporary loops. In the listening state, the port monitors bridge protocol data unit (BPDU) frames or other information related to the topology in the network for an interval set by the forward delay timer. If no information is received which causes a change in state of the port before expiry of the forward delay timer, then the port transitions to the learning state.
In the learning state, the port continues to prepare for participation in frame relay. The relay is temporarily disabled to prevent loops. In this state, in addition to monitoring BPDU frames and other information related to operation of the spanning tree algorithm, the port learns information about end stations that are accessible through the port for use in the forwarding of frames once the port enters the forwarding state. Upon expiration of the forward delay timer in the learning state, if no better protocol information is received, then the port assumes the forwarding state. Thus, the transition from a blocking state to the forwarding state takes two times as long as the forward delay timer interval. The time from the moment of detection of a change in topology which causes a transition from the blocking to the forwarding state, until the moment that the forwarding state is assumed, can be significant, as much as 20 to 50 seconds in some cases. Thus, when a link or switch fails, reconfiguration takes place at unacceptably slow rates for mission critical networks. Significantly reducing this recovery time remains a problem.
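The timer-driven progression described above can be sketched as a small state machine. This is an illustration only: the class and function names are hypothetical, time advances in whole seconds, and the 15 second forward delay is assumed as the standard default (consistent with the 30 second total for the two transitional states).

```python
from enum import Enum


class PortState(Enum):
    BLOCKING = "blocking"
    LISTENING = "listening"
    LEARNING = "learning"
    FORWARDING = "forwarding"


FORWARD_DELAY_S = 15  # assumed default forward delay, in seconds


class TimedPort:
    """A port that reaches forwarding only after two forward delay intervals."""

    def __init__(self, forward_delay=FORWARD_DELAY_S):
        self.forward_delay = forward_delay
        self.state = PortState.BLOCKING
        self.timer = 0

    def start_transition(self):
        # Leaving blocking: enter listening and arm the forward delay timer.
        self.state = PortState.LISTENING
        self.timer = self.forward_delay

    def tick(self, seconds=1):
        # Only the two transitional states are driven by the timer.
        if self.state not in (PortState.LISTENING, PortState.LEARNING):
            return
        self.timer -= seconds
        if self.timer > 0:
            return
        if self.state is PortState.LISTENING:
            self.state = PortState.LEARNING
            self.timer = self.forward_delay  # re-arm for the learning state
        else:
            self.state = PortState.FORWARDING


def seconds_to_forwarding(forward_delay=FORWARD_DELAY_S):
    """Count one-second ticks until a blocked port is forwarding."""
    port = TimedPort(forward_delay)
    port.start_transition()
    elapsed = 0
    while port.state is not PortState.FORWARDING:
        port.tick()
        elapsed += 1
    return elapsed
```

With the assumed default, `seconds_to_forwarding()` returns 30, that is, two forward delay intervals. The sketch omits the case where new protocol information arrives and changes the port's state before the timer expires.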
Three approaches to managing reconfiguration times include the following:
(1) spanning tree timer values can be manually configured for optimal values;
(2) a scheme known as "backbone fast" detects the changing topology and allows a bridge to determine whether or not connectivity in the network has been lost, by sending a test packet called a root link query PDU to the bridge in the network that is the root in the spanning tree protocol; and
(3) network topologies can be specially designed to provide fail over in some cases without requiring expiration of the relevant timer, as has been described in the above cross referenced application Ser. No. 09/141,803.
Managing timers according to the first approach listed above is error prone, and negates the low administration benefits of the standard spanning tree. Further, the timers must be set to values that are a worst-case to some very high probability, so the first approach listed above provides limited improvement.
The "backbone fast" scheme of the second approach depends on a particular bridge being the root bridge, so the scheme cannot be introduced into a network by upgrading arbitrary pairs of bridges. If the network topology is changing, the root link query message may not reach the actual root, and may cause the wrong initial steps to be taken in the effort to speed reconfiguration. Thus, the "backbone fast" scheme will speed part of the reconfiguration cycle but does not complete the entire reconfiguration unless the topology has been specially designed. Further, the necessary reconfiguration may be best indicated by the absence of a reply from the root, so the protocol must rely on some worst case estimate of the time during which a reply from the root can be expected, for managing the reconfiguration.
The third approach listed above provides a step toward a solution to the problem of reducing the recovery time of networks. The solutions described in the cross referenced U.S. patent application Ser. No. 09/141,803 allow root ports to transition to forwarding states very quickly, but still require designated ports to transition through the listening and learning states. In an arbitrary network topology, recovery of connectivity after a bridge or link failure can require a bridge port that was previously an alternate port in the spanning tree to become a designated port. Traversing the transitional states involves a delay of 30 seconds when standard default timer values are used.
Convergence of a bridged network in situations involving changing of spanning tree topology can therefore cause significant loss-of-service situations, particularly in networks that carry real time data. For example, the use of data networks and the Internet for audio and video transmissions of real time signals is increasing. Twenty to fifty second convergence times for these uses of the data network can cause unacceptable glitches. Accordingly, it is desirable to provide a technique to improve the availability of a bridged network in the face of changes in topology.
The present invention provides new mechanisms for use on designated ports in spanning tree protocol entities which allow such ports to transition to a forwarding state on the basis of actual communication delays between neighboring bridges, rather than upon expiration of worst case timers.
According to the invention, the logic that manages transition of states in the spanning tree protocol entity identifies ports which are changing to a designated port role, and issues a message on such ports informing the downstream port that the issuing port is able to assume a forwarding state. The logic in the preferred embodiment begins the standard delay timer for entry into the listening state and then the learning state, prior to assuming the forwarding state. However, when a reply from the downstream port is received, the issuing port reacts by changing immediately to the forwarding state without continuing to await expiration of the delay timer and without traversing transitional listening and learning states.
A downstream port which receives a message from an upstream port indicating that it is able to assume a forwarding state reacts by ensuring that no loop will be formed by the change in state of the upstream port. In one embodiment, the downstream port changes the state of all of the designated ports that were recently root ports (e.g. designated ports which were root ports within two times the forward delay time for a typical network) on the protocol entity to a blocking state, and then issues messages downstream indicating that such designated ports are ready to resume the forwarding state. The designated ports on the downstream protocol entity await a reply from ports further downstream. In this way, loops are blocked step-by-step through the network, as the topology of the tree settles.
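The exchange between an upstream designated port and its downstream neighbor can be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed implementation: the message names `READY` and `ALLOW`, the `Port` class, and the handler names are all hypothetical, and the sketch assumes the downstream bridge replies as soon as it has blocked its recently-root designated ports.

```python
from dataclasses import dataclass, field

READY = "ready-to-forward"   # hypothetical name: upstream port can assume forwarding
ALLOW = "allow-forwarding"   # hypothetical name: downstream reply permitting it


@dataclass
class Port:
    name: str
    role: str                    # "root", "designated", or "alternate"
    state: str = "blocking"
    recently_root: bool = False  # was a root port within about 2 x forward delay
    sent: list = field(default_factory=list)

    def send(self, message):
        self.sent.append(message)  # stands in for transmitting a BPDU


def propose(port):
    """Upstream side: a port changing to the designated role announces it."""
    port.role = "designated"
    port.state = "listening"   # the standard delay timer path starts as a fallback
    port.send(READY)


def on_allow(port):
    """Upstream side: a reply arrived, so skip the rest of the timer path."""
    if port.role == "designated":
        port.state = "forwarding"


def on_ready(receiving_port, bridge_ports):
    """Downstream side: block possible loops, pass the request on, then reply."""
    for p in bridge_ports:
        if p is not receiving_port and p.role == "designated" and p.recently_root:
            p.state = "blocking"
            p.send(READY)      # this port now waits for its own downstream reply
    receiving_port.send(ALLOW)
```

In this sketch a port that never receives `ALLOW`, for example because its neighbor is a legacy bridge, simply stays on the timer path, which matches the fallback behavior described for the preferred embodiment.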
According to one aspect of the invention, the protocol entities include logic which manages the transition of states for a particular port changing from an alternate port role to a root port role by causing transition from the blocking state to the forwarding state without requiring satisfaction of a condition of a transitional state or states. The transitional states which are skipped in the spanning tree standard, for the alternate port transitions to the root port role, are the transitional listening and learning states.
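A minimal sketch of this root port rule follows. The function name and the dictionary representation of port states are hypothetical, and the sketch assumes the previous root port is made blocking before the new one forwards, in line with the immediate blocking of no-longer-connected ports described earlier.

```python
def move_root_port(states, old_root, new_root):
    """Return the port states after the root role moves to an alternate port.

    `states` maps a port name to "blocking" or "forwarding". The new root
    port goes straight from blocking to forwarding, with no listening or
    learning interval in between.
    """
    updated = dict(states)             # leave the caller's mapping untouched
    updated[old_root] = "blocking"     # previous root port stops forwarding
    updated[new_root] = "forwarding"   # transitional states are skipped
    return updated


before = {"p1": "forwarding", "p2": "blocking"}
after = move_root_port(before, old_root="p1", new_root="p2")
```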
In one embodiment, the invention provides a network device for a network comprising a plurality of local area network segments. The device comprises a plurality of ports coupled to segments in the network. Topology management resources manage the plurality of ports according to a spanning tree algorithm to set an active topology. The topology management resources include memory storing parameters for specifying the active topology. The parameters include information for identification of a root of the network, identification of a port in the plurality of ports for a root port role to be used as a path to the root, identification of one or more ports for the designated port roles to be used as paths between the root and respective segments coupled to the one or more ports, and identification of one or more ports in the plurality of ports for alternate port roles. The topology management resources also include logic to compute states for ports in the plurality of ports in response to the parameters. Ports in the root port role are placed in a forwarding state. Ports in the designated port roles are placed in a forwarding state. Ports in the alternate port roles are placed in a blocking state. The topology management resources further include logic to manage transition of states of the ports in response to the changing of the active topology, following the rules for designated ports and root ports described above.
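The stored parameters and the role-to-state computation described in this embodiment might be modeled as below. The class, field, and function names are hypothetical, and the root identifier is shown as a simple tuple purely for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    ROOT = "root"
    DESIGNATED = "designated"
    ALTERNATE = "alternate"


class State(Enum):
    FORWARDING = "forwarding"
    BLOCKING = "blocking"


@dataclass
class TopologyParameters:
    """Parameters held in memory that specify the active topology."""
    root_id: tuple                                   # identifies the root of the network
    port_roles: dict = field(default_factory=dict)   # port name -> Role

    def root_port(self):
        """Return the name of the port holding the root port role, if any."""
        for port, role in self.port_roles.items():
            if role is Role.ROOT:
                return port
        return None


# Steady-state mapping stated in the text: root and designated ports
# forward, alternate ports block.
STEADY_STATE = {
    Role.ROOT: State.FORWARDING,
    Role.DESIGNATED: State.FORWARDING,
    Role.ALTERNATE: State.BLOCKING,
}


def compute_states(params):
    """Compute the settled state of every port from its role."""
    return {port: STEADY_STATE[role] for port, role in params.port_roles.items()}


params = TopologyParameters(
    root_id=(32768, "00:a0:c9:00:00:01"),
    port_roles={"p1": Role.ROOT, "p2": Role.DESIGNATED, "p3": Role.ALTERNATE},
)
settled = compute_states(params)
```

The mapping covers only the settled topology; the transition rules for ports whose role is changing are handled by separate logic.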
In one embodiment, the invention is an improvement of the IEEE Standard 802.1D spanning tree protocol. Messages traded among the protocol entities in the standard, as enhanced according to the present invention, are bridge protocol data units which are modified to include flags indicating that the issuing upstream port is ready to assume the forwarding role, and indicating that the issuing downstream port is ready to allow its upstream port to assume the forwarding role.
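One way to picture the added flags is as two extra bits in the single flags octet of a configuration BPDU. In the sketch below, only the topology change bits (0x01 and 0x80) reflect the 802.1D definition; the names and bit positions chosen for the two new flags are assumptions made for illustration and are not specified by the text.

```python
from enum import IntFlag


class BpduFlags(IntFlag):
    TOPOLOGY_CHANGE = 0x01       # defined by 802.1D
    TOPOLOGY_CHANGE_ACK = 0x80   # defined by 802.1D
    READY_TO_FORWARD = 0x02      # hypothetical bit: upstream port is ready to forward
    ALLOW_FORWARDING = 0x40      # hypothetical bit: downstream port allows it


def encode_flags(flags):
    """Pack the flags into the one-octet field carried in the BPDU."""
    return bytes([int(flags)])


def decode_flags(octet):
    """Recover the flag set from the one-octet field."""
    return BpduFlags(octet[0])


wire = encode_flags(BpduFlags.READY_TO_FORWARD | BpduFlags.TOPOLOGY_CHANGE)
received = decode_flags(wire)
```

A receiver would then test membership, for example `BpduFlags.READY_TO_FORWARD in received`, to decide whether to run the downstream handling.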
The present invention is particularly suitable for use with devices that are interconnected by point-to-point communication links. The invention may be extended to devices which are interconnected by shared media, with the addition of techniques that will allow devices which are attached to agree on the designated bridge before the transition is allowed.
The present invention allows smooth migration from a legacy network based on the prior art spanning tree protocol to a highly available network without significant additional administrative overhead. Thus, the spanning tree root port can be moved, and can start forwarding frames immediately if the previous root port no longer forwards frames, such as in the case of a physical link failure. Ports becoming designated move to the forwarding state based on an exchange of messages with neighboring devices. The improvement of the present invention is fully compatible with existing standard switches.