1. Field of the Invention
The present invention relates generally to computer networks, and more specifically, to a method and apparatus for quickly and efficiently resuming the forwarding of network messages despite network changes and failures.
2. Background Information
A computer network typically comprises a plurality of interconnected entities. An entity may consist of any device, such as a computer or end station, that “sources” (i.e., transmits) or “sinks” (i.e., receives) data frames. A common type of computer network is a local area network (“LAN”) which typically refers to a privately owned network within a single building or campus. LANs typically employ a data communication protocol (LAN standard), such as Ethernet, FDDI or token ring, that defines the functions performed by the data link and physical layers of a communications architecture (i.e., a protocol stack). In many instances, several LANs may be interconnected by point-to-point links, microwave transceivers, satellite hook-ups, etc. to form a wide area network (“WAN”) or intranet that may span an entire country or continent.
One or more intermediate network devices are often used to couple LANs together and allow the corresponding entities to exchange information. For example, a bridge may be used to provide a “bridging” function between two or more LANs. Alternatively, a switch may be utilized to provide a “switching” function for transferring information between a plurality of LANs or end stations. Typically, the bridge or switch is a computer and includes a plurality of ports that couple the device to the LANs or end stations. The switching function includes receiving data from a sending entity at a source port and transferring that data to at least one destination port for forwarding to the receiving entity.
Switches and bridges typically learn which destination port to use in order to reach a particular entity by noting on which source port the last message originating from that entity was received. This information is then stored by the bridge in a block of memory referred to as a filtering database. Thereafter, when a message addressed to a given entity is received on a source port, the bridge looks up the entity in its filtering database and identifies the appropriate destination port to reach that entity. If no destination port is identified in the filtering database, the bridge floods the message out all ports, except the port on which the message was received. Messages addressed to broadcast or multicast addresses are also flooded.
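The learn, look up and flood behavior described above can be illustrated with a minimal sketch. The class name `LearningBridge`, its method names and the string addresses are hypothetical and chosen only for illustration. The rule of not sending a frame back out the port on which it arrived when the destination is known to be on that port is an assumption that is not stated in the text above.

```python
from typing import Dict, List


class LearningBridge:
    """Toy model of a bridge that learns source addresses and floods unknowns."""

    def __init__(self, ports: List[int]) -> None:
        self.ports = list(ports)
        # Filtering database: entity address -> port on which it was last seen.
        self.filtering_db: Dict[str, int] = {}

    def receive(self, src: str, dst: str, in_port: int,
                broadcast: bool = False) -> List[int]:
        """Process one frame and return the ports it is sent out of."""
        # Learn: note the source port of the last message from this entity.
        self.filtering_db[src] = in_port

        # Broadcast and multicast frames are always flooded.
        out_port = None if broadcast else self.filtering_db.get(dst)
        if out_port is None:
            # Unknown destination: flood out all ports except the source port.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Assumption (not in the text above): destination is reachable
            # through the receiving port, so the frame is not forwarded.
            return []
        return [out_port]
```

For example, on a three-port bridge, a first frame from entity "A" to an unknown entity "B" received on port 1 is flooded to ports 2 and 3. Once "B" replies on port 2, later frames from "A" to "B" are sent to port 2 only.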
Additionally, most computer networks are either partially or fully meshed. That is, they include redundant communications paths so that a failure of any given link or device does not isolate any portion of the network. The existence of redundant links, however, may cause the formation of circuitous paths or “loops” within the network. Loops are highly undesirable because data frames may traverse the loops indefinitely. Furthermore, because switches and bridges replicate (i.e., flood) frames whose destination port is unknown or which are directed to broadcast or multicast addresses, the existence of loops may cause a proliferation of data frames so large that the network becomes overwhelmed.
Spanning Tree Protocol
To avoid the formation of loops, most bridges and switches execute a spanning tree protocol which allows them to calculate an active network topology that is loop-free (i.e., a tree) and yet connects every pair of LANs within the network (i.e., the tree is spanning). The Institute of Electrical and Electronics Engineers (IEEE) has promulgated a standard (the 802.1D standard) that defines a spanning tree protocol to be executed by 802.1D compatible devices. In general, by executing the 802.1D spanning tree protocol, bridges elect a single bridge within the bridged network to be the “root” bridge. The 802.1D standard takes advantage of the fact that each bridge has a unique numerical identifier (bridge ID) by specifying that the root is the bridge with the lowest bridge ID. In addition, for each LAN coupled to more than one bridge, only one (the “designated bridge”) is elected to forward frames to and from the respective LAN. The designated bridge is typically the one closest to the root. Each bridge also selects one port (its “root port”) which gives the lowest cost path to the root. The root ports and designated bridge ports are selected for inclusion in the active topology and are placed in a forwarding state so that data frames may be forwarded to and from these ports and thus onto the corresponding paths or links of the network. Ports not included within the active topology are placed in a blocking state. When a port is in the blocking state, data frames will not be forwarded to or received from the port. A network administrator may also exclude a port from the spanning tree by placing it in a disabled state.
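The two selection rules described above, lowest bridge ID for the root and lowest cost path for the root port, can be sketched as follows. The function names are hypothetical, bridge IDs are modeled as plain integers, and breaking cost ties by the lowest port number is a simplifying assumption made for this sketch.

```python
from typing import Dict, Iterable


def elect_root(bridge_ids: Iterable[int]) -> int:
    """Return the root: the bridge with the lowest numerical bridge ID."""
    return min(bridge_ids)


def select_root_port(cost_to_root: Dict[int, int]) -> int:
    """Return the port offering the lowest cost path to the root.

    cost_to_root maps a port number to the path cost reachable through it.
    Assumption: ties are broken in favor of the lowest port number.
    """
    return min(cost_to_root, key=lambda port: (cost_to_root[port], port))
```

With bridge IDs 30, 10 and 20, the bridge with ID 10 becomes the root. A bridge whose ports 1, 2 and 3 offer path costs 19, 4 and 4 would select port 2 as its root port under the tie-break assumed here.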
To obtain the information necessary to run the spanning tree protocol, bridges exchange special messages called configuration bridge protocol data unit (BPDU) messages. More specifically, upon start-up, each bridge initially assumes itself to be the root and transmits BPDU messages accordingly. Upon receipt of a BPDU message from a neighboring device, the receiving bridge examines its contents and compares them with similar information (e.g., assumed root and lowest root path cost) that the bridge has stored in memory. If the information from the received BPDU is “better” than the stored information, the bridge adopts the better information and uses it in the BPDUs that it sends (adding the cost associated with the receiving port to the root path cost) from its ports, other than the port on which the “better” information was received. Although BPDU messages are not forwarded by bridges, the identifier of the root is eventually propagated to and adopted by all bridges as described above, allowing them to select their root port and any designated port(s).
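The “better” comparison and the adoption step can be sketched as shown below. The names `ConfigInfo`, `is_better` and `info_to_send` are hypothetical. The text above mentions only the assumed root and the root path cost; using the sending bridge ID and port ID as further tie-breakers, with lower values preferred in every field, is an assumption based on the conventional 802.1D ordering.

```python
from typing import NamedTuple


class ConfigInfo(NamedTuple):
    """Information carried in (or stored from) a configuration BPDU."""
    root_id: int         # bridge assumed to be the root
    root_path_cost: int  # cost to reach that root
    bridge_id: int       # bridge that sent the BPDU (assumed tie-breaker)
    port_id: int         # port the BPDU was sent from (assumed tie-breaker)


def is_better(received: ConfigInfo, stored: ConfigInfo) -> bool:
    """Lower is better; fields are compared in order (tuple comparison)."""
    return received < stored


def info_to_send(adopted: ConfigInfo, receiving_port_cost: int,
                 my_bridge_id: int, tx_port_id: int) -> ConfigInfo:
    """Build the information this bridge advertises after adopting a BPDU.

    The cost of the receiving port is added to the root path cost.
    """
    return ConfigInfo(adopted.root_id,
                      adopted.root_path_cost + receiving_port_cost,
                      my_bridge_id, tx_port_id)
```

In this sketch, a bridge with ID 5 that initially assumes itself to be the root stores `ConfigInfo(5, 0, 5, 1)`. A received `ConfigInfo(2, 10, 3, 1)` is better because its root ID is lower, so the bridge adopts root 2 and advertises a root path cost of 10 plus the cost of the receiving port.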
In order to adapt the active topology to changes and failures, the root periodically (e.g., every hello time) transmits BPDU messages. The default hello time is 2 seconds. In response to receiving BPDUs on their root ports, bridges transmit their own BPDUs from their designated ports, if any. Thus, every two seconds BPDUs are propagated throughout the bridged network, confirming the active topology. That is, normally, each bridge replaces its stored BPDU information every hello time, thereby preventing it from being discarded and maintaining the current active topology. If a bridge stops receiving BPDU messages on a given port (indicating a possible link or device failure), it will continue to increment a respective message age value until it reaches the maximum age threshold. The bridge will then discard the stored BPDU information and proceed to recalculate the root, root path cost and root port by transmitting BPDU messages utilizing the next best information it has. The maximum age value used within the bridged network is typically set by the root, which enters the appropriate value in its BPDU messages.
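The aging behavior described above can be sketched as a small timer model. The class and method names are hypothetical, time is modeled in whole seconds, and the maximum age of 20 seconds used in the usage note is an illustrative value that does not appear in the text above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredInfo:
    root_id: int
    message_age: int  # seconds since the information left the root
    max_age: int      # threshold set by the root, carried in the BPDU


class PortInfoTimer:
    """Tracks the BPDU information stored for one port."""

    def __init__(self) -> None:
        self.stored: Optional[StoredInfo] = None

    def on_bpdu(self, root_id: int, message_age: int, max_age: int) -> None:
        # Each received BPDU replaces the stored information, which
        # restarts aging from the message age carried in the BPDU.
        self.stored = StoredInfo(root_id, message_age, max_age)

    def tick(self, seconds: int = 1) -> bool:
        """Advance time; return True if the stored information was discarded."""
        if self.stored is None:
            return False
        self.stored.message_age += seconds
        if self.stored.message_age >= self.stored.max_age:
            # Maximum age reached: discard and let the bridge recalculate.
            self.stored = None
            return True
        return False
```

With a hello time of 2 seconds and an assumed maximum age of 20 seconds, information that is refreshed every 2 seconds never expires, whereas a port that stops receiving BPDUs discards its stored information on the tenth 2-second tick.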
As BPDU information is updated and/or timed-out and the active topology is recalculated, ports may transition from the blocking state to the forwarding state and vice versa. That is, as a result of new BPDU information, a previously blocked port may learn that it should be in the forwarding state (e.g., it is now the root port or a designated port). Rather than transition directly from the blocking state to the forwarding state, the 802.1D standard calls for ports to transition through two intermediate states: a listening state and a learning state. In the listening state, a port waits for information indicating that it should return to the blocking state. If, by the end of a preset time, no such information is received, the port transitions to the learning state. In the learning state, a port still blocks the receiving and forwarding of frames, but received frames are examined and the corresponding location information is stored in the bridge's filtering database. At the end of a second preset time, the port transitions from the learning state to the forwarding state, thereby allowing frames to be forwarded to and from the port. The time spent in each of the listening and the learning states is referred to as the forwarding delay.
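The timed progression from blocking to forwarding can be sketched as a function of elapsed time. The names below are hypothetical. The sketch assumes that the port leaves the blocking state at time zero and that no information arrives telling it to return to blocking. The forwarding delay of 15 seconds in the usage note is an illustrative value not given in the text above.

```python
from enum import Enum


class PortState(Enum):
    BLOCKING = "blocking"
    LISTENING = "listening"
    LEARNING = "learning"
    FORWARDING = "forwarding"


def state_after(elapsed: int, forward_delay: int) -> PortState:
    """State of a port `elapsed` seconds after it left the blocking state.

    Assumes nothing is received that sends the port back to blocking.
    The port spends one forwarding delay in each intermediate state.
    """
    if elapsed < 0:
        return PortState.BLOCKING
    if elapsed < forward_delay:
        return PortState.LISTENING
    if elapsed < 2 * forward_delay:
        return PortState.LEARNING
    return PortState.FORWARDING
```

With an assumed forwarding delay of 15 seconds, the port is listening at 0 seconds, learning at 15 seconds, and forwarding only at 30 seconds, i.e., after two forwarding delays.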
Although the spanning tree protocol provided in the 802.1D standard is able to maintain a loop-free topology despite network changes and failures, recalculation of the active topology can be a time-consuming and processor-intensive task. For example, recalculation of the spanning tree following an intermediate device crash or failure can take approximately thirty seconds. During this time, message delivery is often delayed as ports transition between states. Such delays can have serious consequences for time-sensitive traffic flows, such as voice or video traffic streams.
Rapid Spanning Tree Protocol
Recently, the IEEE promulgated a new standard (the 802.1w standard) that defines a rapid spanning tree protocol (RSTP) to be executed by otherwise 802.1D compatible devices. The RSTP similarly selects one bridge of a bridged network to be the root bridge and defines an active topology that provides complete connectivity among the LANs while severing any loops. Each individual port of each bridge is assigned a port role according to whether the port is to be part of the active topology. The port roles defined by the 802.1w standard include Root, Designated, Alternate and Backup. The bridge port offering the best, e.g., lowest cost, path to the root is assigned the Root Port Role. Each bridge port offering an alternative, e.g., higher cost, path to the root is assigned the Alternate Port Role. Each bridge port providing the lowest cost path from a given LAN is assigned the Designated Port Role, while all other ports coupled to the given LAN in loop-back fashion are assigned the Backup Port Role.
Those ports that have been assigned the Root Port and Designated Port Roles are placed in the forwarding state, while ports assigned the Alternate and Backup Roles are placed in a discarding or blocking state. A port assigned the Root Port Role can be rapidly transitioned to the forwarding state provided that all of the ports assigned the Alternate Port Role are placed in the discarding or blocking state. Similarly, if a failure occurs on the port currently assigned the Root Port Role, a port assigned the Alternate Port Role can be reassigned to the Root Port Role and rapidly transitioned to the forwarding state, provided that the previous root port has been transitioned to the discarding or blocking state. A port assigned the Designated Port Role or a Backup Port that is to be reassigned to the Designated Port Role can be rapidly transitioned to the forwarding state, provided that the roles of the ports of the downstream bridge are consistent with this port being assigned the Designated Port Role. The RSTP provides an explicit handshake to be used by neighboring bridges to confirm that a new designated port can rapidly transition to the forwarding state.
Like bridges running the STP described in the 802.1D standard, bridges running RSTP exchange BPDU messages in order to determine which roles to assign to their ports. The BPDU messages are also utilized in the handshake employed to rapidly transition designated ports to the forwarding state.
FIG. 1 is a block diagram of an RSTP BPDU message 100. The BPDU message 100 includes a BPDU message header 102 compatible with the Media Access Control (MAC) layer of the respective LAN standard. The message header 102 comprises a plurality of fields (not shown), such as a destination address (DA) field and a source address (SA) field. The DA field carries a unique bridge multicast destination address assigned to the spanning tree protocol. Appended to header 102 is a BPDU message area 104 that also contains a number of fields, including a protocol identifier (ID) field 106, a protocol version number field 108, a BPDU type field 110, a flags field 112, a root ID field 114, a root path cost field 116, a bridge ID field 118, a port ID field 120, a message age field 122, a maximum age field 124, a hello time field 126, and a forward delay field 128, among others. The root ID field 114 typically contains the identifier of the bridge assumed to be the root, and the bridge ID field 118 contains the identifier of the bridge sourcing (i.e., sending) the BPDU 100. The root path cost field 116 contains a value representing the cost to reach the assumed root from the port on which the BPDU is sent, and the port ID field 120 contains the identifying number of the port from which the BPDU is sent.
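The fields of the BPDU message area 104 can be sketched as a packable record. The class name `RstpBpdu` and its helper are hypothetical. The field widths, the big-endian byte order, the version and type values of 2, and the encoding of timer values in units of 1/256 second are assumptions drawn from the published 802.1w encoding rather than from the text above. The MAC header 102 and any fields following the forward delay field are omitted.

```python
import struct
from dataclasses import dataclass

# Assumed layout: protocol ID (2), version (1), type (1), flags (1),
# root ID (8), root path cost (4), bridge ID (8), port ID (2),
# message age, max age, hello time, forward delay (2 bytes each).
_LAYOUT = ">HBBB8sI8sHHHHH"


def _to_wire(seconds: float) -> int:
    """Convert seconds to the assumed 1/256-second wire units."""
    return int(round(seconds * 256))


@dataclass
class RstpBpdu:
    flags: int
    root_id: bytes        # 8 bytes
    root_path_cost: int
    bridge_id: bytes      # 8 bytes
    port_id: int
    message_age: float    # seconds
    max_age: float        # seconds
    hello_time: float     # seconds
    forward_delay: float  # seconds
    protocol_id: int = 0
    version: int = 2      # assumed value for RSTP
    bpdu_type: int = 2    # assumed value for RSTP

    def pack(self) -> bytes:
        return struct.pack(
            _LAYOUT, self.protocol_id, self.version, self.bpdu_type,
            self.flags, self.root_id, self.root_path_cost, self.bridge_id,
            self.port_id, _to_wire(self.message_age), _to_wire(self.max_age),
            _to_wire(self.hello_time), _to_wire(self.forward_delay))

    @classmethod
    def unpack(cls, data: bytes) -> "RstpBpdu":
        size = struct.calcsize(_LAYOUT)
        (proto, ver, typ, flags, root, cost, bridge, port,
         age, max_age, hello, fwd) = struct.unpack(_LAYOUT, data[:size])
        return cls(flags, root, cost, bridge, port, age / 256,
                   max_age / 256, hello / 256, fwd / 256, proto, ver, typ)
```

Packing a record and unpacking the result yields an equal record, which gives a quick check that the field order in the sketch matches the order in which the fields are listed above.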
As shown, the flags field 112 carries a plurality of single or multiple bit flags that may be set, e.g., asserted, or cleared, e.g., deasserted. Specifically, the flags field 112 includes a topology change flag 130, a proposal flag 132, a port role flag 134, a learning flag 136, a forwarding flag 138, an agreement flag 140 and a topology change acknowledgment (ACK) flag 142. The learning and forwarding flags 136 and 138 are set to reflect the current port state of the port from which the corresponding BPDU is being sent.
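The flags field 112 can be sketched as a single octet of bit masks. The text above names the flags but not their bit positions; the masks and port role codes below are assumptions taken from the published 802.1w encoding, and the function names are hypothetical.

```python
# Assumed bit positions within the one-octet flags field.
TOPOLOGY_CHANGE = 0x01
PROPOSAL = 0x02
PORT_ROLE_MASK = 0x0C      # two-bit port role flag
PORT_ROLE_SHIFT = 2
LEARNING = 0x10
FORWARDING = 0x20
AGREEMENT = 0x40
TOPOLOGY_CHANGE_ACK = 0x80

# Assumed port role codes carried in the port role flag.
ROLE_UNKNOWN = 0
ROLE_ALTERNATE_OR_BACKUP = 1
ROLE_ROOT = 2
ROLE_DESIGNATED = 3


def build_flags(role: int, *, topology_change: bool = False,
                proposal: bool = False, learning: bool = False,
                forwarding: bool = False, agreement: bool = False,
                topology_change_ack: bool = False) -> int:
    """Assemble the flags octet from a port role and individual flags."""
    flags = (role << PORT_ROLE_SHIFT) & PORT_ROLE_MASK
    for is_set, mask in ((topology_change, TOPOLOGY_CHANGE),
                         (proposal, PROPOSAL),
                         (learning, LEARNING),
                         (forwarding, FORWARDING),
                         (agreement, AGREEMENT),
                         (topology_change_ack, TOPOLOGY_CHANGE_ACK)):
        if is_set:
            flags |= mask
    return flags


def port_role(flags: int) -> int:
    """Extract the port role code from a flags octet."""
    return (flags & PORT_ROLE_MASK) >> PORT_ROLE_SHIFT
```

Under these assumed positions, a BPDU sent from a designated port with only the proposal flag asserted carries a flags value of 0x0E.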
The handshake utilized by adjacent bridges for rapidly transitioning designated ports typically proceeds as follows. When an upstream bridge wishes to rapidly transition a designated port to the forwarding state, it issues a BPDU 100 from that port whose proposal flag 132 is asserted. The port role flag 134 is set to the value associated with the Designated Port Role. In the root ID and root path cost fields 114 and 116, the upstream bridge loads the corresponding information relative to the port from which the BPDU message 100 is to be sent. The upstream bridge then sends the BPDU message which is received at the neighboring downstream bridge.
Assuming the information contained in the BPDU message 100 is equal to or better than that currently stored by the port of the downstream bridge at which the BPDU is received, the downstream bridge asserts “sync” for all of its other bridge ports. Sync is a state machine variable defined by the 802.1w standard. Asserting sync has the effect of causing the downstream bridge to transition all of its designated ports, other than “edge” ports, to the discarding state. An edge port is defined as a port which provides the only connection to a respective LAN, thereby representing an edge of the bridged network. Once the designated ports have been transitioned to the discarding state, the downstream bridge responds to the upstream bridge, typically through its root port, with a BPDU message 100 whose agreement flag 140 is asserted. This notifies the upstream bridge that the downstream bridge is in agreement with the respective port of the upstream bridge being transitioned to the forwarding state.
In addition, the designated port(s) of the downstream bridge request permission from their downstream bridges to rapidly transition back to the forwarding state following the same process. That is, BPDU messages 100 with their proposal flags 132 asserted are sent from these ports. In effect, a “cut” is made in the active topology at the first affected designated port and the cut propagates down from this first designated port through all bridges on the subtree below it, i.e., in a direction away from the root, until the cut reaches the edge of the bridged network.
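The way the handshake and the resulting “cut” propagate away from the root can be sketched as a recursive walk over the subtree. The function name and the event strings are hypothetical. The sketch assumes the proposal is always accepted, models only non-edge designated ports, and orders events sequentially for readability, whereas real bridges act concurrently.

```python
from typing import Dict, List


def propagate_handshake(tree: Dict[str, List[str]], upstream: str) -> List[str]:
    """Return the ordered handshake events in the subtree below `upstream`.

    `tree` maps each bridge to the downstream bridges reached through its
    non-edge designated ports.
    """
    events: List[str] = []

    def propose(up: str, down: str) -> None:
        # Upstream bridge sends a BPDU with the proposal flag asserted.
        events.append(f"{up}->{down}: proposal")
        # Downstream bridge asserts sync: its non-edge designated ports
        # transition to the discarding state.
        for child in tree.get(down, []):
            events.append(f"{down}: port to {child} discarding")
        # Downstream bridge answers with the agreement flag asserted.
        events.append(f"{down}->{up}: agreement")
        events.append(f"{up}: port to {down} forwarding")
        # The cut moves one level further from the root.
        for child in tree.get(down, []):
            propose(down, child)

    for child in tree.get(upstream, []):
        propose(upstream, child)
    return events
```

For a chain of three bridges A, B and C, with A closest to the root, the sketch shows B blocking its port toward C before agreeing with A, and only then proposing to C. The cut therefore moves from the link between A and B to the link between B and C, and ends when it reaches a bridge with no further non-edge designated ports.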