1. Field of the Invention
The present disclosure relates generally to spanning tree protocols, and more particularly to systems and methods for implementing a spanning tree protocol that is distributed among participating elements of a packet switch.
2. Description of Related Art
In packet networks, packet switches are often interconnected in hierarchical arrangements such as the exemplary network 100 shown in FIG. 1. Network 100 includes eight packet switches BR1 to BR8. Each switch has multiple ports (e.g., ports P11-P14 on BR1, ports P21-P24 on BR2, etc.) that connect to ports on other switches in the network via links (the links may be, e.g., copper cable, fiber optic links, wireless links, etc.). The network could include other switches as well, and the switches generally include ports other than those shown (some large switches have many hundreds to over a thousand ports). One or more of the switches could also serve as a packet router, and other destination devices such as computer end stations, servers, routers, etc., also connect to the network, but are not shown. Although point-to-point links are shown, some media may allow multiple access, e.g., more than two devices sharing a link.
To provide redundancy, a network such as network 100 purposely includes multiple possible paths that a packet could follow to move from one point in the network to another. Generally, the switches must agree on only one of these paths for each packet, or the packet may loop and/or be dropped without ever being delivered to its destination. When the switches all agree on a set of such paths for all traffic, the network is said to have “converged.”
Spanning Tree Protocols (STPs) are often used in computer networks to maintain a reconfigurable loop-free network topology within a group of interconnected switches. Such protocols not only allow a network to initially converge, but also provide mechanisms for the network to reconverge to a new topology when links or switches fail or are added to a network. One well-known example of an STP, Rapid Spanning Tree Protocol (RSTP), is described in the IEEE 802.1D-2004 specification, published by the Institute of Electrical and Electronics Engineers and incorporated herein by reference. Some features and terminology from the RSTP specification that will be useful in the following description are introduced here in conjunction with FIG. 1.
According to RSTP, each switch that participates in a spanning tree is assigned a bridge priority, such that the operating switch having the best bridge priority (numerically lowest priority value) will be elected by the spanning tree as the root bridge. Each switch initially assumes that it is the root bridge. The switches advertise their bridge priorities to each other across the bridge-to-bridge links using protocol-specific packets known as Bridge Protocol Data Units (BPDUs). When a switch receives a BPDU advertising a bridge priority better (numerically lower) than the switch's own bridge priority, that switch knows that it will not be elected as the root bridge; conversely, if a switch receives no BPDU advertising a better bridge priority, it continues to act as if it is the root bridge. Assuming that the switches in network configuration 100 are assigned a bridge priority in the same order as their respective numerical suffixes, all switches in the network will recognize BR1 as having the best bridge priority, and BR1 will become the root bridge for the configuration.
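The election rule described above can be sketched in a few lines of Python. This is only a simplified illustration and not the RSTP state machines: the function name elect_root, the priority values, and the four-switch line topology in the usage note are hypothetical, and bridge priorities are assumed to be unique so that no tie-breaker is needed.

```python
def elect_root(priorities, links):
    """Return, for each switch, the switch it ends up believing is the root.

    priorities: dict mapping switch name -> bridge priority (lower is better).
    links: iterable of (switch_a, switch_b) pairs that exchange BPDUs.
    """
    # Every switch starts out assuming that it is the root bridge.
    believed_root = {name: name for name in priorities}
    changed = True
    while changed:
        changed = False
        for a, b in links:
            # Each end of a link advertises its current root belief to the other.
            for sender, receiver in ((a, b), (b, a)):
                advertised = believed_root[sender]
                current = believed_root[receiver]
                # A numerically lower priority value is a better priority.
                if priorities[advertised] < priorities[current]:
                    believed_root[receiver] = advertised
                    changed = True
    return believed_root
```

For example, calling elect_root with priorities {"BR1": 4096, "BR2": 8192, "BR3": 12288, "BR4": 16384} and links BR1-BR2, BR2-BR3, and BR3-BR4 leaves every switch believing that BR1 is the root.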
The ports of the bridges are each assigned a port role by the protocol. Through the interchange of BPDUs, the switches also learn which one, if any, of their ports will assume the role of a root port, i.e., the port that is closest to the root bridge in terms of port cost. Assuming that all links shown in each tier of network 100 run at the same data rate, and thus that all ports on the same tier have the same single-link port cost, the root ports that will be selected by BR2-BR8 based on port cost are indicated by open circles adjacent the ports. In cases where two ports have the same port cost to reach the root bridge (for instance ports P23 and P24 on switch BR2), the port with the numerically lowest port number is selected as the root port. The root ports selected in network 100 include P23, P31, P41, P51, P61, P71, and P81.
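The root port rule, as simplified in the preceding paragraph, amounts to taking a minimum over (cost, port number) pairs. The sketch below assumes that each candidate port already knows its total cost to reach the root; the function name select_root_port and the cost values are hypothetical, and the fuller comparison that RSTP performs on received BPDU contents is not modeled.

```python
def select_root_port(candidates):
    """Pick the root port from (port_number, cost_to_root) pairs.

    Returns None when there are no candidates, as on the root bridge itself.
    """
    if not candidates:
        return None
    # Lowest cost wins; equal costs are broken by the lowest port number.
    best = min(candidates, key=lambda c: (c[1], c[0]))
    return best[0]
```

With ports 23 and 24 both reporting the same cost, as for P23 and P24 on BR2, select_root_port([(24, 10), (23, 10)]) returns 23.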
The interchange of BPDUs also allows the switches to assign designated ports for each network segment—a designated port role is assigned to the port capable of sending the best BPDUs on the segment to which it is connected, where “best” generally means smallest port cost, with bridge priority/port priority used to break ties. The designated ports that will be selected by BR1-BR8 are indicated by black circles adjacent the ports. Thus all ports on BR1 connected to other switches in the network (ports P11, P12, P13, and P14) become designated ports, as their port cost to reach the root bridge is zero. The other designated ports selected in the network include P21, P22, P33-36, P43-46, P53-55, P63-65, P73-75, and P83-85.
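A designated port can be chosen per segment with the same kind of ordered comparison. The following is a rough sketch under the simplified ordering stated above (cost first, then bridge priority, then port priority); the Candidate type, its field names, and all numeric values are hypothetical.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    port: str             # port attached to the segment
    root_path_cost: int   # cost from this port's bridge to the root
    bridge_priority: int  # lower is better
    port_priority: int    # lower is better


def select_designated_port(segment_ports):
    """Return the name of the port that can send the best BPDUs on a segment."""
    best = min(
        segment_ports,
        key=lambda c: (c.root_path_cost, c.bridge_priority, c.port_priority),
    )
    return best.port
```

On a segment joining a root bridge port (cost zero) to a port on another switch, the root bridge port is selected, which is why all connected ports on BR1 become designated ports.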
The remaining ports shown in network 100—in this case ports P24, P32, P42, P52, P62, P72, and P82—are neither root nor designated ports. These ports assume a role of alternate ports and are illustrated in FIG. 1 with a bar drawn through the network connection near the port. Of course, a port can also become a disabled port if it malfunctions, the link and/or device to which the link connects malfunctions, or if manually disabled. A disabled port does not participate in any spanning tree calculations.
In addition to a port role, each port in the spanning tree also progresses through different port states as the spanning tree converges. Ports progress from a blocking state to a learning state to a forwarding state if they are root or designated ports, and remain in a blocking state if they are alternate ports. Blocking ports do not learn MAC (Media Access Control) addresses from received frames and do not forward frames. Based on these definitions, it can be verified that traffic received on each port of network 100 has a single path comprising only forwarding ports that it may follow to reach any other device (or network egress port) in the network.
One drawback of the FIG. 1 configuration is that it defines a common spanning tree for all traffic. The alternate/blocked ports in the topology represent unused bandwidth that could be useful, such as the bandwidth idle at ports P32, P42, and P24, were there a way to use it. With the advent of Virtual LANs (VLANs), this became possible for switches capable of using different Spanning Trees for different VLANs. Although other similar solutions exist, the most common multi-spanning tree solution is Multiple Spanning Tree Protocol (MSTP), defined in IEEE 802.1Q-2005, and incorporated herein by reference. MSTP allows a set of switches in a defined region to run multiple spanning tree overlays or instances on the same group of switches. In some of the spanning tree instances, a given port will be blocking, while in others, the same port will be forwarding. Thus traffic on different VLANs may follow different paths between the same two points in the network, even though traffic on each VLAN is confined to a single path as described for RSTP. This allows the network to achieve at least a rudimentary form of load balancing and utilize all ports.
In MSTP, each VLAN is assigned to one of 64 logical spanning tree instances. This is accomplished by populating a 4096-element table on each switch with an association between each of the 4096 possible VLAN IDs and one of the 64 logical spanning tree instances. When a switch receives a VLAN-tagged packet, it reads the VLAN ID for the packet and refers to the MSTP table (or a table derived from the MSTP table) to determine the appropriate spanning tree instance (e.g., forwarding port(s)) for that VLAN ID.
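The table lookup described above can be sketched as a flat 4096-entry list indexed by VLAN ID. The helper names build_vlan_table and instance_for_vlan are hypothetical, and the sketch assumes that instance 0 is the default for any VLAN ID that is not explicitly assigned.

```python
NUM_VLAN_IDS = 4096   # one table entry per possible VLAN ID
NUM_INSTANCES = 64    # logical spanning tree instances, numbered 0..63


def build_vlan_table(assignments, default_instance=0):
    """Build the VLAN-to-instance table from a {vlan_id: instance} dict."""
    table = [default_instance] * NUM_VLAN_IDS
    for vlan_id, instance in assignments.items():
        if not 0 <= vlan_id < NUM_VLAN_IDS:
            raise ValueError(f"VLAN ID out of range: {vlan_id}")
        if not 0 <= instance < NUM_INSTANCES:
            raise ValueError(f"instance out of range: {instance}")
        table[vlan_id] = instance
    return table


def instance_for_vlan(table, vlan_id):
    """Look up the spanning tree instance for a received packet's VLAN ID."""
    return table[vlan_id]
```

A switch would consult instance_for_vlan for each VLAN-tagged packet and then apply the port states of the returned instance.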
Within an MSTP region, only one set of BPDUs is propagated by each switch. The BPDU format for MSTP contains a fixed first section, followed by a variable number of configuration messages, one per MST instance. The BPDU format is as follows:
Protocol Identifier
Protocol Version Identifier
BPDU Type
CIST Flags
CIST Root Identifier
CIST External Path Cost
CIST Regional Root Identifier
CIST Port Identifier
Message Age
Max Age
Hello Time
Forward Delay
Version 1 Length = 0
Version 3 Length
MST Configuration Identifier
CIST Internal Root Path Cost
CIST Bridge Identifier
CIST Remaining Hops
followed by a variable number of configuration messages of the following format:
MSTI Flags
MSTI Regional Root Identifier
MSTI Internal Root Path Cost
MSTI Bridge Priority
MSTI Port Priority
MSTI Remaining Hops
The fixed first section contains information that is used to establish a Common and Internal Spanning Tree (CIST) that will be used within the MSTP region as a default spanning tree for traffic not otherwise assigned, and represents the MSTP region as a virtual bridge to the outside world. Many of these fields correspond to RSTP fields in BPDUs used to establish an RSTP Spanning Tree. The MST configuration identifier field, however, identifies the MST group by an alphanumeric configuration name, a configuration revision number, and a digest value. Each switch calculates its MSTP digest value by hashing its VLAN-to-MSTP-instance mapping table with a known hash function, the “HMAC-MD5” algorithm described in Internet Engineering Task Force document RFC 2104. The digest value transmitted within a BPDU must match the internally-calculated digest value in order for a switch to recognize the BPDU as one originating from its MSTP region. Thus if the VLAN mapping tables for two connected switches do not match exactly, the two switches will transmit BPDUs with different digest values. Consequently, the two switches will not cooperate in a common MSTP region, and each assumes that the port on which it receives the differing digest value (or no digest value) is at the MSTP region boundary.
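The digest comparison can be illustrated with Python's standard hmac and hashlib modules. This is a sketch only: the key below is a placeholder and not the key defined by the standard, the two-bytes-per-VLAN serialization is an assumption made for illustration, and the helper names mst_digest, config_identifier, and same_region are hypothetical.

```python
import hashlib
import hmac
import struct

# Placeholder key for illustration only; NOT the key specified by the standard.
PLACEHOLDER_KEY = bytes(16)


def mst_digest(vlan_table, key=PLACEHOLDER_KEY):
    """HMAC-MD5 over the VLAN-to-instance table (assumed 2 bytes per entry)."""
    packed = struct.pack(f">{len(vlan_table)}H", *vlan_table)
    return hmac.new(key, packed, hashlib.md5).digest()


def config_identifier(name, revision, vlan_table):
    """Bundle the three parts of the MST configuration identifier."""
    return (name, revision, mst_digest(vlan_table))


def same_region(local_id, received_id):
    """True only if name, revision, and digest all match exactly.

    A received_id of None models a BPDU that carries no digest at all.
    """
    return received_id is not None and local_id == received_id
```

Two switches whose tables differ in even one VLAN entry produce different digests, so same_region returns False and each treats the receiving port as a region boundary.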
Assuming that the MST configuration identifier digest matches a switch's digest, the switch will participate in establishing a CIST for the region and a number of MST Instances (MSTIs) equal to the number of MSTI configuration messages. For each MSTI, the corresponding MSTI Configuration Message contains bridge and port priorities used to calculate a spanning tree for that instance. By assigning different bridge and/or port priorities in different MSTIs, the MSTIs may be designed to elect different root bridges and may each block and forward on different port combinations to achieve a more load-balanced topology.
A switch may run standard STP, RSTP, MSTP, or some other variant of these, or even multiple STP processes and/or varieties for different ports. Generally, however, all STP variants follow the same general bridge framework 200, shown in FIG. 2. Simplified to a two-layer bridge, a switch comprises MAC (Media Access Control) entities for each port (ME1 and ME2), physical entities for each port (PHY1 and PHY2), a MAC relay entity (MRE), and higher-layer entities such as the bridge protocol entity BPE.
MAC entities ME1 and ME2 receive framed packets from their respective physical layer devices, and transmit framed packets on their respective physical layer devices. The logical link control (LLC) sublayer of each MAC entity is responsible for multiplexing/demultiplexing packets with protocols corresponding to the higher-layer entities with the regular traffic that passes through the bridge. Thus LLC1 and LLC2, respectively, pass spanning tree protocol BPDUs from the frame receive functions to bridge protocol entity BPE, and from bridge protocol entity BPE to the frame transmit functions.
The bridge protocol entity BPE operates, e.g., as described above for RSTP and/or MSTP, to determine port roles and port states for the ports represented by ME1, ME2, and any other switch ports in the device. The bridge protocol entity uses this information to set port state INFO1 and INFO2 for each port in the MAC relay entity MRE. If the port state indicates the port is enabled but discarding, frames passed from the MAC entity to the MAC relay entity are dropped. If the port state indicates the port is forwarding, the MAC relay entity uses a filtering database to look up one or more forwarding ports for a frame passed to it from a MAC entity, using, e.g., a destination MAC address or VLAN ID to match a filtering database entry. As frames pass through the MAC relay entity, a learning process associates the frames' source MAC addresses with the ports on which the frames were received in order to update the filtering database and learn new station IDs.
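The relay behavior described above can be sketched as a small class that holds per-port state and a filtering database. The class name MacRelay, its methods, and the state constants are hypothetical, and two behaviors are assumptions rather than statements from the description: frames with an unknown destination are flooded to all other forwarding ports, and source addresses are learned in both the learning and forwarding states.

```python
DISCARDING = "discarding"
LEARNING = "learning"
FORWARDING = "forwarding"


class MacRelay:
    def __init__(self, port_states):
        # Port states as set by the bridge protocol entity.
        self.port_states = dict(port_states)
        # Filtering database: (vlan_id, mac_address) -> port.
        self.filtering_db = {}

    def relay(self, in_port, src_mac, dst_mac, vlan_id):
        """Return the list of ports on which the frame is transmitted."""
        state = self.port_states[in_port]
        if state == DISCARDING:
            return []  # dropped without learning
        # Learning process: associate the source address with the ingress port.
        self.filtering_db[(vlan_id, src_mac)] = in_port
        if state != FORWARDING:
            return []  # a learning port learns but does not forward
        known = self.filtering_db.get((vlan_id, dst_mac))
        # Assumption: flood unknown destinations to every other port.
        candidates = [known] if known is not None else sorted(self.port_states)
        return [
            p for p in candidates
            if p != in_port and self.port_states[p] == FORWARDING
        ]
```

With ports 1 and 2 forwarding and port 3 discarding, a first frame from port 1 to an unknown address goes out port 2 only, and the reply from port 2 is sent directly to port 1 because the first frame's source address was learned.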