1. Field of the Invention
This invention relates to computer networks and, more specifically, to large-scale networks.
2. Background Information
Many organizations, including businesses, governments and educational institutions, utilize computer networks so that employees and others may share and exchange information and/or resources. A computer network typically comprises a plurality of entities interconnected by means of one or more communications media. An entity may consist of any device, such as a computer, that “sources” (i.e., transmits) or “sinks” (i.e., receives) data frames over the communications media. A common type of computer network is a local area network (“LAN”) which typically refers to a privately owned network within a single building or campus. LANs typically employ a data communication protocol (LAN standard), such as Ethernet, FDDI or token ring, that defines the functions performed by data link and physical layers of a communications architecture (i.e., a protocol stack).
One or more intermediate network devices are often used to couple LANs together and allow the corresponding entities to exchange information. For example, a bridge may be used to provide a “switching” function between two or more LANs or end stations. Typically, the bridge is a computer and includes a plurality of ports that are coupled via LANs either to other bridges, or to end stations such as routers or host computers. Ports used to couple bridges to each other are generally referred to as trunk ports, whereas ports used to couple bridges to end stations are generally referred to as access ports. The bridging function includes receiving data from a sending entity at a source port and transferring that data to at least one destination port for forwarding to one or more receiving entities.
Ethernet
Ethernet is one of the most common LAN standards used today. The original Ethernet transmission standard, referred to as 10Base-T, is capable of transmitting data at 10 Megabits per second (Mbs). In 1995, the Institute of Electrical and Electronics Engineers (IEEE) approved a Fast Ethernet transmission standard, referred to as 100Base-T, which is capable of operating at 100 Mbs. Both 10Base-T and 100Base-T, however, are limited to cable lengths that are less than 100 meters. A committee of the IEEE, known as the 802.3z committee, is currently working on Gigabit Ethernet, also referred to as 1000Base-X (fiber channel) and 1000Base-T (long haul copper), for transmitting data at 1000 Mbs. In addition to the substantially increased transmission rate, Gigabit Ethernet also supports cable lengths of up to 3000 meters. Gigabit Ethernet thus represents a potentially significant increase in the size or range of Ethernet LANs.
Spanning Tree Algorithm
Most computer networks include redundant communications paths so that a failure of any given link does not isolate any portion of the network. Such networks are typically referred to as meshed or partially meshed networks. The existence of redundant links, however, may cause the formation of circuitous paths or “loops” within the network. Loops are highly undesirable because data frames may traverse the loops indefinitely.
Furthermore, some devices, such as bridges or switches, replicate frames whose destination is not known, resulting in a proliferation of data frames along loops. The resulting traffic can overwhelm the network. Other intermediate devices, such as routers, that operate at higher layers within the protocol stack, such as the Internetwork Layer of the Transmission Control Protocol/Internet Protocol (“TCP/IP”) reference model, deliver data frames and learn the addresses of entities on the network differently than most bridges or switches, such that routers are generally not susceptible to sustained looping problems.
To avoid the formation of loops, most bridges and switches execute a spanning tree protocol which allows them to calculate an active network topology that is loop-free (i.e., a tree) and yet connects every pair of LANs within the network (i.e., the tree is spanning). The IEEE has promulgated a standard (IEEE Std. 802.1D-1998™) that defines a spanning tree protocol to be executed by 802.1D compatible devices. In general, by executing the 802.1D spanning tree protocol, bridges elect a single bridge within the bridged network to be the “Root Bridge”. The 802.1D standard takes advantage of the fact that each bridge has a unique numerical identifier (bridge ID) by specifying that the Root Bridge is the bridge with the lowest bridge ID. In addition, for each LAN coupled to any bridge, exactly one port (the “Designated Port”) on one bridge (the “Designated Bridge”) is elected. The Designated Bridge is typically the one closest to the Root Bridge. All ports on the Root Bridge are Designated Ports, and the Root Bridge is the Designated Bridge on all the LANs to which it has ports.
Each non-Root Bridge also selects one port from among its non-Designated Ports (its “Root Port”) which gives the lowest cost path to the Root Bridge. The Root Ports and Designated Ports are selected for inclusion in the active topology and are placed in a forwarding state so that data frames may be forwarded to and from these ports and thus onto the LANs interconnecting the bridges and end stations of the network. Ports not included within the active topology are placed in a blocking state. When a port is in the blocking state, data frames will not be forwarded to or received from the port. A network administrator may also exclude a port from the spanning tree by placing it in a disabled state.
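By way of illustration only, the root election and port selection described above can be sketched in Python. The three-bridge topology, the integer bridge IDs, the link costs and the name compute_active_topology are hypothetical, and the sketch computes the result centrally from a known topology rather than through an exchange of messages between bridges, so it should be read as a simplified model and not as the 802.1D procedure itself.

```python
import heapq

# Hypothetical topology: point-to-point links (bridge_a, bridge_b, cost).
# Bridge IDs are plain integers here; the lowest ID wins the root election.
LINKS = [(10, 20, 4), (10, 30, 4), (20, 30, 4)]


def compute_active_topology(links):
    """Return (root bridge ID, {(bridge, link index): port role}).

    Assumes a connected topology with positive link costs.
    """
    bridges = sorted({b for a, c, _ in links for b in (a, c)})
    root = min(bridges)  # Root Bridge: lowest bridge ID

    # Adjacency list: bridge -> [(neighbor, cost, link index)]
    adj = {b: [] for b in bridges}
    for i, (a, b, cost) in enumerate(links):
        adj[a].append((b, cost, i))
        adj[b].append((a, cost, i))

    # Lowest root path cost for every bridge (Dijkstra's algorithm).
    cost_to_root = {root: 0}
    heap = [(0, root)]
    while heap:
        cost, bridge = heapq.heappop(heap)
        if cost > cost_to_root.get(bridge, float("inf")):
            continue
        for nbr, link_cost, _ in adj[bridge]:
            new_cost = cost + link_cost
            if new_cost < cost_to_root.get(nbr, float("inf")):
                cost_to_root[nbr] = new_cost
                heapq.heappush(heap, (new_cost, nbr))

    # Root Port: the link giving the lowest cost path to the root,
    # with ties broken by the neighbor's bridge ID.
    root_link = {}
    for b in bridges:
        if b != root:
            root_link[b] = min((cost_to_root[n] + c, n, i) for n, c, i in adj[b])[2]

    # Designated Bridge per link: the end closest to the root
    # (ties broken by bridge ID). Remaining ports are blocked.
    roles = {}
    for i, (a, b, _) in enumerate(links):
        designated = min((a, b), key=lambda x: (cost_to_root[x], x))
        for end in (a, b):
            if end == designated:
                roles[(end, i)] = "designated"
            elif root_link.get(end) == i:
                roles[(end, i)] = "root"
            else:
                roles[(end, i)] = "blocked"
    return root, roles


root, roles = compute_active_topology(LINKS)
```

In this triangle of three bridges, exactly one port ends up blocked, which is what removes the loop while leaving every bridge reachable.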
To obtain the information necessary to run the spanning tree protocol, bridges exchange special messages called configuration bridge protocol data unit (BPDU) messages or simply BPDUs. BPDUs carry information, such as assumed root and lowest root path cost, used in computing the active topology. More specifically, upon start-up, each bridge initially assumes itself to be the Root Bridge and transmits BPDUs accordingly. Upon receipt of a BPDU from a neighboring device, its contents are examined and compared with similar information (e.g., assumed root and lowest root path cost) stored by the receiving bridge in memory. If the information from the received BPDU is “better” than the stored information, the bridge adopts the better information and uses it in the BPDUs that it sends (adding the cost associated with the receiving port to the root path cost) from its ports, other than the port on which the “better” information was received. Although BPDUs are not forwarded by bridges, the identifier of the Root Bridge is eventually propagated to and adopted by all bridges as described above, allowing them to select their Root Port and any Designated Port(s).
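The comparison of received and stored BPDU information may be sketched, purely as an example, as follows. The Bpdu tuple, the Bridge class and its method names are hypothetical, only three of the fields carried by a real BPDU are modeled, and “better” is taken to mean a lexicographically lower (root ID, root path cost, sender ID) tuple.

```python
from typing import NamedTuple


class Bpdu(NamedTuple):
    """Simplified BPDU contents; lower tuples compare as 'better'."""
    root_id: int
    root_path_cost: int
    sender_id: int


class Bridge:
    def __init__(self, bridge_id, port_costs):
        self.bridge_id = bridge_id
        self.port_costs = port_costs  # port number -> path cost
        # On start-up the bridge assumes that it is the Root Bridge.
        self.best = Bpdu(bridge_id, 0, bridge_id)
        self.root_port = None

    def receive(self, port, bpdu):
        """Adopt the received information if it is better than what is stored."""
        candidate = Bpdu(
            bpdu.root_id,
            bpdu.root_path_cost + self.port_costs[port],  # add receiving port cost
            bpdu.sender_id,
        )
        if candidate < self.best:
            self.best = candidate
            self.root_port = port
            return True
        return False

    def bpdu_to_send(self):
        """BPDU advertised on ports other than the one that supplied self.best."""
        return Bpdu(self.best.root_id, self.best.root_path_cost, self.bridge_id)


bridge = Bridge(bridge_id=20, port_costs={1: 4, 2: 19})
adopted = bridge.receive(1, Bpdu(root_id=10, root_path_cost=0, sender_id=10))
```

After the call above, the bridge no longer advertises itself as root; it advertises bridge 10 with a root path cost of 4.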
In order to adapt the active topology to changes and failures, the Root Bridge periodically (e.g., every hello time) transmits BPDUs. In response to receiving BPDUs on their Root Ports, bridges transmit their own BPDUs from their Designated Ports, if any. Thus, BPDUs are periodically propagated throughout the bridged network, confirming the active topology. As BPDU information is updated and/or timed-out and the active topology is re-calculated, ports may transition from the blocking state to the forwarding state and vice versa. That is, as a result of new BPDU information, a previously blocked port may learn that it should be in the forwarding state (e.g., it is now the Root Port or a Designated Port).
Rapid Spanning Tree Protocol
Recently, the IEEE issued a new version of the 802.1D standard, known as IEEE Std. 802.1D-2004, that describes a rapid spanning tree protocol (RSTP) to be executed by otherwise 802.1D compatible devices. The RSTP similarly selects one bridge of a bridged network to be the Root Bridge and defines an active topology that provides complete connectivity among the LANs while severing any loops. Each individual port of each bridge is assigned a port role according to whether the port is to be part of the active topology. The port roles defined by the IEEE Std. 802.1D-2004 specification standard include Root, Designated, Alternate and Backup. The bridge port offering the best, e.g., lowest cost, path to the Root Bridge is assigned the Root Port Role. Each bridge port offering an alternative, e.g., higher cost, path to the Root Bridge is assigned the Alternate Port Role. For each LAN, the one port providing the lowest cost path to the Root Bridge from that LAN is assigned the Designated Port Role, while all other ports coupled to the LAN are assigned the Root, Backup or, in some cases, the Alternate Port Role. At the Root Bridge, all ports are assigned the Designated Port Role.
Those ports that have been assigned the Root Port and Designated Port Roles are placed in the forwarding state, while ports assigned the Alternate and Backup Roles are placed in a discarding or blocking state. A port assigned the Root Port Role can be rapidly transitioned to the forwarding state provided that all of the ports assigned the Alternate Port Role are placed in the blocking state. Similarly, if a failure occurs on the port currently assigned the Root Port Role, a port assigned the Alternate Port Role can be reassigned to the Root Port Role and rapidly transitioned to the forwarding state, provided that the previous Root Port has been transitioned to the discarding or blocking state. A port assigned the Designated Port Role or a Backup Port that is to be reassigned to the Designated Port Role can be rapidly transitioned to the forwarding state, provided that the roles of the ports of the downstream bridge are consistent with this port being assigned the Designated Port Role. The RSTP provides an explicit handshake to be used by neighboring bridges to confirm that a new Designated Port can rapidly transition to the forwarding state.
Like the STP described in the IEEE Std. 802.1D-1998 specification standard, bridges running RSTP also exchange BPDUs in order to determine which roles to assign to the bridge's ports. The BPDUs are also utilized in the handshake employed to rapidly transition Designated Ports to the forwarding state.
Virtual Local Area Networks
A computer network may also be segmented into a series of logical networks. For example, U.S. Pat. No. 5,394,402, issued Feb. 28, 1995 to Ross (the “'402 patent”), discloses an arrangement for associating any port of a switch with any particular network segment. Specifically, according to the '402 patent, any number of physical ports of a particular switch may be associated with any number of groups within the switch by using a virtual local area network (VLAN) arrangement that virtually associates the port with a particular VLAN designation. More specifically, the switch or hub associates VLAN designations with its ports and further associates those VLAN designations with messages transmitted from any of the ports to which the VLAN designation has been assigned.
The VLAN designation for each port is stored in a memory portion of the switch such that every time a message is received on a given access port the VLAN designation for that port is associated with the message. Association is accomplished by a flow processing element which looks up the VLAN designation in the memory portion based on the particular access port at which the message was received. In many cases, it may be desirable to interconnect a plurality of these switches in order to extend the VLAN associations of ports in the network. Those entities having the same VLAN designation function as if they are all part of the same LAN. VLAN-configured bridges are specifically configured to prevent message exchanges between parts of the network having different VLAN designations in order to preserve the boundaries of each VLAN. Nonetheless, intermediate network devices operating above L2, such as routers, can relay messages between different VLAN segments.
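The per-port lookup and the VLAN boundary described above might be modeled, as a minimal sketch, with a simple table. The port numbers, the VLAN designations and the function names classify and egress_ports are hypothetical, and the table stands in for the memory portion of the switch.

```python
# Hypothetical stand-in for the switch memory: access port -> VLAN designation.
PORT_VLAN = {1: 10, 2: 10, 3: 20, 4: 20}


def classify(ingress_port):
    """Look up the VLAN designation for a message based on its ingress port."""
    return PORT_VLAN[ingress_port]


def egress_ports(ingress_port):
    """Ports a message may be sent to: same VLAN only, excluding the ingress port."""
    vlan = classify(ingress_port)
    return sorted(
        port
        for port, port_vlan in PORT_VLAN.items()
        if port_vlan == vlan and port != ingress_port
    )
```

A message received on port 1 is associated with VLAN 10 and can reach only port 2; ports 3 and 4 belong to VLAN 20 and are never candidates.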
In addition to the '402 patent, the IEEE promulgated the 802.1Q specification standard for Virtual Bridged Local Area Networks. To preserve VLAN associations of messages transported across trunks or links in VLAN-aware networks, both Ross and the IEEE Std. 802.1Q-1998 specification standard disclose appending a VLAN identifier (VID) field to the corresponding frames. In addition, U.S. Pat. No. 5,742,604 to Edsall et al. (the “'604 patent”), which is commonly owned with the present application, discloses an Interswitch Link (ISL) encapsulation mechanism for efficiently transporting packets or frames, including VLAN-modified frames, between switches while maintaining the VLAN association of the frames. In particular, an ISL link, which may utilize the Fast Ethernet standard, connects ISL interface circuitry disposed at each switch. The transmitting ISL circuitry encapsulates the frame being transported within an ISL header and ISL error detection information, while the ISL receiving circuitry strips off this information and recovers the original frame.
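As one possible illustration of carrying a VID with a frame, the sketch below inserts a four-octet tag after the source address of an Ethernet frame. The tag layout used here (a two-octet type value of 0x8100 followed by a three-bit priority, one reserved bit and a twelve-bit VID) is an assumption drawn from the commonly used 802.1Q tag format and is not taken from the text above; the function names are hypothetical, and recomputation of the frame check sequence is omitted. The ISL encapsulation is not modeled.

```python
import struct

TAG_TYPE = 0x8100  # assumed tag type value identifying a VLAN-tagged frame


def add_vlan_tag(frame: bytes, vid: int, priority: int = 0) -> bytes:
    """Insert a VLAN tag after the 6-octet DA and 6-octet SA of a frame."""
    if not 0 <= vid <= 0x0FFF:
        raise ValueError("VID must fit in 12 bits")
    if not 0 <= priority <= 7:
        raise ValueError("priority must fit in 3 bits")
    tag_control = (priority << 13) | vid
    return frame[:12] + struct.pack("!HH", TAG_TYPE, tag_control) + frame[12:]


def read_vlan_id(frame: bytes):
    """Return the VID of a tagged frame, or None if the frame is untagged."""
    tag_type, tag_control = struct.unpack("!HH", frame[12:16])
    if tag_type != TAG_TYPE:
        return None
    return tag_control & 0x0FFF


# Untagged example frame: zeroed addresses, a type field, and a short payload.
untagged = bytes(12) + b"\x08\x00" + b"payload"
tagged = add_vlan_tag(untagged, vid=100)
```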
Multiple Spanning Tree Protocol
The IEEE has also promulgated a specification standard for a Spanning Tree Protocol that is specifically designed for use with networks that support VLANs. The Multiple Spanning Tree Protocol (MSTP), which is described in the IEEE Std. 802.1Q-2003, organizes a bridged network into regions. Within each region, MSTP establishes an Internal Spanning Tree (IST) which provides connectivity to all bridges within the respective region and to the ISTs established within other regions. The IST established within each MSTP Region also provides connectivity to the one Common Spanning Tree (CST) established outside of the MSTP regions by bridges running STP or RSTP. The IST of a given MST Region receives and sends BPDUs to the CST. Accordingly, all bridges of the bridged network are connected by a single Common and Internal Spanning Tree (CIST). From the point of view of the legacy or IEEE Std. 802.1Q-1998 bridges, moreover, each MST Region appears as a single virtual bridge on the CST.
Within each MST Region, the MSTP compatible bridges establish a plurality of active topologies, each of which is called a Multiple Spanning Tree Instance (MSTI). The MSTP bridges also assign or map each VLAN to one and only one of the MSTIs. Because VLANs may be assigned to different MSTIs, frames associated with different VLANs can take different paths through an MSTP Region. The bridges may but typically do not compute a separate topology for every single VLAN, thereby conserving processor and memory resources. Each MSTI is basically a simple RSTP instance that exists only inside the respective Region, and the MSTIs do not interact outside of the Region.
MSTP, like the other spanning tree protocols, uses BPDUs to establish the ISTs and MSTIs as well as to define the boundaries of the different MSTP Regions. The bridges do not send separate BPDUs for each MSTI. Instead, every MSTP BPDU carries the information needed to compute the active topology for all of the MSTIs defined within the respective Region. Each MSTI, moreover, has a corresponding Identifier (ID) and the MSTI IDs are encoded into the bridge IDs. That is, each bridge has a unique ID, as described above, and this ID is made up of a fixed portion and a settable portion. With MSTP, the settable portion of a bridge's ID is further organized to include a system ID extension. The system ID extension corresponds to the MSTI ID. The MSTP compatible bridges within a given Region will thus have a different bridge ID for each MSTI. For a given MSTI, the bridge having the lowest bridge ID for that instance is elected the root. Thus, an MSTP compatible bridge may be the root for one MSTI but not another within a given MSTP Region.
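The per-instance bridge IDs and root election described above can be sketched as follows. The bit widths (a four-bit priority and a twelve-bit system ID extension ahead of a 48-bit fixed address) are an assumption based on the commonly used bridge identifier layout; the bridge names, addresses, priorities and function names are hypothetical.

```python
DEFAULT_PRIORITY = 8  # assumed default for instances with no configured priority


def make_bridge_id(priority: int, msti_id: int, address: int) -> int:
    """Build a 64-bit bridge ID: settable priority, system ID extension, fixed address."""
    if not 0 <= priority <= 0xF:
        raise ValueError("priority must fit in 4 bits")
    if not 0 <= msti_id <= 0xFFF:
        raise ValueError("MSTI ID must fit in 12 bits")
    return (priority << 60) | (msti_id << 48) | address


def elect_root(bridges, msti_id):
    """Return the name of the bridge with the lowest bridge ID for one MSTI.

    bridges maps name -> (fixed address, {msti_id: configured priority}).
    """
    def bridge_id(name):
        address, priorities = bridges[name]
        return make_bridge_id(priorities.get(msti_id, DEFAULT_PRIORITY), msti_id, address)

    return min(bridges, key=bridge_id)


# Hypothetical region of two bridges, each given a low priority for one instance.
REGION = {
    "A": (0x0000000000AA, {1: 1}),
    "B": (0x0000000000BB, {2: 1}),
}
```

With these settings bridge A is the root for MSTI 1 and bridge B is the root for MSTI 2, which shows how one bridge can be root for one instance but not another.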
Each bridge running MSTP also has a single MST Configuration Identifier (ID) that consists of three attributes: an alphanumeric configuration name, a revision level and a VLAN mapping table that associates each of the potential 4096 VLANs to a corresponding MSTI. Each bridge, moreover, loads its MST Configuration ID into the BPDUs sourced by the bridge. Because bridges only need to know whether or not they are in the same MST Region, they do not propagate the actual VLAN to MSTI tables in their BPDUs. Instead, the MST BPDUs carry only a digest of the VLAN to MSTI table or mappings. The digest is generated by applying the well-known MD-5 algorithm to the VLAN to MSTI table. When a bridge receives an MST BPDU, it extracts the MST Configuration ID contained therein, including the digest, and compares it to its own MST Configuration ID to determine whether it is in the same MST Region as the bridge that sent the MST BPDU. If the two MST Configuration IDs are the same, then the two bridges are in the same MST Region. If, however, the two MST Configuration IDs have at least one non-matching attribute, i.e., either different configuration names, different revision levels and/or different computed digests, then the bridge that received the BPDU concludes that it is in a different MST Region than the bridge that sourced the BPDU. A port of an MST bridge, moreover, is considered to be at the boundary of an MST Region if the Designated Bridge is in a different MST Region or if the port receives legacy BPDUs.
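A minimal sketch of the region test described above is given below. It applies plain MD5 from the Python standard library to a table of 4096 two-octet entries, as the text describes; the exact digest procedure of the published standard may differ (for example, by keying the hash), so the digests produced here should not be expected to match those of real equipment. The names MstConfigId, table_digest, make_config_id and same_region are hypothetical, and unmapped VLANs are assumed to belong to instance 0.

```python
import hashlib
import struct
from typing import NamedTuple


class MstConfigId(NamedTuple):
    name: str
    revision: int
    digest: bytes


def table_digest(vlan_to_msti) -> bytes:
    """Digest of the VLAN to MSTI table: 4096 consecutive two-octet elements."""
    table = b"".join(
        struct.pack("!H", vlan_to_msti.get(vid, 0))  # unmapped VLANs -> instance 0
        for vid in range(4096)
    )
    return hashlib.md5(table).digest()


def make_config_id(name: str, revision: int, vlan_to_msti) -> MstConfigId:
    return MstConfigId(name, revision, table_digest(vlan_to_msti))


def same_region(local: MstConfigId, received: MstConfigId) -> bool:
    """Two bridges share a region only if name, revision and digest all match."""
    return local == received


local = make_config_id("campus", 1, {10: 1, 20: 2})
neighbor = make_config_id("campus", 1, {10: 1, 20: 1})
```

Here the two bridges share a name and revision level but map VLAN 20 to different instances, so their digests differ and same_region(local, neighbor) is False.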
FIG. 1 is a highly schematic block diagram of an MST BPDU 100. The MST BPDU 100 includes a header 102 compatible with the Media Access Control (MAC) layer of the respective LAN standard, e.g., Ethernet. The header 102 comprises a destination address (DA) field, a source address (SA) field, a Destination Service Access Point (DSAP) field, and a Source Service Access Point (SSAP) field, among others. The DA field carries a unique bridge multicast destination address assigned to the spanning tree protocol, and the DSAP and SSAP fields carry standardized identifiers assigned to the spanning tree protocol. Appended to header 102 is a BPDU message area that includes an “outer” part 104 and an “inner” part 106. The outer part 104 has the same format as an RSTP BPDU message and is recognized as a valid RSTP BPDU message by bridges that do not implement MSTP. The “inner” part 106 is utilized by bridges executing MSTP to establish the IST and the MSTIs. The inner part 106 has a set of spanning tree parameters for the IST and a set of parameters for each MSTI supported by the bridge sourcing the MSTP BPDU 100.
Outer part 104, also referred to as the CIST priority vector, has a plurality of fields, including a protocol identifier (ID) field 108, a protocol version ID field 110, a BPDU type field 112, a flags field 114, a CIST root ID field 116, an external path cost field 118, a CIST regional root ID field 120, a CIST port ID field 122, a message age field 124, a maximum (MAX) age field 126, a hello time field 128, and a forward delay field 130. The CIST root identifier field 116 contains the identifier of the bridge assumed to be the root of the Common and Internal Spanning Tree, which may be in the same MSTP Region as the bridge sourcing the BPDU message 100, in another MSTP Region or in part of the bridged network that is not running MSTP. The external path cost field 118 contains a value representing the lowest cost from the bridge sourcing the BPDU 100 to the CIST root identified in field 116 without passing through any other bridge in the same region as the bridge that is sourcing the BPDU message 100.
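For illustration, the twelve fields of outer part 104 could be unpacked as shown below. The field widths are an assumption based on the conventional spanning tree BPDU layout (a two-octet protocol ID, single-octet version, type and flags, eight-octet bridge identifiers, a four-octet path cost, a two-octet port ID and two-octet timers expressed in units of 1/256 second) and are not stated in the text above. The names OUTER_LAYOUT, OuterPart and parse_outer_part are hypothetical.

```python
import struct
from typing import NamedTuple

# Assumed widths, in network byte order, for fields 108 through 130.
OUTER_LAYOUT = struct.Struct("!HBBB8sI8sHHHHH")


class OuterPart(NamedTuple):
    protocol_id: int               # field 108
    protocol_version: int          # field 110
    bpdu_type: int                 # field 112
    flags: int                     # field 114
    cist_root_id: bytes            # field 116
    external_path_cost: int        # field 118
    cist_regional_root_id: bytes   # field 120
    cist_port_id: int              # field 122
    message_age: float             # field 124, seconds
    max_age: float                 # field 126, seconds
    hello_time: float              # field 128, seconds
    forward_delay: float           # field 130, seconds


def parse_outer_part(data: bytes) -> OuterPart:
    """Unpack the outer part from the start of a BPDU message area."""
    fields = OUTER_LAYOUT.unpack_from(data)
    timers = tuple(raw / 256 for raw in fields[8:])  # 1/256 s units -> seconds
    return OuterPart(*fields[:8], *timers)
```

Any octets that follow the outer part, such as inner part 106, are ignored by this function; a buffer shorter than the assumed layout raises struct.error.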
Inner part 106, also referred to as an MSTI priority vector, similarly has a plurality of fields, including a version 1 length field 132, a null field 134, a version 3 length field 136, an MST configuration ID field 138, a CIST regional root ID field 140, a CIST regional path cost field 142, a CIST bridge ID field 144, a CIST port ID field 146, a CIST flags field 148, and a CIST hops field 150. Inner part 106 may further include one or more optional MSTI configuration messages 152, each of which constitutes another MSTI priority vector or M-record.
Because version 2 of the RSTP does not specify any additional fields beyond those already specified by version 1, the MST BPDU does not have a version 2 length field.
As mentioned above, the MST configuration ID field 138 is made up of three subfields: a configuration name sub-field 154, a revision level sub-field 156 and an MD-5 checksum sub-field 158. The configuration name sub-field 154 carries a variable length text string encoded within a fixed size, e.g., 32-octets. The revision level sub-field 156 carries an integer encoded within a fixed field of two octets. The MD-5 checksum sub-field 158 carries a 16-octet signature created by applying the MD-5 algorithm to the bridge's VLAN to MSTI table, which contains 4096 consecutive two octet elements.
Each MSTI Configuration Message 152 consists of a plurality of fields including a MSTI regional root ID field 160, a MSTI regional path cost field 162, a MSTI bridge ID field 164, a MSTI port ID field 166, a MSTI flags field 168 and a MSTI hops field 170. MST bridges utilize the STP parameters contained in fields 140-150 of inner part 106 and in each MSTI configuration message 152 to compute an active topology for each MSTI configured in the respective region.
Large Scale Computer Networks
Multiple LANs and/or end stations may be interconnected by point-to-point links, microwave transceivers, satellite hook-ups, etc. to form wide area networks (WANs) or metropolitan area networks (MANs) that may span several city blocks, an entire city or an entire continent. A WAN or MAN typically interconnects multiple LANs and/or end stations located at individual campuses and/or buildings that are physically remote from each other, but that are still within the metropolitan area. Conventional WANs and MANs rely on network equipment employing Asynchronous Transfer Mode (ATM) running over the existing Public Switched Telephone Network's (PSTN's) Synchronous Optical Network (SONET). As most LANs utilize the Ethernet standard, network messages or packets created at one LAN must be converted from Ethernet format into ATM cells for transmission over the SONET links. The ATM cells must then be converted back into Ethernet format for delivery to the destination LAN or end station. The need to convert each network message from Ethernet to ATM and back again requires the WAN or MAN to include expensive networking equipment. The WAN or MAN Provider also has to lease or otherwise obtain access to the SONET links. As a result, WANs and MANs can be expensive to build and operate.
Accordingly, a need exists for a system and method for building and operating large-scale computer networks more efficiently.