1. Field of the Invention
The present invention relates generally to computer networks and, more specifically, to a protocol for defining multiple instances of loop-free paths within a computer network.
2. Background Information
Many organizations, including businesses, governments and educational institutions, utilize computer networks so that employees and others may share and exchange information and/or resources. A computer network typically comprises a plurality of entities interconnected by means of one or more communications media. An entity may consist of any device, such as a computer, that “sources” (i.e., transmits) or “sinks” (i.e., receives) data frames over the communications media. A common type of computer network is a local area network (“LAN”) which typically refers to a privately owned network within a single building or campus. LANs typically employ a data communication protocol (LAN standard), such as Ethernet, FDDI or token ring, that defines the functions performed by data link and physical layers of a communications architecture (i.e., a protocol stack). In many instances, several LANs may be interconnected by point-to-point links, microwave transceivers, satellite hook-ups, etc. to form a wide area network (“WAN”) or internet that may span an entire country or continent.
One or more intermediate devices is often used to couple LANs together and allow the corresponding entities to exchange information. For example, a switch may be utilized to provide a “switching” function for transferring information, such as data frames, among entities of a computer network. Typically, the switch is a computer and includes a plurality of ports that couple the switch to the other entities. The switching function includes receiving data at a source port from an entity and transferring that data to at least one destination port for receipt by another entity.
In addition, most computer networks include redundant communications paths so that a failure of any given link does not isolate any portion of the network. Such networks are typically referred to as meshed or partially meshed networks. The existence of redundant links, however, may cause the formation of circuitous paths or “loops” within the network. Loops are highly undesirable because data frames may traverse the loops indefinitely.
Furthermore, some devices, such as bridges or switches, replicate frames whose destination is not known resulting in a proliferation of data frames along loops. The resulting traffic effectively overwhelms the network. Other intermediate devices, such as routers, that operate at higher layers within the protocol stack, such as the Internetwork Layer of the Transmission Control Protocol/Internet Protocol (“TCP/IP”) reference model, deliver data frames and learn the addresses of entities on the network differently than most bridges or switches, such that routers are generally not susceptible to sustained looping problems.
Spanning Tree Algorithm
To avoid the formation of loops, most bridges and switches execute a spanning tree algorithm which allows them to calculate an active network topology that is loop-free (i.e., a tree) and yet connects every pair of LANs within the network (i.e., the tree is spanning). The Institute of Electrical and Electronics Engineers (IEEE) has promulgated a standard (the 802.1D standard) that defines a spanning tree protocol to be executed by 802.1D compatible devices. In general, by executing the IEEE spanning tree protocol, bridges elect a single bridge to be the “root” bridge. Since each bridge has a unique numerical identifier (bridge ID), the root is typically the bridge with the lowest bridge ID. In addition, for each LAN coupled to more than one bridge, only one (the “designated bridge”) is elected to forward frames to and from the respective LAN. The designated bridge is typically the one closest to the root. Each bridge also selects one port (its “root port”) which gives the lowest cost path to the root. The root ports and designated bridge ports are selected for inclusion in the active topology and are placed in a forwarding state so that data frames may be forwarded to and from these ports and thus onto the corresponding paths or links of the network. Ports not included within the active topology are placed in a blocking state. When a port is in the blocking state, data frames will not be forwarded to or received from the port. A network administrator may also exclude a port from the spanning tree by placing it in a disabled state.
To obtain the information necessary to run the spanning tree protocol, bridges exchange special messages called configuration bridge protocol data unit (BPDU) messages. FIG. 1 is a block diagram of a conventional BPDU message 100. The BPDU message 100 includes a header 102 compatible with the Media Access Control (MAC) layer of the respective LAN standard. The header 102 comprises a destination address (DA) field 104, a source address (SA) field 106, a Destination Service Access Point (DSAP) field 108, and a Source Service Access Point (SSAP) 110, among others. The DA field 104 carries a unique bridge multicast destination address assigned to the spanning tree protocol, and the DSAP and SSAP fields 108, 110 carry standardized identifiers assigned to the spanning tree protocol. Appended to header 102 is a BPDU message area 112 that also contains a number of fields, including a Topology Change Acknowledgement (TCA) flag 114, a Topology Change (TC) flag 116, a root identifier (ROOT ID) field 118, a root path cost field 120, a bridge identifier (BRIDGE ID) field 122, a port identifier (PORT ID) field 124, a message age (MSG AGE) field 126, a maximum age (MAX AGE) field 128, a hello time field 130, and a forward delay (FWD DELAY) field 132, among others. The root identifier field 118 typically contains the identifier of the bridge assumed to be the root and the bridge identifier field 122 contains the identifier of the bridge sourcing (i.e., sending) the BPDU. The root path cost field 120 contains a value representing the cost to reach the assumed root from the port on which the BPDU is sent and the port identifier field 122 contains the port number of the port on which the BPDU is sent.
Upon start-up, each bridge initially assumes itself to the be the root and transmits BPDU messages accordingly. Upon receipt of a BPDU message from a neighboring device, its contents are examined and compared with similar information (e.g., assumed root and lowest root path cost) stored by the receiving bridge in non-recoverable memory. If the information from the received BPDU is “better” than the stored information, the bridge adopts the better information and uses it in the BPDUs that it sends (adding the cost associated with the receiving port to the root path cost) from its ports, other than the port on which the “better” information was received. Although BPDU messages are not forwarded by bridges, the identifier of the root is eventually propagated to and adopted by all bridges as described above, allowing them to select their root port and any designated port(s).
In order to adapt the active topology to failures, the root periodically (e.g., every hello time) transmits BPDU messages. The hello time utilized by the root is also carried in the hello time field 128 of its BPDU messages. The default hello time is 2 seconds. In response to receiving BPDUs on their root ports, bridges transmit their own BPDUs from their designated ports, if any. Thus, every two seconds BPDUs are propagated throughout the bridged network, confirming the active topology. As shown in FIG. 1, BPDU messages stored by the bridges also include a message age field 124 which corresponds to the time since the root instigated this generation of BPDU information. That is, BPDU messages from the root have their message age field 124 set to “0”. Thus, every hello time, BPDU messages with a message age of “0” are propagated to and stored by the bridges.
After storing these BPDU messages, bridges proceed to increment the message age value. When the next BPDU message is received, the bridge examines the contents of the message age field 124 to determine whether it is smaller than the message age of its stored BPDU message. Assuming the received BPDU message originated from the root and thus has a message age of “0”, the received BPDU message is considered to be “better” than the stored BPDU information (whose message age has presumably been incremented to “2” seconds) and, in response, the bridge proceeds to re-calculate the root, root path cost and root port based upon the received BPDU information. The bridge also stores this received BPDU message and proceeds to increment its message age field 124. If the message age of a stored BPDU message reaches a maximum age value, the corresponding BPDU information is considered to be stale and is discarded by the bridge.
Normally, each bridge replaces its stored BPDU information every hello time, thereby preventing it from being discarded and maintaining the current active topology. If a bridge stops receiving BPDU messages on a given port (indicating a possible link or device failure), it will continue to increment the respective message age value until it reaches the maximum age threshold. The bridge will then discard the stored BPDU information and proceed to re-calculate the root, root path cost and root port by transmitting BPDU messages utilizing the next best information it has. The maximum age value used within the bridged network is typically set by the root, which enters the appropriate value in the maximum age field 126 of its transmitted BPDU messages 100. Neighboring bridges similarly load this value in their BPDU messages, thereby propagating the selected value throughout the network. The default maximum age value under the IEEE standard is twenty seconds.
As BPDU information is up-dated and/or timed-out and the active topology is recalculated, ports may transition from the blocking state to the forwarding state and vice versa. That is, as a result of new BPDU information, a previously blocked port may learn that it should be in the forwarding state (e.g., it is now the root port or a designated port). Rather than transition directly from the blocking state to the forwarding state, ports transition through at least two intermediate states: a listening state and a learning state. In the listening state, a port waits for information indicating that it should return to the blocking state. If, by the end of a preset time, no such information is received, the port transitions to the learning state. In the learning state, a port still blocks the receiving and forwarding of frames, but received frames are examined and the corresponding location information is stored in the filtering database, as described above. At the end of a second preset time, the port transitions from the learning state to the forwarding state, thereby allowing frames to be forwarded to and from the port. The time spent in each of the listening and the learning states is referred to as the forwarding delay and is entered by the root in field 126.
As ports transition between the blocked and forwarding states, entities may appear to move from one port to another. To prevent bridges from distributing messages based upon incorrect address information, bridges quickly age-out and discard the “old” information in their filtering databases. More specifically, upon detection of a change in the active topology, a bridge begins transmitting Topology Change Notification Protocol Data Unit (TCN-PDU) messages on its root port. The format of the TCN-PDU message is well known (see IEEE 802.1D standard) and, thus, will not be described herein. A bridge receiving a TCN-PDU message sends a TCN-PDU of its own from its root port and sets the TCA flag 112 in BPDUs that it sends on the port from which the TCN-PDU was received, thereby acknowledging receipt of the TCN-PDU. By having each bridge send TCN-PDUs from its root port, the TCN-PDU is effectively propagated hop-by-hop from the original bridge up to the root. The root confirms receipt of the TCN-PDU by setting the TC flag 114 in the BPDUs that it subsequently transmits for a period of time. Other bridges, receiving these BPDUs, note that the TC flag 114 has been set, thereby alerting them to the change in the active topology. In response, bridges significantly reduce the aging time associated with their filtering databases which, as described above, contain destination information corresponding to the entities within the network. Specifically, bridges replace the default aging time of five minutes with the forwarding delay time, which by default is fifteen seconds. Information contained in the filtering databases is thus quickly discarded.
Virtual Local Area Networks
A computer network may also be segmented into a series of logical networks. U.S. Pat. No. 5,394,402, issued Feb. 28, 1995 to Ross (the “'402 Patent”), for example, discloses an arrangement for associating any port of a switch with any particular network segment. Specifically, according to the '402 Patent, any number of physical ports of a particular switch may be associated with any number of groups within the switch by using a virtual local area network (VLAN) arrangement that virtually associates the port with a particular VLAN designation. More specifically, the '402 Patent discloses a switch or hub that associates VLAN designations with its ports and further associates those VLAN designations with messages transmitted from any of the ports to which the VLAN designation has been assigned.
The VLAN designation for each port is stored in a memory portion of the switch such that every time a message is received on a given access port the VLAN designation for that port is associated with the message. Association is accomplished by a flow processing element which looks up the VLAN designation in the memory portion based on the particular access port at which the message was received. In many cases, it may be desirable to interconnect a plurality of these switches in order to extend the VLAN associations of ports in the network. The '402 Patent, in fact, states that an objective of its VLAN arrangement is to allow all ports and entities of the network having the same VLAN designation to exchange messages by associating a VLAN designation with each message. Thus, those entities having the same VLAN designation function as if they are all part of the same LAN. Message exchanges between parts of the network having different VLAN designations are specifically prevented in order to preserve the boundaries of each VLAN segment or domain. For convenience, each VLAN designation is often associated with a different color, such as red, blue, green, etc.
In addition to the '402 Patent, the Institute of Electrical and Electronics Engineers (IEEE) has promulgated the 802.1Q standard for Virtual Bridged Local Area Networks. The 802.1Q standard, among other things, defines a specific VLAN-tagged message format for use in VLAN-aware networks.
Having defined a segregated computer network, several “solutions” have been proposed for overlaying spanning trees on these virtually segregated network groups. The IEEE 802.1Q standards committee, for example, has proposed defining a single spanning tree for all VLAN designations in the computer network. That is, the switches exchange conventional BPDUs in the accustomed manner so as to define a single forwarding topology irrespective of the various VLAN designations that have been defined for the network. Thus, either all VLAN tagged frames may be forwarded and received through a given port or no tagged frames may be forwarded or received through the port. Since bridges and switches are typically pre-configured to exchange and process conventional BPDUs, this is a simple solution to implement. It also conserves processing resources by limiting the computations that must be performed to determine the bridged network's active topology.
Nonetheless, the IEEE solution has several drawbacks. For example, by defining a single spanning tree for a network having numerous VLAN designations, the IEEE solution does not allow for load balancing. That is, all data communication within the network follows the single forwarding topology defined by the one spanning tree. This may significantly degrade performance over certain, heavily utilized, portions of the network, limiting message throughput.
An alternative to the 802.1Q single spanning tree approach is to define a separate spanning tree for each VLAN designation within the network. This alternative is currently being implemented within networking equipment from Cisco Systems, Inc. of San Jose, Calif. See Cisco IOS VLAN Services document. With this approach, BPDUs are tagged with each of the VLAN designations defined within the network and exchanged among the switches. That is, as shown in FIG. 1, the BPDU header 102 is modified by adding a VLAN identifier field 134. Upon receipt, these tagged BPDUs are then processed by the switches so as to define a separate active topology or spanning tree for each VLAN designation within the bridged network. Thus, for a given port, messages associated with one VLAN designation may be forwarded and received while messages associated with a second VLAN designation may be blocked.
By defining a separate forwarding topology for each VLAN designation which spans all entities associated with that designation, this solution supports load balancing throughout the network. It also avoids possible lost connectivity problems with portions of the network that may occur with the IEEE solution. There are, nonetheless, other drawbacks. First, this approach may not scale well to large networks. That is, as the number of VLAN designations increases, the number of tagged BPDUs being exchanged correspondingly increases. Accordingly, more communications bandwidth is consumed with BPDU traffic. Each BPDU, moreover, must be processed by the switches so as to calculate the corresponding spanning trees. Depending on the number of VLAN designations within the network, this may severely tax the processing and memory resources of the switches, degrading network efficiency.