A switch is a packet-forwarding device, such as a bridge (layer 2 switch) or a router (layer 3 switch), that determines the destination of individual data packets (such as Ethernet frames) and selectively forwards them across a network according to the best route for their destination. The best route is associated with one of a number of ports on the packet-forwarding device, which are the device's external interface to the network. The port is a mission-critical part of a packet-forwarding device because the port is often an uplink, collapsing thousands of users in a local area network (LAN) onto a backbone such as the Internet. The port, therefore, becomes a lifeline to all of the LAN's users connected to that port. But the port is also, by nature, a physical link made up of cables, connectors, fibre, copper, and so on: things that can fail, be cut, and get dirty, not to mention electronics that can break down. Hence, there are many problems that can cause physical links, and thus ports, to fail.
In the past, each port was treated as a separate entity, and there was no concept of a standby port that watched its associated active port and took over in case of failure. Nowadays, ports are often backed up, and the challenge is to back up many ports on a packet-forwarding device in a way that is reliable, efficient, low cost, and, most importantly, very fast.
Prior art methods of backing up ports include providing a separate backup packet-forwarding device, referred to as a standby router. The use of standby routers in an Internet Protocol (IP) network is known in the art. The Internet Engineering Task Force (IETF) has published a draft standard protocol for using standby routers, also referred to as redundant routers, entitled Virtual Router Redundancy Protocol (VRRP), version 2-05, on Jan. 5, 2000. Numerous proprietary protocols also exist, including the Extreme Standby Router Protocol (ESRP), which is part of the switch operating software sold under the trademark “ExtremeWare” by Extreme Networks, Inc., of Santa Clara, Calif., the assignee of the present application.
One of the drawbacks to using redundant routers is that the standby router does not usually take over until packets have already been dropped due to inoperable links. Also, the takeover process itself is relatively slow, during which time many more packets may be dropped. Using a separate packet-forwarding device as a standby router is also very expensive, costing upwards of $100,000 each. Moreover, introducing another device into the network only adds more equipment that can fail, e.g., additional chassis, power supplies, etc. Thus, the more equipment, the greater the chance of failure, and the more expensive it is to operate the network.
A better alternative is to supply port-level redundancy within a single device, which is cheaper, simpler, and easier to manage and deploy. When deploying port-level redundancy within a single device, the ports are backed up at the port level instead of at the switch level, thereby providing the desired level of network resiliency and availability without the complexity of adding another switch or router to the network. The user needs only to install a second network interface card (NIC) in their personal computer or workstation and connect it to the same switch, which, in turn, can be connected to other switches. Should the primary data path fail, the redundant data path is available to take over in a very short period of time (typically less than a second), allowing the user to maintain their connection to the backbone.
An example of a prior art technology to provide data path redundancy in a single device is the hardware-based redundancy built into ports at the physical layer level using high speed Ethernet Media Access Control (MAC) integrated circuit (IC) technology, referred to as a redundant PHY. The MAC is the component of a LAN switch that controls communication over an Ethernet link and is used to build high speed LAN switches based on Ethernet, Fast Ethernet and Gigabit Ethernet. A MAC chip having both a primary PHY and a redundant PHY is incorporated into switches sold under the trademark “Summit 48” by Extreme Networks, Inc., of Santa Clara, Calif., the assignee of the present application. For example, the Summit 48 has 48 Fast Ethernet ports and 2 Gigabit Ethernet ports that can be equipped with redundant PHYs.
FIGS. 1A and 1B are simplified block diagrams that illustrate certain aspects of a prior art port using a redundant PHY. A packet-forwarding device 100 connects a local area network LAN 102 serving virtual LANs VLANA 106 and VLANB 108 to network 104. The packet-forwarding device 100 comprises several ports, including the illustrated port 5 110 equipped with a MAC chip 111 having redundant PHY capability to connect the port 5 110 to LAN 102 via a primary link 122 and a redundant link 124. The packet-forwarding device 100 further comprises a switch fabric 112 having a packet forwarder 114, a routing table 116, a bridging table 118, a port description table 119, and other components for carrying out packet-forwarding operations. During normal operation, port 5 110 uses the primary link 122 as the preferred data path, and the redundant PHY is used to switch the physical layer of the MAC chip 111 to use the redundant link 124 only when the primary link 122 fails. Although seemingly straightforward, as a practical matter switching to the redundant link 124 is quite difficult for a number of reasons.
Unlike the prior art standby routers, which are maintained in a hot standby state with their ports connected via active redundant links, the prior art ports equipped with redundant PHYs cannot maintain active redundant links at the same time as active primary links. This is because those routing protocols that are based on a MAC's link state, such as the Open Shortest Path First (OSPF) protocol for IP unicast routing, collectively referred to herein as link state routing protocols (LSRPs), make decisions about the data path over which traffic is forwarded based on the links' states. If the redundant links were active at the same time as the primary links, the LSRPs would get confused about which data path to use for a given port. Therefore, during normal operation, the prior art hardware redundant PHY forces the redundant link 124 down until it is needed, i.e., until the primary link 122 fails or is otherwise determined not to be the best physical data path. But forcing the redundant link 124 down introduces some uncertainty, as there are many steps involved in activating a physical link. Therefore, to ensure the reliability of activating the redundant link 124 when needed, in addition to being equipped with a redundant PHY, the prior art port 5 110 operates in conjunction with a link monitor 120 that uses an Institute of Electrical and Electronics Engineers (IEEE) Ethernet-based auto-negotiation protocol to help monitor the status of the active primary link 122 and inactive redundant link 124 and to set the link state to disabled when it is determined that a link should be forced down.
The auto-negotiation protocol is part of the IEEE standard 802.3 protocol, which was modified in 1995 to include auto-negotiation as part of the adoption of the IEEE 802.3u 100 Mbps Fast Ethernet standard. The auto-negotiation protocol enables devices to negotiate the speed and mode (full-duplex or half-duplex) when activating an Ethernet link. In the illustrated redundant PHY used in port 5 110, the link monitor 120 uses the auto-negotiation protocol to obtain information about the status of the primary and redundant links 122/124. The link monitor 120 uses the status information in implementing an algorithm to determine whether to deactivate the primary link 122 and fully or partially activate the redundant link 124, and vice versa. For example, if there are five steps to activating a link, then when the auto-negotiation status of the primary link 122 indicates that it is beginning to fail, the link monitor 120 causes the redundant PHY in MAC chip 111 to partially activate the redundant link 124 up to the fourth step. Maintaining the redundant link 124 at the fourth step does not interfere with the LSRP and increases the likelihood that the redundant link 124 can be reliably activated to the fifth and final step when needed, i.e., if and when the primary link 122 ultimately fails.
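The partial-activation behavior described above can be illustrated with a minimal Python sketch. It is offered only as an illustration under stated assumptions, not as the actual algorithm of the link monitor 120: the class names, the three status strings ("ok", "degrading", "failed"), and the representation of activation as an integer step count are all hypothetical, and the five-step figure is taken from the example in the text.

```python
from dataclasses import dataclass

# From the text's example: "five steps to activating a link".
ACTIVATION_STEPS = 5
# The redundant link is pre-staged up to the fourth step.
STANDBY_STEP = ACTIVATION_STEPS - 1


@dataclass
class Link:
    """Hypothetical model of a physical link; step 0 means forced down."""
    name: str
    step: int = 0

    @property
    def active(self) -> bool:
        # Only a link that completed the final step counts as "up"
        # from the point of view of a link state routing protocol.
        return self.step == ACTIVATION_STEPS


class LinkMonitor:
    """Hypothetical stand-in for the link monitor 120."""

    def __init__(self, primary: Link, redundant: Link) -> None:
        self.primary = primary
        self.redundant = redundant

    def on_status(self, primary_status: str) -> None:
        # primary_status is an assumed summary of auto-negotiation status.
        if primary_status == "ok":
            # Healthy primary: keep the redundant link forced down.
            self.primary.step = ACTIVATION_STEPS
            self.redundant.step = 0
        elif primary_status == "degrading":
            # Primary beginning to fail: pre-stage the redundant link,
            # stopping short of the final step so it is not yet "up".
            self.redundant.step = max(self.redundant.step, STANDBY_STEP)
        elif primary_status == "failed":
            # Primary failed: force it down and complete the final step.
            self.primary.step = 0
            self.redundant.step = ACTIVATION_STEPS
        else:
            raise ValueError(f"unknown status: {primary_status!r}")


primary = Link("primary link 122", step=ACTIVATION_STEPS)
redundant = Link("redundant link 124")
monitor = LinkMonitor(primary, redundant)

monitor.on_status("degrading")   # redundant held at step 4, still inactive
monitor.on_status("failed")      # redundant completes step 5 and takes over
```

In this sketch the two links are never fully active at the same time, which mirrors the requirement that the redundant link not interfere with the LSRP while the primary link is still up.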
As illustrated in FIG. 1B, when the redundant link 124 is active, the primary link 122 is inactive, but the routing table 116 in the switch fabric 112 reflects the same port number designation of port 5 for the destination hosts in VLANA 106 and VLANB 108. Thus, even though the physical data path has changed, the port number is the same (i.e., port 5) because the redundant PHY is located in the same port. The routing table 116 therefore remains unchanged, and the packet forwarder 114 in the switch fabric 112 operates as before. As a result, the operation of the redundant PHY capability in port 5 110 is transparent to the switch fabric 112.
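The transparency of the redundant PHY to the switch fabric can likewise be sketched in a few lines of Python. This is a simplified illustration only: the class `RedundantPhyPort`, the function `forward`, and the dictionary used as a routing table are hypothetical names, and a real routing table 116 holds far more than a destination-to-port mapping.

```python
class RedundantPhyPort:
    """Hypothetical model of a port whose MAC has a primary and a redundant PHY."""

    def __init__(self, number: int, primary_link: str, redundant_link: str) -> None:
        self.number = number
        self.links = {"primary": primary_link, "redundant": redundant_link}
        self.selected = "primary"   # normal operation uses the primary link

    def fail_over(self) -> None:
        # The switch happens inside the port, at the physical layer.
        self.selected = "redundant"

    def transmit(self, frame: str) -> str:
        return f"{frame} via {self.links[self.selected]}"


# Assumed simplification of the routing table: destination -> port number.
routing_table = {"VLANA": 5, "VLANB": 5}
ports = {5: RedundantPhyPort(5, "link 122", "link 124")}


def forward(destination: str, frame: str) -> str:
    """Look up the egress port by number; the forwarder never sees the PHY."""
    return ports[routing_table[destination]].transmit(frame)


before = forward("VLANA", "frame")   # uses link 122
ports[5].fail_over()
after = forward("VLANA", "frame")    # uses link 124, same table entry
```

The forwarding lookup resolves to port 5 both before and after the changeover; only the link chosen inside the port differs, so nothing in the table needs to be rewritten.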
One of the drawbacks to using the above-described hardware-based redundant PHY in a port is that redundancy is only provided for those ports equipped with the specialized redundant PHY hardware and for devices having the associated link monitor and port switching capability. Upgrading existing equipment can be expensive and impractical, especially in large networks employing equipment from multiple vendors. Moreover, the redundant PHY does not provide “true” port-level redundancy for the port. For example, when the primary link 122 fails because of a problem with an upstream switch, changing over to the redundant PHY does not solve the problem, since the port is still connected to the bad upstream switch.