Field of Invention
The present invention relates generally to data communication networks and devices, and relates more particularly to routing in virtual link trunking (VLT) systems.
Description of the Related Art
As the value and use of information continues to increase, individuals and businesses seek additional ways to process and store information. One option available to users is information handling systems. An information handling system generally processes, compiles, stores, and/or communicates information or data for business, personal, or other purposes thereby allowing users to take advantage of the value of the information. Because technology and information handling needs and requirements vary between different users or applications, information handling systems may also vary regarding what information is handled, how the information is handled, how much information is processed, stored, or communicated, and how quickly and efficiently the information may be processed, stored, or communicated. The variations in information handling systems allow for information handling systems to be general or configured for a specific user or specific use such as financial transaction processing, airline reservations, enterprise data storage, or global communications. In addition, information handling systems may include a variety of hardware and software components that may be configured to process, store, and communicate information and may include one or more computer systems, data storage systems, and networking systems.
As information handling systems provide increasingly central and critical operations in modern society, it is important that the networks are reliable. One method used to improve reliability is to provide redundant links between network devices. By employing redundant links, network traffic between two network devices that would normally be interrupted can be re-routed to the back-up link in the event that the primary link fails.
Although having redundant links is helpful for failover situations, it creates network loops, which can be fatal to networks. To remove the loops, a protocol named Spanning Tree Protocol (STP) is often employed. STP is a Layer-2 protocol that runs on network devices, such as bridges and switches, to ensure that loops are not created when there are redundant paths in the network. The result of the STP is that some links are inactive unless a primary link fails. Thus, networks using redundant links with STP have links that are underutilized.
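By way of illustration only, the effect of a spanning tree on redundant links may be sketched as follows. The sketch is a simplification and not the STP algorithm itself: it uses a breadth-first traversal from an assumed root device in place of bridge identifiers, path costs, and protocol messages, and the device names and the function name spanning_tree_links are hypothetical.

```python
from collections import deque


def spanning_tree_links(links, root):
    """Split links into (active, blocked) so the active set is loop-free.

    Simplified stand-in for STP: a breadth-first traversal from `root`.
    Real STP elects the root and selects ports using bridge IDs and path costs.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    visited = {root}
    active = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for peer in sorted(neighbors[node]):
            if peer not in visited:
                visited.add(peer)
                active.add(frozenset((node, peer)))
                queue.append(peer)

    all_links = {frozenset(link) for link in links}
    return active, all_links - active


# Hypothetical topology: two core devices, two access devices, redundant uplinks.
links = [
    ("core1", "core2"),
    ("core1", "acc1"),
    ("core1", "acc2"),
    ("core2", "acc1"),
    ("core2", "acc2"),
]
active, blocked = spanning_tree_links(links, root="core1")
```

In this assumed topology, three of the five links remain active and the two redundant uplinks to the second core device are blocked, which illustrates the underutilization noted above.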
FIG. 1 depicts an example of a networking system 100 that employs Spanning Tree Protocol. Depicted in FIG. 1 is a set of networking devices 105A-105D that are connected to other network devices 110A and 110B (which may be access switches), which are in turn connected to other network devices 115A and 115B (which may be core switches or routers). The network devices are connected with redundant links. Due to STP, some of the links are active 120 and some of the links are placed into an inactive state 125 to avoid network loops. Because many of the links are placed into an inactive state by the STP, the network capacity is underutilized. To address the limitations of STP, a protocol called the multiple spanning tree protocol (MSTP) was developed by IEEE 802.1 [IEEE 802.1s]. While this protocol allows for more links to be used for forwarding, it still suffers from the limitation of having a loop-free active topology for any given VLAN.
However, ever increasing demands for data have required communication networks to provide more throughput. Not only must networks be reliable, but they must also provide adequate bandwidth. Thus, a key area in which communication networks strive to improve is in increasing capacity (data throughput or bandwidth).
One way to increase capacity through recapturing unused network capacity involves the use of link aggregation. Link aggregation refers to various methods of aggregating network connections to increase data throughput while still supporting fault tolerance in case of failures. Generally, link aggregation involves grouping two or more physical data network links between two network devices into one logical link in which the two or more physical network links may be treated as a single logical link. By using certain link aggregation implementations, the need for STP can be eliminated by increasing the intelligence of network forwarding devices, providing a non-blocking high performance network.
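By way of illustration only, one common way of treating several physical links as a single logical link is to hash each flow onto one of the currently active member links. The following is a minimal sketch under stated assumptions: the function name pick_member_link, the flow fields, and the use of a CRC-32 hash are hypothetical choices for the example and are not drawn from any particular link aggregation implementation.

```python
import zlib


def pick_member_link(flow, member_links):
    """Choose one active member link of a LAG for a given flow.

    `flow` is any tuple of header fields (assumed here: source address,
    destination address, destination port). Hashing the flow keeps packets of
    one flow on one link while spreading different flows across the group.
    """
    if not member_links:
        raise ValueError("LAG has no active member links")
    key = "|".join(str(field) for field in flow).encode()
    return member_links[zlib.crc32(key) % len(member_links)]


# Hypothetical two-link group; the link names are placeholders.
lag = ["link-1", "link-2"]
flow = ("10.0.0.1", "10.0.0.2", 443)
chosen = pick_member_link(flow, lag)

# If a member link fails, the same flow is mapped onto the surviving links.
after_failure = pick_member_link(flow, ["link-2"])
```

Both links carry traffic during normal operation, and fault tolerance is retained because flows are simply re-mapped onto the remaining members when a link is removed from the group.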
Initial implementations of link aggregation required that the aggregated links terminate on a single switch. However, additional implementations were developed that allowed the links to terminate on two switches. Examples of mechanisms used to support LAG networking across more than one device are multi-chassis link aggregation (“MLAG”) and distributed resilient network interconnect (DRNI) [IEEE P802.1AX-REV].
MLAG is a LAG implementation in which a LAG terminates on two separate chassis or devices. An MLAG is configured such that one or more links comprising one LAG terminate at ports on a first device and one or more links comprising the same LAG terminate on a second device. The first and second devices are configured so that they appear to the surrounding network to be one logical device. At least one standard for link aggregation has been promulgated by the Institute of Electrical and Electronics Engineers, which is contained in the IEEE 802.1AX-2008 standard, which is incorporated by reference herein. However, a number of different vendors have implemented their own versions. For example, Cisco markets EtherChannel and Port Aggregation Protocol (along with its related Virtual Switching System (VSS), virtual PortChannel (vPC), Multichassis EtherChannel (MEC), and Multichassis Link Aggregation (MLAG)). Avaya markets Multi-Link Trunking (MLT), Split Multi-Link Trunking (SMLT), Routed Split Multi-Link Trunking (RSMLT), and Distributed Split Multi-Link Trunking (DSMLT). ZTE markets “Smartgroup” and Huawei markets “EtherTrunks”. Other vendors provide similar offerings. A standard for this technology is under development in the IEEE 802.1 standards committee; the project is called distributed resilient network interconnect (DRNI).
FIG. 2 depicts an example implementation of a networking system, which is similar to the system in FIG. 1 but which employs link aggregation. Depicted in FIG. 2 is a set of networking devices 205A-205D that are connected to other network devices 210A and 210B (which may be access switches). In the depicted example, the network devices 205A-205D are connected such that each device 205x has a link aggregation group (LAG) to the switches 210A and 210B. For example, network device 205A has two port connections 220A and 220B that together form link aggregation group 220, as shown in the physical view 200A of FIG. 2. To the network devices 205x having such a link aggregation configuration to the switches, the two switches 210A and 210B may be configured to appear as a single logical switch, as shown in the logical view 200B of FIG. 2.
As noted above, the two switches may optionally be configured to appear as a single logical switch. Multi-chassis link aggregation implementations provide special links (e.g., links 205 between switch 210A and switch 210B) that can be used to connect two separate switches together to form an aggregation switch that in some ways acts like a single larger chassis. With two chassis aggregated in this manner, when a packet arrives at one of the switches that must egress on the other switch, the first switch forwards the packet to a port associated with the special link interconnect where it is transmitted to the other device for transmission over the network.
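By way of illustration only, the forwarding behavior just described may be sketched as a simple decision on whether the egress port is local to the switch at which the packet arrived. The function name forward and the port names are hypothetical; the switch labels are borrowed from FIG. 2 solely for readability.

```python
def forward(arrival_switch, egress_switch, egress_port, interconnect_port):
    """Return the (switch, port) steps a packet takes to leave an aggregated pair.

    If the egress port is on the arrival switch, the packet leaves directly.
    Otherwise it first crosses the special interconnect link to the peer switch.
    """
    if arrival_switch == egress_switch:
        return [(arrival_switch, egress_port)]
    return [(arrival_switch, interconnect_port), (egress_switch, egress_port)]


# Packet arrives at 210A and must egress on a port of 210B (port names assumed).
local_path = forward("210A", "210A", "port-1", "interconnect")
peer_path = forward("210A", "210B", "port-7", "interconnect")
```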
It must be noted, however, that the current various implementations of link aggregation have serious limitations. First, the current implementations support only two-switch configurations connected in a point-to-point fashion. Extending beyond two switches significantly adds complexity in connections, configuration, and operation. For example, it is relatively simple to synchronize data between two devices, but it becomes significantly more complex to synchronize between multiple devices.
Second, at any point in time, within a given aggregation switch only one switch typically operates in a primary switch role, while the remaining switch operates in a secondary role. In the primary role, the primary switch assumes control over at least some of the aggregation switch functionality. Among other things, this can involve the primary switch being responsible for running some Layer-2 network protocols (such as Spanning Tree Protocol (STP)) that assist in the operation of the switch in the network environment. The network information learned by the primary switch can be distributed as needed to the secondary switch in order to synchronize at least some of the states between the primary switch and secondary switch. While running in such a primary-secondary configuration is easy to manage, it does not efficiently utilize network resources.
Third, limiting the number of switches that form the logical switch group does not provide a readily scalable solution. Clients desiring to add infrastructure incrementally need to add pairs of devices rather than simply being able to add any number of switches. Also, clients wanting to extend their current link aggregation system cannot do so because each new switch or pair of switches forms a new domain rather than simply extending an existing domain. Thus, increasing the system involves adding separate link aggregation switch groups that must be separately managed, configured, and operated, needlessly adding complexity and administrative overhead.
Fourth, when pairing switches, vendors generally require that the devices be the same. Having mirrored devices makes it easier for vendors because it limits possible combinations; a vendor therefore does not have to make sure different products interoperate. Also, having homogeneous devices tends to force symmetry in the configuration, which is also simpler for vendors to develop and support. However, requiring like switches is rarely the best for clients. As data centers and networks grow, a client would prefer to purchase a single new model device rather than being forced to choose between buying an older model to pair with its current older model or buying two new models and shelving its current older, but still operational, model. Thus, current multi-chassis link systems inhibit cost effective equipment migration plans.
Another issue with typical multi-chassis link aggregation approaches is that routing takes place at the nodes in the spine layer. Consider the system 200A in FIG. 2. The nodes 210A and 210B act as the spine and the devices 205A-205D are the leaves of the system. Typically, the leaf nodes (e.g., 205A-205D) are not managed as part of the virtual link trunking system. And, typically to minimize configuration and allow VLANs to be present on any of the leaves, the routing is done at the spine level. For inter-VLAN traffic on a single leaf node (e.g., device 205A), the data traffic must traverse to the spine level (e.g., nodes 210A and 210B) to be routed—even though it is returned to the same leaf node. This processing creates inefficiency.
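By way of illustration only, the inefficiency described above may be expressed as a count of leaf-to-spine link traversals for a single inter-VLAN packet. The function name fabric_link_traversals is hypothetical, and the "leaf" routing case is an assumption included purely for comparison; it is not a description of any system discussed above.

```python
def fabric_link_traversals(src_leaf, dst_leaf, routing_layer):
    """Count leaf-to-spine link traversals for one inter-VLAN packet."""
    if routing_layer == "spine":
        # The packet goes up to a spine node to be routed, then back down,
        # even when the source and destination are on the same leaf.
        return 2
    if routing_layer == "leaf":
        # Assumed alternative for comparison: traffic between VLANs on the
        # same leaf never leaves that leaf.
        return 0 if src_leaf == dst_leaf else 2
    raise ValueError(f"unknown routing layer: {routing_layer!r}")


# Inter-VLAN traffic between two hosts that both attach to leaf 205A.
same_leaf_via_spine = fabric_link_traversals("205A", "205A", "spine")
same_leaf_at_leaf = fabric_link_traversals("205A", "205A", "leaf")
```

With routing at the spine level, the same-leaf case consumes two link traversals that carry the packet away from, and then back to, the node on which it originated.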
Accordingly, what is needed are systems and methods that can address the deficiencies and limitations of the current multi-chassis link aggregation approaches.