Link Aggregation is widely used to aggregate multiple links between a pair of nodes so that user data can be transmitted on each of the links participating in a Link Aggregation Group (LAG) (see, e.g., IEEE 802.1AX). Aggregating multiple network connections in this fashion can increase throughput beyond what a single connection can sustain, and/or can be used to provide redundancy in case of a failure of one of the links. The “Distributed Resilient Network Interconnect” (DRNI) (see Clause 8 of IEEE 802.1AX-REV/D0.2) specifies extensions to Link Aggregation so that link aggregation can be used on a network interface even between more than two nodes, for example between four nodes A, B, C and D as illustrated in FIG. 1. In FIG. 1 and many of the subsequent figures of this application, the label “DRNI” is used to indicate a LAG that includes the four nodes A, B, C and D.
As shown in FIG. 1, a LAG is formed between Network 1 and Network 2. More specifically, a LAG is formed between LAG virtual nodes 30, 32. First LAG virtual node 30 includes a first node (A) and a second node (B). Second LAG virtual node 32 includes a third node (C) and a fourth node (D). LAG Nodes A and C are connected as peer nodes, and LAG Nodes B and D are also connected as peer nodes. Within virtual node 30, nodes A and B are connected as “fellow nodes,” and similarly within virtual node 32 nodes C and D are connected as “fellow nodes.” As used in this application, a “LAG virtual node” refers to a DRNI portal in the IEEE documentation discussed above (i.e., two nodes that appear as a single node to their respective peers). Additionally, the statement that virtual node 30 “includes” two nodes A, B means that the virtual node 30 is emulated by the nodes A, B. Similarly, the statement that virtual node 32 “includes” two nodes C, D means that the virtual node 32 is emulated by the nodes C, D.
Multiple nodes participating in the LAG appear to be the same virtual node with a single System ID to their peering partner in the LAG. The System ID is used to identify each node (e.g., node A, node B, node C, node D). The System ID is included in Link Aggregation Control Protocol Data Units (LACPDUs) sent between the individual nodes of the LAG. It is practical to use the System ID of one of the fellow nodes as a common System ID for their corresponding LAG virtual node. Thus, as shown in FIG. 1, node A and node B belong to the same Network 1 and they are part of the same DRNI Portal (i.e., the same LAG virtual node 30), and use a common System ID of “A” for the emulated LAG virtual node 30. Similarly, Nodes C and D of Network 2 are seen as a single LAG virtual node 32 with a System ID “C” by Nodes A and B.
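The common System ID described above can be illustrated with a minimal sketch. The `Node` class, the `portal_system_id` and `build_lacpdu` helpers, and the dictionary standing in for an LACPDU are hypothetical simplifications, not part of IEEE 802.1AX. The rule of picking the lowest System ID among the fellow nodes is likewise an assumption made only so that the example reproduces the choice of “A” shown in FIG. 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    system_id: str  # the node's own System ID


def portal_system_id(fellows):
    """Pick one fellow's System ID as the common System ID of the portal.

    Assumption: the lowest System ID wins. The text only says that one of
    the fellow nodes' System IDs is used, not how it is selected.
    """
    return min(node.system_id for node in fellows)


def build_lacpdu(sender, fellows):
    """Return a simplified stand-in for an LACPDU sent by one fellow node."""
    return {
        "sent_by": sender.name,
        # Every fellow advertises the common System ID, not its own.
        "actor_system_id": portal_system_id(fellows),
    }


node_a, node_b = Node("A", "A"), Node("B", "B")
portal_30 = [node_a, node_b]

for node in portal_30:
    print(build_lacpdu(node, portal_30))
```

Under these assumptions both Node A and Node B advertise System ID “A”, so their peers see a single LAG virtual node.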
FIG. 1 also shows the DRNI hand-off of a particular service (see bold “service” line in FIG. 1). The service handed off on an interface may be a Virtual Local Area Network (VLAN), and an identifier for the service may be a VLAN Identifier (VID), such as a Service VID (i.e., “S-VID”) (typically identifying services on Network to Network Interfaces (NNIs)) or a Customer VID (i.e., “C-VID”) (typically identifying services on User to Network Interfaces (UNIs)). In the example of FIG. 1, the service is handed off on the upper link (between nodes A, C), as both Networks 1 and 2 have selected the upper nodes as “active gateway nodes” and have selected the upper link for the service hand-off. Throughout this application, active gateway nodes are shown as having a bold boundary. This gateway functionality is introduced by DRNI for loop prevention. Thus, the nodes B and D block the service from being handed off between the DRNI and their own respective networks.
There are different types of failures that have to be handled by the DRNI. One of them is a “portal node failure” illustrated in FIG. 2 (“portal node” and “LAG node” are used synonymously in this context). As shown in FIG. 2, Node A experiences a failure and can no longer communicate with Node B or Node C. In the prior art, Node B would start to use its own System ID for the LAG instead of the formerly used common System ID, which in the example of FIG. 1 was the System ID of Node A. Node C is aware of the unreachability of node A, and node D may be aware of it too. Node C and Node D have to accept the new partner System ID (B) in order to provide LAG connectivity. A graceful name change from the old System ID (A) to the new System ID (B) can be applied for a smoother transition and to avoid dropping and re-establishing an active aggregation (see, e.g., N. Finn, Graceful Name Change in LACP, Std. contrib. 2011, http://www.ieee802.org/1/files/public/docs2011/axbq-nfinn-graceful-name-change-0511-v1.pdf).
The behavior illustrated in FIG. 2 is problematic though, because it changes the System ID of the portal based on the System IDs of the individual nodes that comprise the portal. Consequently, this prior art solution exposes the individual systems (i.e., Nodes C and D learn that node A has failed). This goes against a main design principle of the DRNI, which is to hide the details of the internal systems that together present an external view of a single LAG virtual node to its peers. Thus, under DRNI principles, it is desirable to avoid a System ID change even if a node failure occurs.
FIG. 3 shows another failure event, in which connectivity between nodes on the same portal (i.e., the “portal link”) is broken, causing the link between fellow Nodes A and B to fail. In the prior art, nodes cannot distinguish between portal link and portal node failures, and Node B's reaction to the portal link failure is the same as to the portal node failure explained above (i.e., Node B starts using its own System ID instead of the common System ID). Nevertheless, Node A is up and running and also uses its own System ID in LACPDUs, which is the same as the common System ID, as illustrated in FIG. 3. Nodes C and D then only maintain the links towards the LAG virtual node 30 that use the common System ID (i.e., the links to Node A in this example). Nodes C and D deactivate the links to the other node by removing them from the LAG, as illustrated in FIG. 4 (see the dotted line between Node B and Node D; this notation will be used throughout this application to indicate a deactivated link).
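The prior art reaction to a portal link failure can be sketched as follows. The function names `advertised_system_id` and `links_kept`, and the representation of links as strings such as `"A-C"`, are hypothetical and chosen only for illustration. The sketch assumes the common System ID is “A”, as in FIG. 1.

```python
def advertised_system_id(own_id, common_id, fellow_reachable):
    """Prior-art rule: fall back to the node's own System ID when the
    fellow node is unreachable. The node cannot tell whether the fellow
    failed or only the portal link failed, so both cases look identical.
    """
    return common_id if fellow_reachable else own_id


def links_kept(partner_id_per_link, common_id):
    """Peer-side rule: keep only the links on which the partner still
    advertises the common System ID; all other links leave the LAG.
    """
    return {link for link, sid in partner_id_per_link.items() if sid == common_id}


# Portal link between A and B is down; both nodes are still running.
common = "A"
id_from_a = advertised_system_id("A", common, fellow_reachable=False)
id_from_b = advertised_system_id("B", common, fellow_reachable=False)

received = {"A-C": id_from_a, "B-D": id_from_b}
print(received)
print(links_kept(received, common))
```

Here Node A keeps advertising “A” because its own System ID coincides with the common one, Node B switches to “B”, and only the A-C link remains in the LAG.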
The situation caused by the portal link failure of FIGS. 3-4 is referred to as a “split brain” (SB), because both nodes A and B emulating a single LAG virtual node 30 are up and running but not connected to each other. If both of the split brain nodes were to become active gateways, then a loop would appear. Nevertheless, the peering partner nodes C and D are able to inform the split brain nodes A and B that they are in a split brain situation, as shown by FIG. 4. That is, both nodes C and D use an LACPDU to inform their respective peer node that a split brain situation has occurred at the LAG virtual node 30. Thus, neither of the nodes takes over the active gateway role from the other (e.g., Node B does not become the active gateway for the service of FIG. 1). Note that if the connectivity between nodes C and D works properly, they emulate a single LAG virtual node 32 and both of them are aware of the different System IDs received from their respective peer nodes A and B.
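The detection performed by nodes C and D can be sketched as a comparison of the partner System IDs they each receive. The function `partner_split_brain_detected` and its arguments are hypothetical; `None` is used here, as an assumption, to mean that no LACPDU is received on a link.

```python
def partner_split_brain_detected(partner_ids, fellows_connected):
    """Decide whether the fellows of one portal can conclude that the
    partner portal is in a split brain situation.

    partner_ids: maps each local node to the partner System ID it receives
                 (None if nothing is received on its link).
    fellows_connected: True if the local portal link is working, so the
                 fellows can compare what they receive.
    """
    if not fellows_connected:
        return False  # the fellows cannot exchange their observations
    distinct = {sid for sid in partner_ids.values() if sid is not None}
    # Two different System IDs from what should be one virtual node
    # indicate that both partner nodes are alive but disconnected.
    return len(distinct) > 1


# FIGS. 3-4: Node C hears "A", Node D hears "B".
print(partner_split_brain_detected({"C": "A", "D": "B"}, fellows_connected=True))
# FIG. 2: Node A has failed, so Node C hears nothing and Node D hears "B".
print(partner_split_brain_detected({"C": None, "D": "B"}, fellows_connected=True))
```

Under these assumptions the first case is reported as a split brain and the second, a portal node failure, is not.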
If a split brain situation appears on both sides of the LAG at the same time, then a “Double Split Brain” (DSB) condition (also known as a “Dual Split Brain”) is said to occur (see FIG. 5). If both sides of the LAG are experiencing split brain, then neither LAG virtual node 30, 32 is able to detect the split brain situation of the other LAG virtual node 30, 32, because there is no connection within either portal. Thus the nodes of the same portal cannot notify each other of the fact that they receive different System IDs in their respective LACPDUs, which was the basis of prior art single split brain handling. Therefore, all the nodes will consider their fellow node within the portal to be down, and all the nodes become active gateways for all services. This results in a forwarding loop of data frames as illustrated in FIG. 5. No method is available for handling Double Split Brain situations.
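The difference between single and double split brain can be sketched with a simplified gateway decision. The function `active_gateways` is hypothetical. It models only portal link failures (all nodes remain up), and it assumes that a split brain notice reaches a node exactly when the portal link of the peer portal is working.

```python
def active_gateways(portal, elected, portal_link_up, peer_portal_link_up):
    """Return the nodes of one portal that act as active gateway.

    Simplified model: the elected node is always an active gateway. A
    non-elected node takes over when its fellow looks down, unless the
    peer portal has told it that this is a split brain. The peer portal
    can only do so if its own portal link is up.
    """
    active = set()
    for node in portal:
        if node == elected:
            active.add(node)
        elif not portal_link_up:
            notified_of_split_brain = peer_portal_link_up
            if not notified_of_split_brain:
                active.add(node)  # takeover: fellow is assumed to be down
    return active


# Single split brain: only the A-B portal link is down.
print(active_gateways(("A", "B"), "A", portal_link_up=False, peer_portal_link_up=True))

# Double split brain: both portal links are down.
side_1 = active_gateways(("A", "B"), "A", portal_link_up=False, peer_portal_link_up=False)
side_2 = active_gateways(("C", "D"), "C", portal_link_up=False, peer_portal_link_up=False)
print(sorted(side_1), sorted(side_2))
```

In the single split brain case only Node A remains an active gateway. In the double split brain case all four nodes become active gateways, which corresponds to the forwarding loop of FIG. 5.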