1. Field of the Invention
The present disclosure relates generally to link aggregation, and more particularly to systems and methods for implementing link aggregation between two peers connected by a service provider network.
2. Description of Related Art
Link aggregation refers to a process for operating a group of physical links as if they were a single link. At least one standard for link aggregation has been promulgated by the Institute of Electrical and Electronics Engineers, e.g., in the IEEE 802.3-2005 standard, Section 43 and Annexes 43A-C, incorporated herein by reference.
FIG. 1 illustrates one network configuration 100 to which link aggregation is applicable. A first switch 110 comprises four physical layer transponders PHY1-PHY4 communicating respectively with four Media Access Control (MAC) link layer devices MAC1-MAC4. Each MAC communicates frame data with network processing 120. A second switch 130 comprises a similar configuration of four physical layer transponders PHY5-PHY8, four MAC link layer devices MAC5-MAC8, and network processing 140. Each switch will generally include other PHYs and MACs (not shown) for other ports that are not part of the potential link aggregation group between switches 110 and 130.
Switch 110 and switch 130 are connected by four full-duplex physical links. PHY1 and PHY5 communicate over a link LINK1,5. PHY2 and PHY6 communicate over a link LINK2,6. PHY3 and PHY7 communicate over a link LINK3,7. PHY4 and PHY8 communicate over a link LINK4,8. Each of these four links can be operated independently, which may be advantageous, for example, in a multiple spanning tree configuration. Alternatively, if network processing 120 and network processing 140 possess hardware and/or software necessary to support link aggregation, they can negotiate the aggregation of two (or more) links connecting switches 110 and 130 to a common logical link that appears as a single, faster link.
A short summary of pertinent link aggregation terminology and concepts from the IEEE 802.3-2005 standard is presented. Referring to FIG. 2, several logical components of a packet network device 200 are shown, including a Media Access Control (MAC) client 210, a link aggregation sublayer 220, four individual MACs MAC1-MAC4, and four individual physical layer transponders (PHYs) PHY1-PHY4. The purpose of the link aggregation sublayer 220 is to combine a number of physical ports (represented by MACn/PHYn) logically for presentation to MAC client 210 as a single logical MAC. The framework can support more or fewer than four physical ports, and can support up to as many MAC clients as there are physical ports.
Link aggregation sublayer 220 is further subdivided into several logical components, including control parser/multiplexers (muxes) CPM1-CPM4, an aggregator 230, and aggregation control 260.
Each control parser/mux CPMn couples to a corresponding MAC MACn across an IEEE 802.3 MAC service interface. For egress frames (transmitted by one of the PHYs), each control parser/mux passes frame transmission requests from aggregator 230 and aggregation control 260 to the appropriate port. For ingress frames (received by one of the PHYs), each control parser/mux distinguishes Link Aggregation Control (LAC) Protocol Data Units (PDUs) from other frames, and passes the LACPDUs to aggregation control 260, with all other frames passing to aggregator 230. It is noted that although one aggregator 230 is shown, in the particular implementation shown in FIG. 2 there could be up to four aggregators—each control parser/mux CPMn passes its non-LACPDU ingress traffic to a particular aggregator bound to MACn, or discards the non-LACPDU traffic when MACn is not bound to an aggregator.
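The ingress decision made by a control parser/mux can be illustrated with a minimal Python sketch. It assumes untagged Ethernet frames and identifies LACPDUs by the Slow Protocols EtherType (0x8809) and the LACP subtype (0x01); the function names and the returned destination labels are hypothetical and are chosen only for illustration.

```python
import struct

SLOW_PROTOCOLS_ETHERTYPE = 0x8809  # EtherType shared by LACP and Marker PDUs
LACP_SUBTYPE = 0x01                # Slow Protocols subtype carried by LACPDUs


def is_lacpdu(frame: bytes) -> bool:
    """Return True if an untagged Ethernet frame carries an LACPDU."""
    if len(frame) < 15:  # 6 (dst) + 6 (src) + 2 (EtherType) + 1 (subtype)
        return False
    (ethertype,) = struct.unpack("!H", frame[12:14])
    return ethertype == SLOW_PROTOCOLS_ETHERTYPE and frame[14] == LACP_SUBTYPE


def control_parse(frame: bytes, bound_to_aggregator: bool) -> str:
    """Hypothetical routing decision for one ingress frame on one port."""
    if is_lacpdu(frame):
        return "aggregation_control"
    # Non-LACPDU traffic is discarded when the MAC is not bound to an aggregator.
    return "aggregator" if bound_to_aggregator else "discard"
```

In this sketch, LACPDUs reach aggregation control whether or not the port is bound, which is what allows a port to negotiate its way into an aggregation in the first place.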
Aggregator 230 comprises a frame collection block 240, a frame distribution block 250, and up to four (in this embodiment) aggregator parser/muxes APM1-APM4. Aggregator 230 communicates with MAC client 210 across an IEEE 802.3 MAC service interface. Aggregator 230 also communicates with each control parser/mux CPMn that corresponds to a MAC MACn bound to aggregator 230.
Frame collection block 240 comprises a frame collector 242 and a marker responder 244. The frame collector 242 receives ordinary traffic frames from each bound MAC MACn and passes these frames to MAC client 210. Frame collector 242 is not constrained as to how it multiplexes frames from its bound ports, other than that it is not allowed to reorder frames received on any one port. The marker responder 244 receives marker frames (as defined in IEEE 802.3-2005) from each bound port and responds with a return marker frame to the port that received the ingress marker frame.
Frame distribution block 250 comprises a frame distributor 252 and an optional marker generator/receiver 254. The frame distributor 252 receives ordinary traffic frames from MAC client 210, and employs a frame distribution algorithm to distribute the frames among the ports bound to the aggregator. Frame distributor 252 is not constrained as to how it distributes frames to its bound ports, other than that it is expected to supply frames from the same “conversation” to the same egress port. Marker generator/receiver 254 can be used, e.g., to aid in switching a conversation from one egress port to another egress port. Frame distribution 250 holds or discards any incoming frames for the conversation while marker generator/receiver 254 generates a marker frame on the port handling the conversation. When a return marker frame is received, all in-transit frames for the conversation have been received at the far end of the aggregated link, and frame distribution may switch the conversation to a new egress port.
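One way to satisfy the same-conversation constraint is to hash fields that identify a conversation and use the result to index the list of bound ports. The following Python sketch is an illustration only: the standard does not mandate a distribution algorithm, and the choice of source and destination MAC addresses as the conversation key, the use of CRC-32, and the function name select_egress_port are all assumptions.

```python
import zlib


def select_egress_port(src_mac: bytes, dst_mac: bytes, bound_ports: list) -> int:
    """Pick an egress port for a frame so that one conversation maps to one port.

    A conversation is assumed here to be identified by its (source,
    destination) MAC address pair; other keys are equally possible.
    """
    if not bound_ports:
        raise ValueError("no ports are bound to the aggregator")
    conversation_hash = zlib.crc32(src_mac + dst_mac)
    return bound_ports[conversation_hash % len(bound_ports)]
```

Because the mapping depends on the number of bound ports, adding or removing a port can move a conversation to a different egress port. That is the situation in which the marker exchange described above is useful.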
Aggregator parser/muxes APM1-APM4, when bound to one of the physical ports, exchange frames with their corresponding control parser/muxes CPM1-CPM4. On transmit, aggregator parser/muxes APM1-APM4 take egress frames (ordinary traffic and marker frames) from frame distribution 250 and marker responder 244 and supply them to their respective bound ports. For ingress frames received from its bound port, each aggregator parser/mux distinguishes ordinary MAC traffic, marker request frames, and marker response frames, passing each to frame collector 242, marker responder 244, and marker generator/receiver 254, respectively.
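The three-way ingress dispatch performed by an aggregator parser/mux can be sketched as follows. The sketch assumes untagged frames and relies on the Marker PDU layout in which the Slow Protocols subtype 0x02 is followed by a version octet and then a TLV type of 0x01 (Marker Information) or 0x02 (Marker Response Information). The function name and the returned labels are hypothetical.

```python
SLOW_PROTOCOLS_ETHERTYPE = b"\x88\x09"
MARKER_SUBTYPE = 0x02
TLV_MARKER_INFORMATION = 0x01   # marker request
TLV_MARKER_RESPONSE = 0x02      # marker response


def aggregator_parse(frame: bytes) -> str:
    """Name the block that should receive one ingress frame (illustrative)."""
    is_marker_pdu = (
        len(frame) >= 17
        and frame[12:14] == SLOW_PROTOCOLS_ETHERTYPE
        and frame[14] == MARKER_SUBTYPE
    )
    if is_marker_pdu and frame[16] == TLV_MARKER_INFORMATION:
        return "marker_responder"
    if is_marker_pdu and frame[16] == TLV_MARKER_RESPONSE:
        return "marker_generator_receiver"
    # Everything else is treated as ordinary MAC traffic in this sketch.
    return "frame_collector"
```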
Aggregation control 260 is responsible for configuration and control of link aggregation for its assigned physical ports. Aggregation control 260 comprises a link aggregation control protocol (LACP) handler 262 that is used for automatic communication of aggregation capabilities and status among systems, and a link aggregation controller 264 that allows automatic control of aggregation and coordination with other systems.
The frames exchanged between LACP 262 and its counterparts in peer systems each contain an LACPDU, e.g., with a format 300 as shown in FIG. 3. The actor information and partner information contained in the LACPDU structure are used to establish and break link aggregations, with the “actor information” pertaining to the system sending the LACPDU, and the “partner information” indicating the state of the system receiving the LACPDU, as understood by the system sending the LACPDU.
The actor and partner information include a system ID, system priority, key, port ID, port priority, and state flags. The system ID is a globally unique identifier such as a MAC address assigned to the system. The system priority is a priority value assigned by the system administrator to the system. The key is a value assigned to the port by its system, and may be static or dynamic. The key is the same for each port on the system that is capable of aggregation with other ports transmitting that key. The port ID is a port number assigned by the system administrator to each port, and should be unique on the system. The port priority is a priority value assigned by the system administrator to the port, and should be unique among ports that are potentially aggregable. The state flags include LACP_Activity, LACP_Timeout, Aggregation, Synchronization, Collecting, Distributing, Defaulted, and Expired, and are defined as specified in IEEE 802.3-2005. In particular, the Synchronization bit is set TRUE when the link has been allocated to the correct Link Aggregation Group (LAG), the group has been associated with a compatible Aggregator, and the identity of the LAG is consistent with the System ID and Key transmitted by the port.
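The fields listed above can be modeled with a small data structure, sketched here in Python. The bit positions of the state flags follow the order in which the flags are listed (LACP_Activity in bit 0 through Expired in bit 7); the class and attribute names are assumptions made for this illustration and do not appear in the standard.

```python
from dataclasses import dataclass
from enum import IntFlag


class LacpState(IntFlag):
    """State octet of the actor or partner information."""
    LACP_ACTIVITY = 0x01
    LACP_TIMEOUT = 0x02
    AGGREGATION = 0x04
    SYNCHRONIZATION = 0x08
    COLLECTING = 0x10
    DISTRIBUTING = 0x20
    DEFAULTED = 0x40
    EXPIRED = 0x80


@dataclass(frozen=True)
class PortInfo:
    """Actor or partner information carried in an LACPDU (illustrative)."""
    system_priority: int
    system_id: bytes      # e.g., a 6-octet MAC address
    key: int
    port_priority: int
    port_id: int
    state: LacpState


# Example: decode a state octet and test the Synchronization bit.
state = LacpState(0x3D)
in_sync = LacpState.SYNCHRONIZATION in state
```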
In operation, peered systems exchange LACPDUs to determine whether multiple ports that are aggregable to each other appear on both ends of the same link. To accomplish this, both endpoints calculate a Link Aggregation Group Identifier (LAG ID) for each participating port. The LAG ID combines actor and partner system priorities, system IDs, and keys. When the LAG IDs on two or more aggregable ports match, those ports are automatically assigned to the same LAG, provided that the systems at both ends of the links make the aggregation.
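The grouping step can be sketched in Python as follows. The sketch treats the LAG ID as an order-independent pair of (system priority, system ID, key) tuples so that both endpoints derive the same identifier for the same link; this representation, along with the function names lag_id and group_ports, is a simplifying assumption and not the encoding defined by the standard.

```python
from collections import defaultdict


def lag_id(actor: tuple, partner: tuple) -> tuple:
    """Build an order-independent LAG ID.

    Each argument is a (system_priority, system_id, key) tuple. Sorting makes
    the result identical regardless of which end computes it.
    """
    return tuple(sorted((actor, partner)))


def group_ports(ports: dict) -> dict:
    """Map each LAG ID to the list of port names that share it.

    `ports` maps a port name to its (actor, partner) tuple pair.
    """
    groups = defaultdict(list)
    for name, (actor, partner) in ports.items():
        groups[lag_id(actor, partner)].append(name)
    return dict(groups)
```

Ports whose partner information differs, for example because the partner advertises a different key, fall into separate groups and are not aggregated with each other.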