With network switching devices that incorporate certain types and/or brands of ASICs (application-specific integrated circuits), there exists the possibility of traffic flooding if link aggregation ports belonging to the same link aggregation (i.e., group of aggregated physical links) span two ASICs within a common network device. Broadcom (BCM) brand XGS family ASICs are one example of such a type and/or brand of ASICs whose link aggregation hashing algorithm can lead to such traffic flooding problems. Traffic flooding in this manner translates into traffic drop in an operational network, which is highly undesirable.
Referring to FIG. 1, for link aggregation functionality with an ASIC configured for providing network interface functionality (e.g., a Broadcom XGS family ASIC of a line card (i.e., network interface line card)), load balancing within a system (e.g., a system of network interface apparatuses (e.g., switches) each having one or more network interface units) is implemented for the purpose of distributing traffic across all ports of an aggregate group (i.e., link aggregation group LAG1). To this end, a Broadcom ASIC provides different options for load balancing traffic. One commonly used option is for its link aggregation hashing algorithm to be based on both the source MAC address and the destination MAC address of an Ethernet frame. Therefore, as shown in FIG. 1, depending on the source MAC address and the destination MAC address, traffic between two entities/peers (i.e., Apparatus 1 and Apparatus 2) over aggregation links AL1 provided by ASICs 1, 2 and 3 may not flow through the same physical link (i.e., port) in both traffic flow directions (i.e., traffic flow direction for initiation and traffic flow direction for response), even though the links belong to the same link aggregation group (i.e., LAG1). For traffic flowing from User A to User B (i.e., Frames A→B), the MAC address of User A is the source address and the MAC address of User B is the destination address. Similarly, for traffic flowing from User B to User A (i.e., Frames B→A), the MAC address of User B is the source address and the MAC address of User A is the destination address.
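The direction dependence described above can be illustrated with a minimal sketch. The hash below is a hypothetical, deliberately order-sensitive stand-in and is not the hashing algorithm of any Broadcom ASIC; the MAC addresses, the multiplier, and the count of three member links are likewise assumptions chosen only for illustration.

```python
# Hypothetical addresses for User A and User B (not taken from the figures).
MAC_A = "00:00:00:00:00:0a"
MAC_B = "00:00:00:00:00:0b"


def select_link(src_mac: str, dst_mac: str, num_links: int) -> int:
    """Pick a member link index from the source and destination MAC addresses.

    Assumption: a simple order-sensitive mix stands in for the real hash.
    """
    src = int(src_mac.replace(":", ""), 16)
    dst = int(dst_mac.replace(":", ""), 16)
    return (5 * src + dst) % num_links


# Frames A->B and Frames B->A swap source and destination, so the same
# pair of users can land on different member links of the same group.
link_a_to_b = select_link(MAC_A, MAC_B, num_links=3)
link_b_to_a = select_link(MAC_B, MAC_A, num_links=3)
print(link_a_to_b, link_b_to_a)
```

With these assumed values the two directions select different member links, which is the situation that makes the flooding scenario discussed in reference to FIG. 2 possible.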
One important aspect relating to Broadcom XGS family ASICs is that the MAC address table aging timer is a single setting for the whole ASIC. Specifically, a user cannot configure aging time on a per-port, per-VLAN (virtual local area network), per-link-aggregation, or per-flow basis. Because of the implementation requirements and limitations of this load balancing mechanism, a flooding problem may be introduced for traffic over a link aggregation configuration.
Referring to FIG. 2, traffic is initiated from MAC A (e.g., User A) with a destination of MAC B (e.g., User B). Accordingly, frames are sent from MAC A to MAC B. However, because the destination address MAC B is unknown, the frames are flooded, MAC A is learned on port locations 1, 2, and 3, and Path A is selected for this flow (e.g., due to hashing). MAC B responds, but due to hashing, the response takes a different route than Path A (i.e., Path B), and MAC B is learned on port locations 4, 5 and 6. In this example, Path B is selected for traffic flow from MAC B to MAC A. Now, traffic is unicast over two different paths (i.e., Path A and Path B of Link Aggregation LA1) due to hashing, and unicasting along these two paths continues for about one aging period. Because frames sent from MAC A to MAC B always take Path A, MAC A at port 2 will no longer be refreshed after the initial flooding and will be aged out after roughly one aging period. Furthermore, because MAC A is no longer coming through network interface 2, frames from MAC B to MAC A will be flooded out on network interface 2 according to the flood limit setting. Lastly, as long as there is traffic flowing from MAC B to MAC A, MAC A will never be learned at port 2 and flooding will continue.
Still referring to FIG. 2, consider instead traffic that is initiated from MAC B with a destination of MAC A. Accordingly, frames are sent from MAC B to MAC A. However, because the destination address MAC A is unknown, the frames are flooded and MAC B is learned on port locations 4, 5, and 6. Path B is again selected for this traffic flow. MAC A responds, but, because MAC B is learned on linkagg 1 on location 6 and the response is unicast, Path A is chosen. MAC A is now learned on port locations 1 and 3. Now, traffic from MAC B to MAC A will take Path B, and MAC A will not be refreshed on network interface 2 and will eventually be aged out. Therefore, frames from MAC B to MAC A will always be flooded out on network interface 2, while traffic from MAC A to MAC B will be unicast.
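The aging behavior behind both scenarios can be sketched as a small simulation. This is a simplified model under stated assumptions: only two network interfaces are modeled, the 300-second aging period and 60-second step are hypothetical values, and the names `L2Table` and `simulate` are introduced here for illustration only.

```python
from dataclasses import dataclass, field

AGING_PERIOD = 300  # seconds; hypothetical value, one timer for the whole ASIC


@dataclass
class L2Table:
    """Per-ASIC MAC table: MAC address -> time the entry was last refreshed."""
    last_seen: dict = field(default_factory=dict)

    def learn(self, mac, now):
        self.last_seen[mac] = now

    def age_out(self, now):
        expired = [m for m, t in self.last_seen.items() if now - t >= AGING_PERIOD]
        for mac in expired:
            del self.last_seen[mac]

    def knows(self, mac):
        return mac in self.last_seen


def simulate(duration, step=60):
    """Return (time, 'unicast' or 'flood') for B->A frames on network interface 2."""
    ni1, ni2 = L2Table(), L2Table()
    # Initial flooding: MAC A is learned on both network interfaces.
    ni1.learn("A", 0)
    ni2.learn("A", 0)
    # The reply from MAC B ingresses on network interface 2 (Path B).
    ni2.learn("B", 0)
    results = []
    for now in range(step, duration + 1, step):
        ni1.age_out(now)
        ni2.age_out(now)
        # A->B always hashes to Path A: only network interface 1 sees source MAC A.
        ni1.learn("A", now)
        # B->A always hashes to Path B: only network interface 2 sees source MAC B.
        ni2.learn("B", now)
        # B->A is forwarded by looking up MAC A on network interface 2.
        results.append((now, "unicast" if ni2.knows("A") else "flood"))
    return results


for now, outcome in simulate(600):
    print(now, outcome)
```

In this model, frames from MAC B to MAC A are unicast until the entry for MAC A on network interface 2 reaches the aging period, and they are flooded from then on because nothing on Path B ever refreshes that entry.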
The current solution to this flooding problem is to synchronize all learned MAC addresses between all network interfaces. This solution can be achieved by synchronizing newly learned MAC addresses over to all ASICs. One example of an interconnect mechanism for providing communication functionality to support such synchronization is offered by Broadcom under the trademarked brand name HiGig, which is a proprietary interconnect mechanism compatible with Broadcom brand ASICs (e.g., Strata XGS family of ASICs). Such an interconnect mechanism allows communication between devices (e.g., ASICs) each having an implementation of the interconnect mechanism in combination therewith. The HiGig protocol supports various switching functions such as Quality-of-Service (QoS), link aggregation, etc. After synchronizing the newly learned MAC addresses over to all ASICs, depending on the aging time setting and prior to aging time interval expiration, those MAC addresses that are learned locally are periodically read out by traversing the whole L2 (i.e., Layer 2) table and are synchronized to all network interfaces in the system. On the other hand, if a particular MAC address is aged out due to inactivity, a system component such as a Chassis Management Module (CMM) can inform other network interfaces about this event, and this MAC address will be deleted from all other network interfaces as well. A CMM is a module that is responsible for the operational state of a whole system component chassis (e.g., network interface card state, temperature, responding to user requests, etc.). Therefore, at any given time, the MAC address content of all ASICs will be the same. If frames ingress into a particular network interface and the destination is on another network interface, then these frames will be bridged/unicast out (e.g., via an interconnect mechanism such as HiGig).
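The synchronization scheme described above can be sketched as follows. This is a minimal model, not an implementation of HiGig or of any CMM interface: the class `NetworkInterface` and the functions `synchronize` and `age_out` are hypothetical names, and the transport between network interfaces is reduced to direct set updates.

```python
class NetworkInterface:
    """One network interface with its own L2 table (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.local = set()   # MAC addresses learned from frames on this interface
        self.synced = set()  # MAC addresses installed by synchronization

    def table(self):
        return self.local | self.synced


def synchronize(interfaces):
    """Periodic pass: push each interface's locally learned MACs to all others."""
    for src in interfaces:
        for dst in interfaces:
            if dst is not src:
                dst.synced |= src.local


def age_out(interfaces, owner, mac):
    """Assumed CMM-style notification: an aged-out MAC is removed everywhere."""
    owner.local.discard(mac)
    for ni in interfaces:
        ni.synced.discard(mac)


ni1, ni2 = NetworkInterface("NI1"), NetworkInterface("NI2")
ni1.local.add("A")  # MAC A is learned locally on network interface 1
ni2.local.add("B")  # MAC B is learned locally on network interface 2
synchronize([ni1, ni2])
print(sorted(ni1.table()), sorted(ni2.table()))
```

After a synchronization pass every table holds the same set of addresses, so a lookup on any network interface results in a unicast rather than a flood; the cost is that every table must hold every address in the system.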
A skilled person will appreciate that synchronizing newly learned MAC addresses over to all ASICs does help resolve the link aggregation flooding problems discussed above in reference to FIG. 2. For traffic in which frames are sent from MAC A to MAC B (i.e., traffic initiated from MAC A with a destination of MAC B), MAC A is learned on port locations 1, 2, and 3 due to flooding as a result of an unknown destination address of MAC B. Due to hashing, Path A is selected for this traffic flow. MAC B responds, but due to hashing, the response takes a different route (i.e., Path B) and MAC B is learned on port locations 4, 5 and 6. Now traffic is unicast over two different paths due to hashing. Prior to aging time interval expiration, each network interface (e.g., software thereof) reads out its L2 table: network interface 1 synchronizes MAC A to network interface 2, and network interface 2 synchronizes MAC B to network interface 1. As long as traffic is being sent bi-directionally between MAC A and MAC B, these two MAC addresses will be synchronized between the two network interfaces and traffic therebetween will be bridged/unicast back and forth.
Similarly, for traffic in which frames are sent from MAC B to MAC A (i.e., traffic initiated from MAC B with a destination of MAC A), MAC B is learned on port locations 4, 5, and 6 due to flooding as a result of an unknown destination address of MAC A. Path B is selected for this traffic flow. MAC A responds, MAC A is learned at port location 1, and MAC A is immediately synchronized over to network interface 2 (i.e., is learned at port location 2). At the same time, because MAC B is learned on port location 6 on a link aggregation, Path A is chosen and traffic flow from MAC A to MAC B will be bridged/unicast out on Path A. Accordingly, traffic is unicast over two different paths due to hashing. Prior to aging time interval expiration, each network interface (e.g., software thereof) reads out its L2 table: network interface 1 synchronizes MAC A to network interface 2, and network interface 2 synchronizes MAC B to network interface 1. As long as traffic is being sent bi-directionally between MAC A and MAC B, these two MAC addresses will be synchronized between the two network interfaces and traffic therebetween will be bridged/unicast back and forth.
Even though the synchronization scheme discussed above does help resolve link aggregation flooding problems, it is not without shortcomings. These shortcomings arise at least partially because all MAC addresses are synchronized between all network interfaces. One example of such shortcomings relates to scalability. The MAC address capacity of a system/chassis will be the same as the individual ASIC capacity, regardless of the number of network interfaces in the system (e.g., a maximum of 32K MAC addresses when using Broadcom XGS family ASICs). Another example of such shortcomings relates to the excess traffic needed to keep all MAC addresses from each network interface card synchronized. This excess traffic is generated on the interconnect scheme (e.g., HiGig) between network interfaces and will increase significantly as the number of network interfaces increases. Furthermore, this excess traffic significantly limits the ability to support more advanced system configurations such as, for example, virtual chassis configurations and multi-chassis configurations. Another example of such shortcomings relates to wasting L2 (Layer 2) TCAM (Ternary Content Addressable Memory) space. MAC addresses are added to the ASIC L2 TCAM regardless of whether they are needed there; they are not inserted on a "need-to-know" basis. This will typically waste the valuable, limited L2 TCAM space. Still another example of such shortcomings relates to an inability to mix ASICs. All ASICs of the network interfaces are expected to have the same capacity. Therefore, network interfaces with ASICs of various sizes cannot be mixed in a system. Yet another example of such shortcomings relates to added complexity in the network interface software. Extra software will be required to perform synchronization on a timed basis.
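The scalability and excess-traffic shortcomings can be made concrete with a short calculation. The sketch below rests on assumptions that are not stated in the text: the function names are hypothetical, a mixed system is assumed to be limited by its smallest ASIC, and one synchronization pass is assumed to copy every locally learned address to every other network interface. The 8-interface system and the 4,000 locally learned addresses per interface are illustrative figures only.

```python
ASIC_CAPACITY = 32 * 1024  # 32K MAC addresses per ASIC, as in the example above


def system_capacity_full_sync(asic_capacities):
    """With full synchronization every table holds every MAC address, so the
    system is assumed to be limited by its smallest table."""
    return min(asic_capacities)


def sync_copies_per_pass(local_mac_counts):
    """Entries copied in one pass, assuming each interface pushes all of its
    locally learned MAC addresses to every other interface."""
    n = len(local_mac_counts)
    return (n - 1) * sum(local_mac_counts)


capacities = [ASIC_CAPACITY] * 8  # hypothetical 8-interface system
print(system_capacity_full_sync(capacities))  # usable system capacity
print(sum(capacities))                        # table space actually installed
print(sync_copies_per_pass([4000] * 8))       # entries copied per pass
```

Under these assumptions, adding network interfaces increases the installed table space and the synchronization load but leaves the usable MAC address capacity of the system unchanged.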