In communication networks it is generally desirable to prevent service outages and/or loss of network traffic. By way of example, such service outages and/or loss of network traffic may occur when a network element fails, loses power, is taken offline, or is rebooted, when a communication link to the network element breaks, etc.
In order to help prevent such service outages and/or loss of network traffic, the communication networks may utilize interchassis redundancy (ICR). FIG. 1 is a block diagram of a prior art example of an interchassis redundancy (ICR) system 100. The ICR system includes a first network element 101-1 and a second network element 101-2. In the illustrated example, the first network element is an active network element, and the second network element is a standby network element. The first and second network elements each have a different physical chassis.
The first, active network element has a better Border Gateway Protocol (BGP) metric 104-1 than the worse BGP metric 104-2 of the second, standby network element. The relative values of the better and worse BGP metrics are operable to cause network traffic exchanged with a plurality of other network elements 108 to be directed to the first, active network element, instead of to the second, standby network element. The network traffic is communicated over a plurality of network interfaces 107.
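By way of illustration only, the effect of the relative BGP metrics may be sketched as follows. The sketch assumes that a numerically lower metric is the "better" one (as with a BGP multi-exit discriminator); the text above does not specify which direction is preferred. The names `NetworkElement` and `select_preferred` are hypothetical and are not part of any real BGP implementation.

```python
from dataclasses import dataclass


@dataclass
class NetworkElement:
    name: str
    bgp_metric: int  # assumption: a lower value is the "better" metric


def select_preferred(elements):
    """Return the element that other network elements would direct traffic to."""
    # Assumption: peers prefer the element advertising the lowest metric.
    return min(elements, key=lambda e: e.bgp_metric)


first = NetworkElement("101-1", bgp_metric=10)    # active: better metric
second = NetworkElement("101-2", bgp_metric=100)  # standby: worse metric

preferred = select_preferred([first, second])
print(preferred.name)
```

Under the stated assumption, traffic is directed to the first network element for as long as its metric remains the lower of the two.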
The first, active network element is operable to handle the network traffic and creates, maintains, and utilizes corresponding network traffic data 106-1 (e.g., session data). The first network element has a first ICR component 103-1, and the second network element has a second ICR component 103-2. The first and second network elements are coupled, or otherwise in communication, by a synchronization channel 102. The first and second ICR components are operable to cause the first and second network elements to exchange synchronization data 109 over the synchronization channel. The network traffic data 106-1 maintained by the first, active network element may be sent to the second, standby network element and preserved as replicated or redundant network traffic data 106-2. The synchronization data may represent a synchronization of stateful session data.
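The replication of network traffic data from the active to the standby network element may be sketched, under simplifying assumptions, as shown below. The synchronization channel is modeled as a direct method call rather than a network connection, and the session data is a plain dictionary; the names `IcrElement`, `apply_sync`, and `synchronize` are hypothetical.

```python
import copy


class IcrElement:
    def __init__(self, name):
        self.name = name
        self.sessions = {}  # session id -> session state (e.g., data 106-1 / 106-2)

    def apply_sync(self, sync_data):
        # Store an independent copy so later changes on the active element
        # do not silently alter the replica between synchronizations.
        self.sessions = copy.deepcopy(sync_data)


def synchronize(active, standby):
    """Send the active element's session data to the standby element."""
    standby.apply_sync(active.sessions)


active = IcrElement("101-1")
standby = IcrElement("101-2")

active.sessions["subscriber-1"] = {"ip": "192.0.2.10", "state": "up"}
synchronize(active, standby)

# A change made after synchronization is not yet visible on the standby.
active.sessions["subscriber-1"]["state"] = "down"
print(standby.sessions["subscriber-1"]["state"])
```

This sketch copies the full data set on each synchronization; an actual system might send only incremental updates.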
The first and second ICR components 103-1, 103-2 may also detect switchover events. By way of example, the ICR components may exchange messages over the synchronization channel in order to monitor the status of the other network element and control switchovers in response to switchover events. Examples of switchover events include, but are not limited to, the active network element or a critical portion thereof failing, the active network element or a critical portion thereof being taken offline (e.g., by a network operator in order to perform maintenance or an upgrade), the active network element rebooting, breaks in a communication link leading to the active network element, loss of power to the active network element, network operator induced switchovers (e.g., through a command-line interface (CLI) command), etc.
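One possible way to monitor the status of the other network element is sketched below. It assumes a timeout rule: the peer is treated as lost when no message has arrived over the synchronization channel within a configured interval. The text above does not state which detection mechanism is used, and the name `PeerMonitor` and its parameters are hypothetical. Timestamps are passed in explicitly so that the example is deterministic.

```python
class PeerMonitor:
    """Tracks the last message seen from the peer network element."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s  # assumption: fixed silence threshold in seconds
        self.last_seen = None

    def record_message(self, now):
        # Called whenever any message from the peer arrives.
        self.last_seen = now

    def peer_lost(self, now):
        # No message has ever been seen: nothing to time out yet.
        if self.last_seen is None:
            return False
        return (now - self.last_seen) > self.timeout_s


monitor = PeerMonitor(timeout_s=3.0)
monitor.record_message(now=100.0)

still_alive = not monitor.peer_lost(now=101.0)  # 1.0 s of silence
switchover_event = monitor.peer_lost(now=104.5)  # 4.5 s of silence
print(still_alive, switchover_event)
```

A timeout of this kind covers failures, loss of power, and link breaks alike, since each results in silence on the channel; operator-induced switchovers would be signaled separately.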
When a switchover event is detected, the ICR components 103-1, 103-2 may cause the current second, standby network element 101-2 to transition to the new active network element, and the current first, active network element 101-1 (when it is able to do so) to transition to the new standby network element. This may involve the ICR components toggling the relative magnitudes of the BGP metrics 104-1, 104-2. The new active network element may then virtually seamlessly begin handling the network traffic, for example by recreating subscriber sessions that were terminated on the former active network element, and utilizing the redundant set of network traffic data 106-2. When it is able, the new standby network element (in this case the first network element 101-1) may be synchronized to maintain a redundant set of the current network traffic data.
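The role transition described above may be sketched as follows. The sketch assumes that toggling the relative magnitudes of the BGP metrics can be modeled as swapping the two metric values, and that a lower value is the better metric; neither detail is specified above. The names `Element` and `switchover` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    name: str
    bgp_metric: int  # assumption: a lower value is the "better" metric
    role: str        # "active" or "standby"
    sessions: dict = field(default_factory=dict)


def switchover(active, standby):
    """Swap roles and metrics; return (new_active, new_standby)."""
    # Toggle the relative magnitudes of the metrics by swapping them.
    active.bgp_metric, standby.bgp_metric = standby.bgp_metric, active.bgp_metric
    active.role, standby.role = "standby", "active"
    return standby, active


first = Element("101-1", bgp_metric=10, role="active",
                sessions={"subscriber-1": "up"})
second = Element("101-2", bgp_metric=100, role="standby",
                 sessions={"subscriber-1": "up"})  # replicated data 106-2

new_active, new_standby = switchover(first, second)
print(new_active.name, new_active.bgp_metric, sorted(new_active.sessions))
```

Because the former standby already holds the replicated session data, it can begin handling traffic as soon as its metric becomes the better one.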
In some cases, the ICR system may be used to provide geographical redundancy of network traffic data. The first network element 101-1 may reside at a first geographical location 110-1 and the second network element 101-2 may reside at a second, different geographical location 110-2. The first and second different geographical locations may be remotely located from one another (e.g., locations at least several miles apart, different towns or cities, different states, different countries, etc.). Using ICR with geographical redundancy may help to reduce service outages and/or loss of network traffic in the event of a geographically localized disruption of service (e.g., due to catastrophic weather or another catastrophic event occurring at one geographical location). In the event of such a geographically localized disruption of service, network traffic handling may be switched to the other network element at the other geographical location, which generally should not be affected by the same geographically localized disruption of service.
Referring again to FIG. 1, a first set of virtual addresses 105-1, which is configured on the first, active network element, is used to access the network interfaces 107. A corresponding second set of virtual addresses 105-2 is configured on the second, standby network element. Traditionally, one or more network operator(s) manually configure the first and second sets of virtual addresses on the first and second network elements of an ICR system.