Large-scale distributed networking systems, such as those used in data centers or large enterprise networks, are often designed as high availability (HA) systems to provide redundancy. Some HA systems are configured in an active-passive model, which requires a fully redundant, passive instance as a backup for each primary, active node. Because the passive instances carry no traffic under normal conditions, such systems typically require extra hardware and tend to be more costly to build out. Other HA systems are configured in an active-active model, where both the primary and secondary nodes handle traffic under normal conditions, and in the event that the primary node fails, the secondary node takes over the role of the primary node.
Existing distributed networking systems with an active-active HA configuration typically require an additional node (e.g., a controller node) to monitor the health of the primary node. In the event that the controller detects that the primary node has failed, the controller reconfigures the secondary node as the new primary node. In practice, however, controllers are often not co-located with the nodes they monitor. In the event that the controller has failed or is unable to communicate with the primary and/or secondary nodes, the reconfiguration of the secondary node does not occur, and the failover is prevented from taking place. A more reliable technique for providing active-active HA for a distributed networking system is therefore needed.