Field
Embodiments of the present disclosure generally relate to failover operations in networks. More particularly, the present disclosure relates to systems and methods for standby controller assisted failover.
Description of the Related Art
In a typical enterprise setup, several wireless access points (APs) may be installed within an enterprise network to give devices connected to them, from within or from outside the enterprise network, access to information and data. To manage these APs and/or to grant access to devices connecting through the APs, a centralized network controller is typically configured. APs generally connect to the centralized network controller to authenticate client devices attempting to connect to the enterprise network via the APs. In most cases, an AP controller (also referred to hereafter simply as a controller) has a processor, a memory, and other resources required to interpret, forward, and process messages and to initiate other messages as appropriate. In order to improve reliability of service, systems are provided for high availability devices that serve, without fail, APs or other network elements such as routers, proxies, firewalls, gateways, switches, and other like devices. This may be achieved, for instance, by means of an active controller and a standby controller, where the standby controller takes over for the active controller when the active controller experiences a failure or the communication link between the active controller and the AP is down. For instance, the connection between an AP and a controller, which may be represented by software implemented within a network gateway or firewall device, may go down, or the active controller may stop functioning. In either case, network security may be compromised or in-process transactions may be delayed or dropped unless or until the connection is re-established between the centralized controller and the AP.
In order to ensure high availability of servers, systems, and other hardware or network components, a standby system is typically provided to take over in case of a failure of the primary system. Such automatic switching over from an active system to a redundant or standby system upon the failure or abnormal termination of the previously active system is commonly referred to as failover. In the context of wireless networks, existing failover mechanisms rely on a heartbeat system involving the exchange of messages at a periodic interval (e.g., 30 seconds) between an active controller and the managed devices (e.g., one or more APs). For example, an AP or other managed network element may send a heartbeat message, also referred to as a keep-alive message, to the active controller and then wait for the response. If the AP receives a response to the keep-alive request from the primary controller within a predefined time limit, the AP can assume that the primary controller is working properly and may continue to forward requests to it. If the AP does not receive a response to the keep-alive request message within the predefined time, it assumes that the primary controller is not operational and begins sending all subsequent requests to the standby controller. Keep-alive messages may also serve the purpose of checking other health parameters of the active controller. In general, keep-alive messages are sent at predefined intervals, and their timing plays an important role in checking the connection between two network entities. After a message is sent, if no reply is received from the other end, it can be assumed that the connection is down or that the primary controller has experienced a failure, and subsequent requests or data should be routed via another path or to an alternate resource (e.g., the standby controller).
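The keep-alive behavior described above can be illustrated with a minimal sketch. It is not taken from any particular product: the class name FailoverClient, the probe callback, the controller names, and the 5-second timeout are all hypothetical and chosen only for illustration. The probe callback stands in for sending a keep-alive message and reporting whether a response arrived within the predefined time limit.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FailoverClient:
    """Hypothetical AP-side view of a conventional keep-alive failover."""

    # probe(controller, timeout_s) returns True if a keep-alive response
    # was received within the time limit (assumed interface).
    probe: Callable[[str, float], bool]
    active: str = "active-controller"    # illustrative name
    standby: str = "standby-controller"  # illustrative name
    timeout_s: float = 5.0               # assumed "predefined time limit"

    def __post_init__(self) -> None:
        # Requests initially go to the active (primary) controller.
        self.target = self.active

    def heartbeat(self) -> str:
        """Send one keep-alive and return where requests should now go."""
        if self.target == self.active and not self.probe(self.active, self.timeout_s):
            # No response in time: assume the active controller is down
            # and send all subsequent requests to the standby controller.
            self.target = self.standby
        return self.target


# Simulated responses: two successful keep-alives, then a missed one.
replies = iter([True, True, False])
ap = FailoverClient(probe=lambda controller, timeout_s: next(replies))
targets = [ap.heartbeat() for _ in range(4)]
```

In this simulation the AP keeps forwarding to the active controller for the first two heartbeats and switches to the standby controller once a response is missed. The sketch fails over after a single missed response; a deployment might instead require several consecutive misses before switching.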
In a typical enterprise network, there may be several APs sending such keep-alive messages to detect the operational status of the active controller. The overhead related to these keep-alive messages prevents the interval between keep-alive messages from being too short. While, ideally, a very short time interval would be preferred for optimal failover detection, a typical keep-alive message interval is on the order of 30 seconds to avoid overburdening the network. As such, it may take 90 seconds or more for APs to determine the failure condition and begin failing over to the standby controller.
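The trade-off between keep-alive overhead and detection time can be made concrete with a short calculation. The sketch below rests on two assumptions that are not stated above: that an AP declares a failure only after three consecutive missed keep-alives (one way to arrive at 90 seconds with a 30-second interval), and that the network contains 100 APs. The function names are likewise hypothetical.

```python
def detection_delay_s(interval_s: float, missed_limit: int) -> float:
    """Approximate time to detect a failure: the keep-alive interval
    multiplied by the number of consecutive misses tolerated (assumed model)."""
    return interval_s * missed_limit


def keepalives_per_minute(num_aps: int, interval_s: float) -> float:
    """Keep-alive requests per minute arriving at the active controller."""
    return num_aps * 60.0 / interval_s


# Assumed values: 30-second interval, 3 tolerated misses, 100 APs.
typical_delay = detection_delay_s(30.0, 3)      # 90.0 seconds
typical_load = keepalives_per_minute(100, 30.0) # 200.0 messages per minute

# Shortening the interval to 1 second speeds up detection but raises the load.
fast_delay = detection_delay_s(1.0, 3)          # 3.0 seconds
fast_load = keepalives_per_minute(100, 1.0)     # 6000.0 messages per minute
```

Under these assumptions, cutting the interval from 30 seconds to 1 second reduces the detection delay from 90 seconds to 3 seconds while multiplying the keep-alive traffic by 30, which illustrates why the interval cannot simply be shortened.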
There is therefore a need in the art for systems and methods that can enable efficient and fast failover from an active controller to a standby controller by reducing the time to detect a link failure or a failure of a primary/active controller.