1. Technical Field
The present disclosure relates to providing router redundancy in a network environment. More specifically, the present disclosure relates to a method, apparatus, and computer program product for performing zero-down-time failover in a network system having a redundancy group comprising redundant or standby apparatuses operable together.
2. Description of the Related Art
Local area networks (LANs) are usually connected to other LANs or extranets via one or more routers or Internet appliances. These devices enable communication between hosts (i.e., client end computers disposed on a LAN, such as personal computers) on different LANs, since a host can communicate directly only with the network entities on its own LAN segment.
Unlike a general-purpose computer device, an Internet appliance is typically designed to serve a specific purpose and/or provide a specific service. Compared to a typical general-purpose computer device, Internet appliances are relatively “closed”: their specific operating systems and applications (or drivers) vary with their intended purposes and services. Examples include IBM® WebSphere® DataPower Series SOA Appliances and Tivoli® ISS Appliances® (“IBM,” “WebSphere,” and “Tivoli” are registered trademarks of International Business Machines Corporation in the United States and/or other countries). In addition to having routing functions, an information appliance may also provide network-attached storage.
Unfortunately, failure of a router, including downtime caused by rebooting and scheduled maintenance, is likely to paralyze a network in its entirety. In response, a number of First Hop Redundancy Protocols (FHRPs) capable of performing failover have been developed to cope with router failure and to minimize the down time before networking functionality is restored. For example, Cisco Systems, Inc. offers the Hot Standby Router Protocol (HSRP), the Virtual Router Redundancy Protocol (VRRP), and the Gateway Load Balancing Protocol (GLBP).
HSRP is a protocol dedicated to supporting redundancy in a system undergoing failover and is described in detail in RFC 2281 (see also U.S. Pat. Nos. 5,473,599 and 7,152,179). HSRP enables a network engineer to position a plurality of redundant routers in the same subnet such that each of the redundant routers can function as a subnet router (gateway). To use HSRP, the routers (gateways) must be arranged together to form a virtual network entity (a virtual router), and a virtual IP address and a virtual MAC address are created for use by the virtual router. These routers are hereinafter collectively referred to as a “redundancy group” or “standby group”. Among the routers configured in accordance with HSRP, a single primary active router is selected to handle communication. The active router maps itself to the virtual router and fully represents the virtual router in processing traffic flow. Additionally, a single standby router is selected based on pre-configured priority or any other appropriate rules. When configured in accordance with HSRP, a passive router is linked to the segment or segments served by an active router and is designated to function as a backup apparatus for the active router. Hence, the active router and the passive router share the virtual IP address and the virtual MAC address (each instance being restricted to a single router).
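By way of illustration only, the selection of an active router and a standby router within a standby group may be sketched as follows. This is a minimal sketch and not an implementation of HSRP itself: the names `Router`, `elect`, and `virtual_mac` are hypothetical, the priority values and addresses are invented for the example, the tie-break on the higher IP address is an assumption taken from RFC 2281, and the virtual MAC format shown is the HSRP version 1 well-known address for non-Token-Ring media.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class Router:
    """Hypothetical model of one member of a standby group."""
    name: str
    ip: IPv4Address
    priority: int = 100  # illustrative pre-configured priority


def virtual_mac(group: int) -> str:
    """Return the HSRP version 1 well-known virtual MAC for a group number."""
    if not 0 <= group <= 255:
        raise ValueError("group number must fit in one byte")
    return f"00:00:0c:07:ac:{group:02x}"


def elect(routers):
    """Return (active, standby): highest priority first, higher IP breaks ties.

    The tie-break rule is an assumption for this sketch; the text only
    requires a pre-configured priority or another appropriate rule.
    """
    if len(routers) < 2:
        raise ValueError("a standby group needs at least two routers")
    ranked = sorted(routers, key=lambda r: (r.priority, int(r.ip)), reverse=True)
    return ranked[0], ranked[1]


# Example group with invented addresses and priorities.
group_120 = [
    Router("router_122", IPv4Address("10.0.0.2"), priority=110),
    Router("router_124", IPv4Address("10.0.0.3"), priority=100),
]
active, standby = elect(group_120)
print(active.name, standby.name, virtual_mac(1))
```

In this sketch, hosts would be configured with the virtual IP address as their gateway, so that a change of the active router does not require any change on the hosts.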
The prior art embodiment depicted in FIG. 1 comprises only one host 100, a switch 110, and a redundant or standby group 120 consisting of an active router 122 and a passive router 124. The passive router 124, within the same standby group 120, periodically receives a “Hello” message 130 sent from the active router 122, so as to test whether the active router 122 has failed. However, in each of the failover mechanisms provided by the various First Hop Redundancy Protocols (FHRPs) known in the prior art, the switch 110, in response to receiving an incoming message, forwards the message to the active router 122 currently representing the virtual router. If the active router 122 fails or is otherwise “down”, there is a significant time lag before the passive router 124 replaces the failed active router 122. The time lag lasts for seconds and is known as a black hole. The black hole starts at the point in time when the active router 122 fails and ends at the point in time when the passive router 124 detects the failure. As a result, many messages are lost during the black hole, which is likely to greatly impact mission-critical information appliances or applications (such as stock trading).
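The duration of the black hole can be illustrated with a minimal sketch of Hello-based failure detection. The sketch assumes the default timers given in RFC 2281 (a hello time of 3 seconds and a hold time of 10 seconds) and assumes that Hello messages are sent at fixed multiples of the hello time until the active router fails. The names `PassiveRouter` and `black_hole_duration` are hypothetical and do not correspond to any element of FIG. 1 beyond the roles they model.

```python
HELLO_INTERVAL = 3.0  # seconds; RFC 2281 default hello time (assumed here)
HOLD_TIME = 10.0      # seconds; RFC 2281 default hold time (assumed here)


class PassiveRouter:
    """Hypothetical passive router that watches for Hello messages."""

    def __init__(self, hold_time=HOLD_TIME):
        self.hold_time = hold_time
        self.last_hello = None  # time of the most recent Hello, if any

    def on_hello(self, now):
        """Record the arrival time of a Hello from the active router."""
        self.last_hello = now

    def active_failed(self, now):
        """Declare failure once no Hello has arrived for hold_time seconds."""
        return self.last_hello is not None and now - self.last_hello >= self.hold_time


def black_hole_duration(failure_time, hello_interval=HELLO_INTERVAL, hold_time=HOLD_TIME):
    """Seconds between the failure and its detection by the passive router.

    Hellos are assumed to be sent at t = 0, hello_interval, 2 * hello_interval, ...
    A Hello due exactly at failure_time is assumed to have been sent.
    """
    last_hello = (failure_time // hello_interval) * hello_interval
    detection_time = last_hello + hold_time
    return detection_time - failure_time


# The active router fails 7 s in: the last Hello was at 6 s, detection is at 16 s.
print(black_hole_duration(7.0))  # 9.0
```

Under these assumptions the black hole lasts between roughly 7 and 10 seconds, during which the switch continues to forward incoming messages to the failed active router.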