The present invention relates generally to network systems using redundant or standby devices working together in a redundancy group and load distributing arrangement to provide a virtual router service. More particularly, the present invention relates to methods and apparatus for controlling the distribution of traffic flow across a gateway using multiple gateway devices that are acting as a virtual router.
Local area networks (LANs) are commonly connected with one another through one or more routers so that a host (a PC or other arbitrary LAN entity) on one LAN can communicate with other hosts on different LANs (that is, remote or external networks). Typically, a host is able to communicate directly only with the entities on its local LAN segment. When it needs to send a data packet to an address that it does not recognize as being local, it communicates through a router (or other layer-3 or gateway device) which determines how to direct the packet between the host and the destination address in a remote network. Unfortunately, a router may, for a variety of reasons, become inoperative after a “trigger event” (for example, a power failure, rebooting, scheduled maintenance, etc.). Such potential router failure has led to the development and use of redundant systems, which have more than one gateway device to provide a backup in the event of primary gateway device failure. When a gateway device fails in such a redundancy system, the host communicating through the inoperative gateway device may still remain connected to other LANs by sending packets to and through another gateway device connected to the host's LAN.
Logically, such a system can resemble FIG. 1A. In FIG. 1A, a local network 130 uses a single gateway router 110 to forward outbound packets for hosts 122, 124, 126 when those packets are bound for an outside network 150 (for example, the Internet). As seen in FIG. 1B, however, the actual physical configuration of a redundancy group system uses several routers 112, 114, 116, 118 to implement a redundancy group that functions as the single virtual gateway 110 for hosts 122, 124, 126.
Various protocols have been devised to allow a host to choose a router from among a group of routers in a network. Two of these, Routing Information Protocol (RIP) and ICMP Router Discovery Protocol (IRDP), are examples of protocols that involve dynamic participation by the host. However, because both RIP and IRDP require that the host be dynamically involved in the router selection, performance may be reduced and special host modifications and management may be required.
In a widely used and somewhat simpler approach, the host recognizes only a single “default” router. Hosts (for example, workstations, users and/or data center servers) using the IP protocol utilize this default gateway to exit a local network and access remote networks. Therefore, each host must have prior knowledge of the gateway's IP address, which typically is a router or layer-3 switch IP address. Hosts are either statically configured with the IP address of the default gateway or are assigned the address through a configuration protocol (such as DHCP) upon boot-up. In either case, the host uses the same default gateway IP address for all network traffic destined to exit the local network.
To forward traffic to the default gateway, the host must perform an IP-ARP resolution to learn the data-link Media Access Control (MAC) address of the default gateway. The host sends an ARP inquiry to the IP address of the gateway, requesting the gateway's MAC address. The default gateway will respond to the host's ARP request by notifying the host of the gateway's MAC address. The host needs the default gateway's MAC address to forward network traffic to the gateway via a data-link layer transfer. When only a single gateway device is used, that device returns its own “burned in” MAC address (BIA MAC address) as the address for the host's outbound packets.
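By way of illustration only, the default gateway resolution described above may be sketched as follows. The class names, method names and address values are hypothetical and are not taken from any particular implementation; the sketch assumes a single gateway device that answers an ARP inquiry with its BIA MAC address, and it does not model local delivery.

```python
import ipaddress


class Gateway:
    """A single gateway device that answers ARP with its burned-in MAC address."""

    def __init__(self, ip, bia_mac):
        self.ip = ip
        self.bia_mac = bia_mac

    def handle_arp(self, target_ip):
        # Reply only when the inquiry targets this gateway's IP address.
        return self.bia_mac if target_ip == self.ip else None


class Host:
    """A host that knows one default gateway IP address and keeps an ARP cache."""

    def __init__(self, ip, prefix_len, default_gateway_ip):
        self.network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        self.default_gateway_ip = default_gateway_ip
        self.arp_cache = {}  # IP address -> MAC address

    def next_hop_mac(self, dest_ip, gateway):
        # Destinations on the local segment do not use the gateway (not modeled).
        if ipaddress.ip_address(dest_ip) in self.network:
            return None
        # Resolve the default gateway's MAC address once, then reuse the cache.
        if self.default_gateway_ip not in self.arp_cache:
            mac = gateway.handle_arp(self.default_gateway_ip)
            if mac is not None:
                self.arp_cache[self.default_gateway_ip] = mac
        return self.arp_cache.get(self.default_gateway_ip)
```

In this sketch, every packet bound for a remote address is sent to the one MAC address held in the host's ARP cache, which is why the host depends entirely on that single device.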
In this approach, the host is configured to send data packets to the default router when it needs to send packets to addresses outside its own LAN. It does not keep track of available routers or make decisions to switch to different routers. This requires very little effort on the host's part, but carries a serious risk. If the default router fails, the host cannot send packets outside of its LAN. This may be true even though there may be a redundant router able to take over, because the host does not know about the backup. Unfortunately, such systems have been used in mission critical applications.
The shortcomings of these early systems led to the development and implementation of redundant gateway systems, which provide for failover in gateway settings. One such system is the hot standby router protocol (HSRP) by Cisco Systems, Inc. of San Jose, Calif. A more detailed discussion of the earlier systems and of an HSRP type of system can be found in U.S. Pat. No. 5,473,599 (referred to herein as “the '599 patent”), entitled STANDBY ROUTER PROTOCOL, issued Dec. 5, 1995 to Cisco Systems, Inc., which is incorporated herein by reference in its entirety for all purposes. Also, HSRP is described in detail in RFC 2281, entitled “Cisco Hot Standby Router Protocol (HSRP)”, by T. Li, B. Cole, P. Morton and D. Li, which is incorporated herein by reference in its entirety for all purposes.
HSRP is widely used to back up primary routers for a network segment. In HSRP, a “standby” router is designated as the backup to an “active” router. The standby router is linked to the network segment(s) serviced by the active router. The active and standby routers share a single “virtual IP address” and, possibly, a single “virtual Media Access Control (MAC) address” which is actually in use by only one router at a time. All internet communication from the relevant local network employs the virtual IP address (also referred to as a “vIP address”) and the virtual MAC address (also referred to herein as a “vMAC address”). At any given time, the active router is the only router using the virtual address(es). Then, if the active router should cease operation for any reason, the standby router immediately takes over the failed router's load (by adopting the virtual addresses), allowing hosts to always direct data packets to an operational router without monitoring the routers of the network.
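By way of illustration only, the active/standby takeover described above may be sketched as follows. This is a simplified model under stated assumptions: the class names, the `check` method and the address values are hypothetical, and the hello messages, timers and priority-based election of an actual HSRP implementation are not modeled.

```python
class Router:
    """A physical router that may or may not be operational."""

    def __init__(self, name):
        self.name = name
        self.operational = True


class StandbyGroup:
    """Two routers sharing one virtual IP address and one virtual MAC address."""

    def __init__(self, virtual_ip, virtual_mac, active, standby):
        self.virtual_ip = virtual_ip
        self.virtual_mac = virtual_mac
        self.active = active
        self.standby = standby

    def owner(self):
        # Only the active router uses the virtual addresses at any given time.
        return self.active

    def check(self):
        # If the active router has failed, the standby router adopts the
        # virtual addresses. Hosts keep using the same vIP and vMAC.
        if not self.active.operational and self.standby.operational:
            self.active, self.standby = self.standby, self.active
```

Because the virtual addresses do not change when the roles swap, a host's configured gateway address and its ARP cache entry remain valid after the takeover.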
One drawback to HSRP systems in general is that only one gateway device in a redundancy group is in use at any given time. To better utilize system resources in such redundancy systems, a gateway load balancing protocol (GLBP) was developed by Cisco and is the subject of commonly owned and copending U.S. Ser. No. 09/883,674 filed Jun. 18, 2001, entitled GATEWAY LOAD BALANCING PROTOCOL, which is incorporated herein by reference in its entirety for all purposes.
It should be noted here that the term “gateway load balancing protocol” is somewhat of a misnomer (or at least is not as precise as it might be). While the members of a redundancy group share the traffic flow, there has been no “balancing” of the traffic loads, per se, across the gateway. It is true that sharing the traffic load among members of a redundancy group means that responsibility for all traffic is not borne by a single gateway device. However, the terms “load sharing” and “load distribution” more accurately describe the actual implementations of these earlier systems. Therefore, the terms “load sharing” and “load distribution” and the like herein mean the ability to assign outgoing traffic to multiple gateway devices so that a single gateway device is not responsible for all outbound packets from all hosts on a LAN. (For the sake of reference to previously filed patent applications and other publications relied upon herein, the acronym GLBP will still be used herein to refer to the earlier, basic underlying load sharing protocol developed by Cisco Systems.)
Like HSRP, for communications directed outside of a LAN, GLBP uses a single vIP address shared by multiple redundancy group gateway devices (for example, routers), which also maintain actual IP addresses (also referred to as “aIP addresses”). Each gateway device also has its own BIA (actual) MAC address (also referred to herein as an “aMAC address”) and a single virtual MAC address. Use of vMAC addresses allows interchangeability of routers without the need for reprogramming of the system.
Each GLBP system has a “master” gateway device (also referred to herein as an “Active Virtual Gateway” or AVG device) in the redundancy group that controls address assignment (ARP responses) and failover features. The AVG instructs an ARPing host to address outgoing communications to a virtual MAC address assigned to one of the redundancy group gateway devices (gateway devices not functioning as a master device may be referred to as “standby” and/or “slave” gateway devices, in accordance with standard GLBP nomenclature and operation). Any gateway device that is forwarding packets is referred to herein as an “Active Virtual Forwarder” or AVF device. Each redundancy group therefore has one AVG device and one or more AVF devices.
More specifically, a host sends an ARP message to the redundancy group's virtual IP address when the host wants to send a packet outside the local network. The AVG selects an AVF to handle outgoing packets for the host and sends the host a reply message containing the vMAC of the AVF selected by the AVG. The host populates its ARP cache with this vMAC address. Thereafter, the host addresses its outbound packets to the vMAC address in its ARP cache, thus sending these packets to the assigned AVF/router.
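By way of illustration only, the ARP exchange described above may be sketched as follows. The sketch assumes a round robin selection among the forwarders' vMAC addresses, which is only one of several possible selection methods; the class name, method name, host identifiers and vMAC values are hypothetical.

```python
from itertools import cycle


class ActiveVirtualGateway:
    """Answers ARP inquiries for the virtual IP address on behalf of the group."""

    def __init__(self, virtual_ip, forwarder_vmacs):
        self.virtual_ip = virtual_ip
        # Assumption: forwarders are selected in simple round robin order.
        self._next_vmac = cycle(forwarder_vmacs)
        self.assignments = {}  # host identifier -> assigned vMAC address

    def handle_arp(self, host_id, target_ip):
        # Only inquiries for the group's virtual IP address are answered.
        if target_ip != self.virtual_ip:
            return None
        vmac = next(self._next_vmac)
        self.assignments[host_id] = vmac
        return vmac


def resolve_gateway(host_arp_cache, host_id, avg):
    """Populate a host's ARP cache with the vMAC address returned by the AVG."""
    if avg.virtual_ip not in host_arp_cache:
        host_arp_cache[avg.virtual_ip] = avg.handle_arp(host_id, avg.virtual_ip)
    return host_arp_cache[avg.virtual_ip]
```

Each host sees the same virtual IP address, but different hosts may cache different vMAC addresses and therefore send their outbound packets to different forwarders.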
In earlier systems, hosts were assigned vMAC addresses by random assignment, round robin assignment or by using another prescribed algorithm or methodology. In the event that an assigned AVF of the group failed, the outgoing communications that were to be handled by the failed AVF had to be sent elsewhere. Upon failure of the originally assigned AVF, the failed AVF's vMAC address was re-assigned to another router, for example another router that is acting as an AVF. Thereafter, outgoing packets from the host (and any other host(s) which send packets to the re-assigned vMAC address) were routed instead to the new owner of that newly re-assigned vMAC address. In the event that the AVG itself failed, additional steps were taken to appoint or elect a new AVG and ensure continuity in the load distribution function. However, if one or more gateway devices took on an inordinate portion of the traffic load, there was no way to balance this load sharing capability to control the distribution (evenly or otherwise) of the traffic flow through gateway devices at the gateway.
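By way of illustration only, the re-assignment of a failed forwarder's vMAC address may be sketched as follows. The function names, router names and vMAC values are hypothetical, and the count of hosts per router is used only as a rough stand-in for traffic load, which in practice depends on how much each host actually sends.

```python
from collections import Counter


def reassign_vmac(vmac_owner, failed_router, new_router):
    """Return a new vMAC-to-router map with the failed router's vMACs re-assigned."""
    return {
        vmac: (new_router if owner == failed_router else owner)
        for vmac, owner in vmac_owner.items()
    }


def hosts_per_router(host_vmac, vmac_owner):
    """Count how many hosts currently send their outbound packets to each router."""
    return Counter(vmac_owner[vmac] for vmac in host_vmac.values())


# Hypothetical example: three hosts, two forwarders, one vMAC address each.
vmac_owner = {"vmac-1": "R1", "vmac-2": "R2"}
host_vmac = {"h1": "vmac-1", "h2": "vmac-2", "h3": "vmac-1"}

before = hosts_per_router(host_vmac, vmac_owner)
after = hosts_per_router(host_vmac, reassign_vmac(vmac_owner, "R1", "R2"))
```

In this example the hosts' ARP caches are unchanged by the failure, yet all three hosts end up sending to the one surviving router, and nothing in the sketch redistributes that traffic afterward.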
In view of the foregoing, it would be desirable to provide gateway load balancing services for communications from outside a local network while ensuring that redundant, load sharing gateway services are still available for the local network.