1. Field of the Invention
The present invention relates to computer networks, and more particularly to managing multi-homed tunnels between virtual private network (VPN) clients.
2. Background Information
A computer network is a geographically distributed collection of nodes interconnected by communication links and segments for transporting data between end nodes, such as personal computers and workstations (“hosts”). Many types of networks are available, with the types ranging from local area networks (LANs) to wide area networks (WANs). LANs typically connect the nodes over dedicated private communications links located in the same general physical location, such as a building or campus. WANs, on the other hand, typically connect geographically dispersed nodes over long-distance communications links, such as common carrier telephone lines, optical lightpaths, synchronous optical networks (SONET), or synchronous digital hierarchy (SDH) links. The Internet is an example of a WAN that connects disparate networks throughout the world, providing global communication between nodes on various networks. The nodes typically communicate over the network by exchanging discrete frames or packets of data according to predefined protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP). In this context, a protocol consists of a set of rules defining how the nodes interact with each other. Computer networks may be further interconnected by an intermediate network node, such as a router, to extend the effective “size” of each network.
Since management of interconnected computer networks can prove burdensome, smaller groups of computer networks may be maintained as routing domains or autonomous systems. The networks within an autonomous system (AS) are typically coupled together by conventional “intradomain” routers configured to execute intradomain routing protocols, and are generally subject to a common authority. To improve routing scalability, a service provider (e.g., an ISP) may divide an AS into multiple “areas.” It may be desirable, however, to increase the number of nodes capable of exchanging data; in this case, interdomain routers executing interdomain routing protocols are used to interconnect nodes of the various ASes. Moreover, it may be desirable to interconnect various ASes that operate under different administrative domains. As used herein, an AS or an area is generally referred to as a “domain,” and a node that interconnects different domains is generally referred to as a “border node” or “border router.” In general, the autonomous system may be an enterprise network, a service provider or any other network or subnetwork. Furthermore, the autonomous system may be multi-homed, i.e., comprising a plurality of different peer (neighboring) connections to one or more other routing domains or autonomous systems.
The administrative entity of an AS typically configures network nodes within the AS to route packets using predetermined intradomain routing protocols, or interior gateway protocols (IGPs), such as conventional link-state protocols and distance-vector protocols. These IGPs define the manner in which routing information and network-topology information are exchanged and processed in the AS. Examples of link-state and distance-vector protocols known in the art are described in Sections 12.1-12.3 of the reference book entitled Interconnections, Second Edition, by Radia Perlman, published January 2000, which is hereby incorporated by reference as though fully set forth herein.
Link-state protocols, such as the Open Shortest Path First (OSPF) protocol, use cost-based routing metrics to determine how data packets are routed in an AS. As understood in the art, a relative cost value may be associated with a network node to determine the relative ease/burden of communicating with that node. For instance, the cost value may be measured in terms of the average time for a data packet to reach the node, the amount of available bandwidth over a communication link coupled to the node, etc. Network nodes in the AS generate a set of cost values associated with their neighboring nodes. Each set of cost values is then “advertised” (flooded) to the other interconnected nodes. Using the advertised cost values, each node can generate a consistent “view” of the network topology, thereby enabling the nodes to determine lowest-cost routes within the AS.
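As an illustration only, the lowest-cost route computation that a link-state node may perform over the advertised cost values can be sketched with Dijkstra's shortest-path algorithm. The topology, node names, cost values, and the function name lowest_cost_routes below are hypothetical and are not taken from any particular protocol implementation.

```python
import heapq


def lowest_cost_routes(topology, source):
    """Return (cost, next_hop) maps for `source` using Dijkstra's algorithm.

    `topology` maps each node to a dict of {neighbor: link cost}; it stands
    in for the consistent "view" built from flooded advertisements.
    """
    cost = {source: 0}
    next_hop = {}
    visited = set()
    # Heap entries: (total cost, node, first hop used to leave the source).
    heap = [(0, source, None)]
    while heap:
        total, node, first = heapq.heappop(heap)
        if node in visited:
            continue  # a cheaper path to this node was already finalized
        visited.add(node)
        if first is not None:
            next_hop[node] = first
        for neighbor, link_cost in topology.get(node, {}).items():
            if neighbor in visited:
                continue
            candidate = total + link_cost
            if candidate < cost.get(neighbor, float("inf")):
                cost[neighbor] = candidate
                hop = neighbor if node == source else first
                heapq.heappush(heap, (candidate, neighbor, hop))
    return cost, next_hop


# Hypothetical four-node AS with symmetric link costs.
topology = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 2, "D": 4},
    "C": {"A": 5, "B": 2, "D": 1},
    "D": {"B": 4, "C": 1},
}
cost, next_hop = lowest_cost_routes(topology, "A")
```

In this example topology, node A reaches node D through B and C at a total cost of 4, rather than over the direct but more expensive A-C link.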
Distance-vector protocols, such as the Interior Gateway Routing Protocol (IGRP) or Routing Information Protocol (RIP), use distance-based routing metrics to determine how data packets are routed in an AS. A network node may associate a distance metric with each of its interconnected nodes in the AS. For example, the distance metric may be based on a number of hops between a pair of nodes or an actual distance separating the nodes. Operationally, the network nodes determine distances to reachable nodes in the AS and communicate these distance metrics to their neighboring nodes. Each neighboring node augments the received set of distance metrics with its own distance measurements and forwards the augmented set of metrics to its neighbors. This process is continued until each node receives a consistent view of the network topology.
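The iterative exchange described above may be sketched, under simplifying assumptions, as a synchronous Bellman-Ford style computation. The sketch assumes a hop-count metric, lossless and simultaneous advertisements, and a hypothetical four-node chain; the function name distance_vector is illustrative only.

```python
def distance_vector(links):
    """Run synchronous distance-vector rounds until no table changes.

    `links` maps each node to {neighbor: distance to that neighbor}.
    Returns each node's final table of {destination: distance}.
    """
    # Initially each node knows only itself and its direct neighbors.
    tables = {node: {node: 0, **links[node]} for node in links}
    changed = True
    while changed:
        changed = False
        # Snapshot what every node advertises at the start of this round.
        adverts = {node: dict(table) for node, table in tables.items()}
        for node in links:
            for neighbor, link_distance in links[node].items():
                for dest, advertised in adverts[neighbor].items():
                    # Augment the neighbor's metric with the local link.
                    candidate = link_distance + advertised
                    if candidate < tables[node].get(dest, float("inf")):
                        tables[node][dest] = candidate
                        changed = True
    return tables


# Hypothetical chain A - B - C - D, one hop per link.
links = {
    "A": {"B": 1},
    "B": {"A": 1, "C": 1},
    "C": {"B": 1, "D": 1},
    "D": {"C": 1},
}
tables = distance_vector(links)
```

Each round propagates reachability one hop farther, so node A learns of node D only after two rounds of exchanges in this example.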
An intermediate network node often stores its routing information in a routing table maintained and managed by a routing information base (RIB). The routing table is a searchable data structure in which network addresses are mapped to their associated routing information. However, those skilled in the art will understand that the routing table need not be organized as a table, and alternatively may be another type of searchable data structure. Although the intermediate network node's routing table may be configured with a predetermined set of routing information, the node also may dynamically acquire (“learn”) network routing information as it sends and receives data packets. When a packet is received at the intermediate network node, the packet's destination address may be used to identify a routing table entry containing routing information associated with the received packet. Among other things, the packet's routing information indicates the packet's next-hop address.
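By way of example, a routing table lookup keyed on a packet's destination address might be sketched as follows. The sketch assumes the common longest-prefix-match convention, which the description above does not mandate, and uses a hypothetical RoutingTable class with documentation-range addresses; a linear scan is used for clarity rather than a production data structure such as a trie.

```python
import ipaddress


class RoutingTable:
    """Minimal searchable structure mapping address prefixes to next hops."""

    def __init__(self):
        self._routes = {}  # ip_network -> next-hop address (string)

    def add_route(self, prefix, next_hop):
        self._routes[ipaddress.ip_network(prefix)] = next_hop

    def lookup(self, destination):
        """Return the next hop of the most specific matching prefix, or None."""
        addr = ipaddress.ip_address(destination)
        best = None
        for network, next_hop in self._routes.items():
            if addr.version != network.version or addr not in network:
                continue
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, next_hop)
        return best[1] if best else None


# Hypothetical entries: a default route and two nested prefixes.
table = RoutingTable()
table.add_route("0.0.0.0/0", "192.0.2.254")
table.add_route("10.0.0.0/8", "192.0.2.1")
table.add_route("10.1.0.0/16", "192.0.2.2")
```

Here a packet destined for 10.1.2.3 matches all three entries, and the /16 entry is selected because it is the most specific.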
A plurality of interconnected ASes may be configured to exchange routing and reachability information among neighboring interdomain routers of the systems in accordance with a predetermined exterior gateway protocol, such as the Border Gateway Protocol (BGP). The BGP protocol is well known and generally described in Request for Comments (RFC) 1771, entitled A Border Gateway Protocol 4 (BGP-4), published March 1995, which is hereby incorporated by reference in its entirety. An adjacency is a relationship formed between selected neighboring (peer) routers for the purpose of exchanging routing information messages and abstracting the network topology. The routing information exchanged by BGP peer routers typically includes destination address prefixes, i.e., the portions of destination addresses used by the routing protocol to render routing (“next hop”) decisions. Examples of such destination addresses include IP version 4 (IPv4) and version 6 (IPv6) addresses. BGP generally operates over a reliable transport protocol, such as TCP, to establish a TCP connection/session. To implement the BGP protocol, each AS includes at least one border node through which it communicates with other, interconnected ASes. Because data packets enter and exit the AS through the border node, the border node is said to be located at the “edge” of the AS.
The Enterprise Class Teleworker (ECT) Solution
There are circumstances where an employee (user) may be required to (or desire to) use a company's computer resources outside of the company's main office. “Teleworking” may be used to extend a company's network infrastructure to reach remote and home-based workforces, enhancing employee productivity, satisfaction, and retention. As used herein, teleworking users (teleworkers) include, e.g., mobile/remote employees (employees out of the office for most of their work hours who conduct most of their business at customer locations or while traveling), full-time teleworkers (employees who work from fixed external sites, most often their home), part-time teleworkers (employees who telecommute a few days per week or part-time employees who work from home), day extenders (employees who telecommute primarily in the evenings or on weekends to stretch their workdays), and others (e.g., part-time teleworkers, including consultants who telecommute because of a specific project or event). These teleworkers often require the same environment at home and at any work location, including being able to use computer applications, such as, e.g., licensed software, web conferencing, instant messaging, virtual classrooms, etc., and being able to “carry” their work phone numbers with them. As examples, the ability to access computer (business) applications from home has tremendous application for call center operations, while customer support engineers “on call” have the option of quickly accessing all information directly from their home just as if they were in the office. Also, the teleworkers need to be reachable on their work phone numbers directly (instead of going into voice mail) and be able to make long distance phone calls at corporate rates.
Traditionally, remote-access clients have utilized a virtual private network (VPN) architecture in order to access company resources outside of the main office. A VPN is a collection of network nodes that establish private communications over a shared backbone network. Previously, VPNs were implemented by embedding private leased lines in the shared network. The leased lines (i.e., communication links) were reserved only for network traffic among those network nodes participating in the VPN. Today, the above-described VPN implementation has been mostly replaced by private “virtual circuits” deployed in public networks. Specifically, each virtual circuit defines a logical end-to-end data path between a pair of network nodes participating in the VPN. When the pair of nodes is located in different routing domains, edge devices in a plurality of interconnected routing domains may have to cooperate to establish the nodes' virtual circuit. Notably, a virtual circuit may be established using, for example, conventional layer-2 Frame Relay (FR) or Asynchronous Transfer Mode (ATM) networks. Alternatively, the virtual circuit may “tunnel” data between its logical end points using known layer-2 and/or layer-3 tunneling protocols, such as the Layer-2 Tunneling Protocol (L2TP) and the Generic Routing Encapsulation (GRE) protocol. In this case, one or more tunnel headers are prepended to a data packet to appropriately route the packet along the virtual circuit. The Multi-Protocol Label Switching (MPLS) protocol may be used as a tunneling mechanism for establishing layer-2 virtual circuits or layer-3 network-based VPNs through an IP network.
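The prepending of a tunnel header may be illustrated with a minimal sketch of basic GRE encapsulation, in which a four-byte header (a 16-bit flags/version field set to zero followed by a 16-bit protocol type) is placed in front of the inner packet. The sketch assumes no optional GRE fields, omits the outer delivery (IP) header that would also be prepended in practice, and uses hypothetical function names.

```python
import struct

GRE_PROTO_IPV4 = 0x0800  # EtherType carried in the GRE protocol-type field


def gre_encapsulate(inner_packet, protocol_type=GRE_PROTO_IPV4):
    """Prepend a basic GRE header (no checksum, key, or sequence number)."""
    header = struct.pack("!HH", 0, protocol_type)  # flags/version, protocol
    return header + inner_packet


def gre_decapsulate(frame):
    """Strip a basic GRE header and return (protocol_type, inner_packet)."""
    if len(frame) < 4:
        raise ValueError("frame too short for a GRE header")
    flags_version, protocol_type = struct.unpack("!HH", frame[:4])
    if flags_version != 0:
        raise ValueError("optional GRE fields are not handled in this sketch")
    return protocol_type, frame[4:]


# Placeholder bytes stand in for a real inner IP packet.
frame = gre_encapsulate(b"inner-ip-packet")
```

The tunnel end point reverses the operation with gre_decapsulate and then forwards the recovered inner packet according to its own header.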
The Enterprise Class Teleworker (ECT) solution is a type of remote-access VPN solution that combines security, authentication, management (e.g., a “zero touch deployment” where control remains with the enterprise), and quality of service in order to create a truly business-ready teleworker environment, with access to all of the advanced computer and phone (e.g., IP phone) capabilities of the main office. The ECT solution may be used by any type of teleworker that desires substantially constant connectivity to the main office, including telecommuters, Small Office Home Office (SOHO) users, and remote sites or branches. ECT uses Dynamic Multipoint VPN (DMVPN) technology to allow users to better scale large and small IP Security (IPsec) VPNs by combining GRE tunnels, IPsec encryption, and the Next Hop Resolution Protocol (NHRP). This combination creates the ability to dynamically add clients and tunnels without requiring complicated configurations (e.g., crypto-maps) on the server or other clients. Notably, a multipoint GRE (mGRE) tunnel interface may be used to allow a single GRE interface to support multiple IPsec tunnels, which reduces the size and complexity of the configuration.
The ECT solution and supporting technologies are described in the following documents and presentations (available at www.cisco.com/go/ect at the time of filing), the contents of which are hereby incorporated by reference in their entirety:
    Enterprise Class Teleworker Deployment Guide, March 2005;
    Enterprise Class Teleworker Solution, May 2005;
    Enterprise Class Teleworker Management Solution, March 2005;
    Enterprise Class VPNs, November 2004;
    Cisco IOS VPN Enterprise Class Teleworker Solution, 2004;
    Enterprise Class Teleworker Deployment using ISC and EZSDD, 2001;
    Deployment of Secure Sockets Layer VPNs, May 2005;
    Cisco IOS IPsec High Availability, April 2005;
    Secure Voice and Wireless in a VPN Deployment, April 2005; and
    Layered Security in a VPN Deployment, March 2005.
Further, DMVPN technology is described in the following documents (also available at www.cisco.com/go/ect at the time of filing), the contents of which are hereby incorporated by reference in their entirety:
    Dynamic Multipoint VPN (DMVPN), June 2005;
    Dynamic Multipoint IPsec VPNs (Using Multipoint GRE/NHRP to Scale IPsec VPNs), August 2005;
    Integrated Easy VPN and Dynamic Multipoint VPN, March 2005; and
    Dynamic Multipoint VPN Deployment on Cisco Catalyst 6500 Switches—MWAM & Native Modes, May 2005.
Typically, the ECT solution is embodied as a “hub and spoke” architecture, as will be understood by those skilled in the art. One router in the hub and spoke architecture is designated as the hub, and all the other routers (spokes) are configured with tunnels to the hub. For example, each client, or spoke, maintains a substantially constant connection with the enterprise network/server, or hub. Specific to the ECT solution, a spoke router generally maintains at least two VPN connections to the corporate network. The first connection is called the management tunnel and is used exclusively for managing the network. The management network hosts all the servers and tools needed for maintaining the network (e.g., an authentication, authorization, and accounting (AAA) server, a certificate server, provisioning/management tools, etc.). The second connection carries the data traffic to the corporate network, and is hereinafter referred to as the data tunnel.
The spoke-to-hub data tunnels are continuously operational, and most of the traffic within the hub-and-spoke architecture is between the spoke and the hub. Commonly, the spokes may have a connection (e.g., through the Internet) to the other spokes of the network. In the ECT solution (e.g., using DMVPN), spokes do not need static configuration for direct tunnels to any of the other spokes. Instead, when a spoke wants to transmit a packet to another spoke network (such as the subnet behind another spoke), it dynamically determines the required destination address of the target spoke network (e.g., by a lookup operation at the hub), and establishes a dynamic spoke-to-spoke tunnel. These spoke-to-spoke tunnels are established on demand whenever there is traffic between the spokes, and packets between spokes may thereafter bypass the hub. Notably, spoke-to-spoke tunnels reduce traffic traversing the hub, thereby freeing hub resources otherwise needed to, e.g., decrypt and re-encrypt traffic between spokes, while also reducing the total amount of network-wide consumed bandwidth, especially where the spokes are more closely situated to each other than to the hub. Moreover, as those skilled in the art will understand, these dynamic spoke-to-spoke tunnels are beneficial over a conventional full or partial mesh network, where continuous point-to-point tunnels (e.g., IPsec or IPsec+GRE tunnels) must be configured on all the routers, even if some/most of these tunnels are not running or needed at all times.
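The on-demand behavior described above can be sketched, in highly simplified form, as a spoke that consults the hub once per destination network and thereafter sends directly to the peer spoke. The Hub and Spoke classes, their methods, and the addresses below are hypothetical; the sketch does not model NHRP messages, IPsec negotiation, or tunnel teardown, and it assumes the tunnel is available immediately after resolution.

```python
import ipaddress


class Hub:
    """Holds registrations mapping each spoke's subnet to its public address."""

    def __init__(self):
        self._registry = {}
        self.lookups = 0  # counts resolution requests handled by the hub

    def register(self, subnet, public_address):
        self._registry[ipaddress.ip_network(subnet)] = public_address

    def resolve(self, destination):
        """Return (subnet, public address) of the spoke owning `destination`."""
        self.lookups += 1
        addr = ipaddress.ip_address(destination)
        for subnet, public_address in self._registry.items():
            if addr.version == subnet.version and addr in subnet:
                return subnet, public_address
        return None


class Spoke:
    def __init__(self, hub):
        self.hub = hub
        self.tunnels = {}  # subnet -> peer public address (dynamic tunnels)

    def send(self, destination):
        """Return the path chosen for a packet to `destination`."""
        addr = ipaddress.ip_address(destination)
        for subnet, peer in self.tunnels.items():
            if addr.version == subnet.version and addr in subnet:
                return ("spoke-to-spoke", peer)  # tunnel exists: bypass hub
        resolved = self.hub.resolve(destination)
        if resolved is None:
            return ("via-hub", None)  # no peer spoke: use the data tunnel
        subnet, peer = resolved
        self.tunnels[subnet] = peer  # establish the tunnel on demand
        return ("spoke-to-spoke", peer)


hub = Hub()
hub.register("10.2.0.0/24", "203.0.113.20")
spoke = Spoke(hub)
first = spoke.send("10.2.0.5")   # triggers a lookup at the hub
second = spoke.send("10.2.0.9")  # reuses the existing dynamic tunnel
```

Only the first packet toward the peer subnet involves the hub in this sketch; later packets to the same subnet reuse the cached tunnel.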
Often, the quality of a client's connection away from the office (e.g., a consumer broadband connection to a home) is not consistent, and when business depends upon uninterrupted access to the Internet, this inconsistency may cause problems for the client. For teleworkers, e.g., using the ECT solution, a link failure will mean a loss of connection with the office (enterprise network), and subsequent loss of productivity. Due to the increased availability of network connections (and decreased cost of those connections), many clients use a multi-homed network to increase their connectivity (network “uptime”) and uninterrupted access to the enterprise network.
As used herein, a multi-homed network is any network or subnetwork that is directly connected to more than one adjacent network or subnetwork. For instance, a client (or network) may be multi-homed to primary and secondary ISPs. Both the primary and secondary ISPs provide access to an Internet “backbone,” i.e., a high-bandwidth, wide-area network that is configured to transport data between remote networks and subnetworks. In this arrangement, the primary ISP functions as the preferred service provider for the customer site, and the secondary ISP functions as a backup service provider. That is, incoming and outgoing network traffic between the customer site and the Internet backbone is preferably routed through the primary ISP. The secondary ISP provides the customer site with access to the Internet backbone in the event that the primary ISP fails, e.g., due to the primary ISP losing connectivity with the Internet backbone and/or the customer site. In response to such a failure, the secondary ISP then becomes the customer site's preferred path for incoming and outgoing network traffic.
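The primary/secondary arrangement described above amounts to selecting the most preferred provider that is currently reachable. A minimal sketch follows, assuming a hypothetical Uplink record with a numeric preference and an up/down flag; how a failure is actually detected is outside the scope of the sketch.

```python
from dataclasses import dataclass


@dataclass
class Uplink:
    name: str
    preference: int  # lower value means more preferred
    up: bool = True  # set to False when the provider is detected as failed


def select_uplink(uplinks):
    """Return the most preferred uplink that is up, or None if all are down."""
    usable = [uplink for uplink in uplinks if uplink.up]
    if not usable:
        return None
    return min(usable, key=lambda uplink: uplink.preference)


# Hypothetical customer site multi-homed to two ISPs.
primary = Uplink("primary-isp", preference=1)
secondary = Uplink("secondary-isp", preference=2)
uplinks = [primary, secondary]
```

While both providers are up, select_uplink(uplinks) returns the primary ISP; marking the primary as down causes the secondary ISP to be returned instead.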
A solution for dynamically utilizing the multi-homed spoke-to-hub tunnels is described in above-incorporated U.S. patent application Ser. No. 11/229,421, entitled TECHNIQUE FOR USING OER WITH AN ECT SOLUTION FOR MULTI-HOMED SITES. Currently, however, the ECT solution is configured to dynamically create only one spoke-to-spoke tunnel when it is needed, regardless of the multi-homed capabilities of the spokes. Remote offices and teleworkers often experience poor VPN connections to peer spoke networks due to packet loss, brownouts, and/or heavy congestion over the single tunnel, and currently have no secondary or backup tunnel. There remains a need, therefore, for an improved ECT solution that dynamically creates and utilizes multi-homed spoke-to-spoke tunnels.