Networks and Subnetworks
A computer network is a geographically distributed collection of interconnected subnetworks, such as local area networks (LAN), that transport data between network nodes. As used herein, a network node is any device adapted to send and/or receive data in the computer network. The network topology is defined by an arrangement of network nodes that communicate with one another, typically through one or more intermediate network nodes, such as routers and switches. In addition to intra-network communications between nodes located in the same network, data also may be exchanged between nodes located in different networks. To that end, a “border router” located at the logical outer-bound (or “edge”) of a first computer network may be adapted to send and receive data with a border router situated at the edge of a neighboring (i.e., adjacent) network. Inter-network and intra-network communications are typically effected by exchanging discrete packets of data according to predefined protocols. In this context, a protocol consists of a set of rules defining how network nodes interact with each other.
A data packet may originate at a source node and subsequently “hop” from node to node along a logical data path until it reaches its destination. The network addresses defining the logical data path of a data flow are most often stored as Internet Protocol (IP) addresses in the packet's internetwork (layer 3) header. IP addresses are typically formatted in accordance with the IP Version 4 (IPv4) protocol, in which network nodes are addressed using 32-bit (four-byte) values. IPv4 addresses are typically denoted by four numbers between 0 and 255, each number delineated by a “dot.” Although IPv4 is prevalent in most networks today, IP Version 6 (IPv6) has been introduced to increase the length of an IP address to 128 bits (16 bytes), thereby increasing the number of available IP addresses. For purposes of discussion, IP addresses will be represented as IPv4 addresses hereinafter, although those skilled in the art will appreciate that IPv6 or other layer-3 address formats alternatively may be used in the illustrative embodiments described herein.
A subnetwork may be assigned to an IP address space containing a predetermined range of IPv4 addresses. For example, an exemplary subnetwork may be allocated the address space 128.0.10.*, where the asterisk is a wildcard that can differentiate up to 254 individual nodes in the subnetwork (0 and 255 are reserved values). In this case, a first node in the subnetwork may be assigned to the IP address 128.0.10.1, whereas a second node may be assigned to the IP address 128.0.10.2. The subnetwork is often associated with a subnet mask that may be used to select a set of contiguous high-order bits from IP addresses within the subnetwork's allotted address space. A subnet mask length indicates the number of contiguous high-order bits selected by the subnet mask, and a subnet mask length of N bits is hereinafter represented as /N. The subnet mask length for a given subnetwork is typically selected based on the number of bits required to distinctly address nodes in that subnetwork. Subnet masks and their uses are more generally described in Chapter 9 of the reference book entitled Interconnections Second Edition, by Radia Perlman, published January 2000, which is hereby incorporated by reference as though fully set forth herein.
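By way of illustration only, the following Python sketch shows how a subnet mask of length /N selects the N high-order bits of an IPv4 address, using the exemplary address space 128.0.10.* described above. The helper name apply_mask is hypothetical and is not part of any protocol or library; the sketch relies only on Python's standard ipaddress module.

```python
import ipaddress

def apply_mask(address: str, mask_length: int) -> str:
    """Keep the mask_length high-order bits of an IPv4 address and zero the rest.

    Hypothetical helper for illustration; not taken from any specification.
    """
    addr = int(ipaddress.IPv4Address(address))
    # Build a 32-bit mask whose mask_length high-order bits are set.
    mask = (0xFFFFFFFF << (32 - mask_length)) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(addr & mask))

# The exemplary address space 128.0.10.* corresponds to a /24 subnet mask.
subnet = ipaddress.ip_network("128.0.10.0/24")
masked = apply_mask("128.0.10.2", 24)

print(subnet.netmask)             # 255.255.255.0
print(masked)                     # 128.0.10.0
print(subnet.num_addresses - 2)   # 254 usable values (0 and 255 reserved)
```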
As used herein, an “address prefix” is defined as the result of applying a subnet mask to a network address. An address prefix therefore specifies a range of network addresses in a subnetwork, and a /32 address prefix corresponds to a particular network address. For example, consider the address prefix 10.1.1.4/30. The first 30 bits of this prefix uniquely identify the subnetwork 10.1.1.4, and the remaining two least-significant bits of the prefix may be used to differentiate up to four different network nodes in the subnetwork. Accordingly, the prefix 10.1.1.4/30 includes the IP addresses 10.1.1.4, 10.1.1.5, 10.1.1.6 and 10.1.1.7. A “route” is defined herein as an address prefix and its associated path attributes. A path attribute is generally any property or characteristic that may be associated with the prefix, such as a cost metric, bandwidth constraint, next-hop identifier and so forth.
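The prefix arithmetic in the example above can be checked with a short Python sketch. It assumes only the standard ipaddress module; the variable names are illustrative.

```python
import ipaddress

# The address prefix from the example: 30 network bits, 2 remaining bits.
prefix = ipaddress.ip_network("10.1.1.4/30")

# Iterating over a network object yields every address in its range.
addresses = [str(a) for a in prefix]
print(addresses)  # ['10.1.1.4', '10.1.1.5', '10.1.1.6', '10.1.1.7']

# A /32 prefix corresponds to exactly one network address.
single = ipaddress.ip_network("10.1.1.4/32")
print(single.num_addresses)  # 1
```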
Two or more routes may be aggregated if (1) they are associated with a common set of path attributes and (2) either their prefixes correspond to contiguous ranges of network addresses or one prefix's range of addresses is a superset of the others'. For example, assume that the routes 128.52.10.0/24 and 128.52.10.5/30 are associated with the same path attributes. Since the route 128.52.10.0/24 includes every IP address within the route 128.52.10.5/30, the two routes may be aggregated as 128.52.10.0/24. By way of further example, the routes 128.52.10.0/25 and 128.52.10.128/25 respectively specify the contiguous ranges of IP addresses 128.52.10.0-127 and 128.52.10.128-255. Accordingly, these two routes may be aggregated as 128.52.10.0/24, which contains both routes' IP address ranges.
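A minimal sketch of the two aggregation conditions is given below, under stated assumptions: routes are modeled as (prefix, attributes) pairs, the attributes are any hashable value, and the function name aggregate_routes is hypothetical. The merging itself is delegated to the standard library's ipaddress.collapse_addresses, which merges adjacent ranges and drops ranges already contained in another. Because 128.52.10.5/30 has host bits set, it is parsed with strict=False, which normalizes it to 128.52.10.4/30.

```python
import ipaddress
from collections import defaultdict

def aggregate_routes(routes):
    """Aggregate (prefix, attributes) pairs that share identical attributes.

    Hypothetical helper for illustration. Condition (1) is enforced by
    grouping on the attributes; condition (2) is handled by
    ipaddress.collapse_addresses within each group.
    """
    by_attrs = defaultdict(list)
    for prefix, attrs in routes:
        by_attrs[attrs].append(ipaddress.ip_network(prefix, strict=False))

    aggregated = []
    for attrs, nets in by_attrs.items():
        for net in ipaddress.collapse_addresses(nets):
            aggregated.append((str(net), attrs))
    return aggregated

same = ("next-hop-A",)  # illustrative path attributes

# Superset case: the /24 already contains the /30.
superset_case = aggregate_routes([("128.52.10.0/24", same),
                                  ("128.52.10.5/30", same)])

# Contiguous case: two adjacent /25 ranges merge into one /24.
contiguous_case = aggregate_routes([("128.52.10.0/25", same),
                                    ("128.52.10.128/25", same)])

print(superset_case)    # [('128.52.10.0/24', ('next-hop-A',))]
print(contiguous_case)  # [('128.52.10.0/24', ('next-hop-A',))]
```

Routes whose path attributes differ fall into separate groups in this sketch and are therefore left unaggregated.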
Border Gateway Protocol
Border routers located at the logical edge of a network or subnetwork may be configured to exchange data with border routers in adjacent networks or subnetworks. The border routers typically execute inter-domain routing protocols (or “exterior” gateway routing protocols) to exchange routing and reachability information across network boundaries. An example of a common inter-domain routing protocol is the Border Gateway Protocol (BGP). The BGP protocol is well known and described in detail in Request For Comments (RFC) 1771 by Y. Rekhter and T. Li, entitled A Border Gateway Protocol 4 (BGP-4), dated March 1995, which is hereby incorporated by reference as though fully set forth herein. A variation of the BGP protocol, known as internal BGP (iBGP), is often used to distribute routing and reachability information between border routers located within the same network or subnetwork. To implement iBGP, the border routers must be “fully meshed,” such that each border router is coupled to every other border router, e.g., by way of a Transmission Control Protocol (TCP) connection.
BGP-enabled border routers perform various routing functions, including transmitting and receiving BGP messages and rendering routing decisions based on BGP routing policies. Each border router maintains a local BGP routing table that lists feasible routes to reachable (i.e., accessible) network nodes and subnetworks. Periodic refreshing of the BGP routing table is generally not performed. However, the BGP-enabled border routers do exchange routing information under certain circumstances. For example, when a BGP router initially connects to the network, the router receives the entire contents of the BGP routing tables of its peers, i.e., its adjacent border routers. Thereafter, when the contents of a border router's BGP table change, the router transmits only the changed portions of its BGP table to its peers, which in turn update their local BGP tables. A BGP update message is thus an incremental update message sent in response to changes to the contents of the BGP routing table. Routing updates provided by the BGP update messages allow a set of interconnected border routers to construct a consistent view of the network topology. BGP update messages are typically sent using a reliable transport protocol, such as TCP, to ensure their reliable delivery.
Each BGP update message includes network layer reachability information (NLRI) that specifies a list of address prefixes whose reachability information has changed. The BGP update message also may include one or more BGP attributes that are associated with the NLRI address prefixes. For instance, the update message may include a “Next Hop” attribute to indicate which border router should be used as the next hop to reach the address prefixes listed in the NLRI. Conventional BGP attributes and their formats are generally well known and are described in more detail in Chapter 6 of the reference book entitled IP Switching and Routing Essentials, by Stephen A. Thomas, published 2002, which is hereby incorporated by reference in its entirety. Together, the NLRI prefixes and their associated BGP attributes comprise a set of BGP routes whose reachability information has changed.
BGP update messages may include one or more BGP community attributes or extended community attributes. As defined in RFC 1997, entitled BGP Communities Attribute, by R. Chandra et al., published August 1996, which is hereby incorporated by reference in its entirety, a BGP community is a group of destinations which share a common property. By default, all routes belong to an Internet community. RFC 1997 also defines other types of BGP communities, such as the “no_export” and “no_advertise” communities. The no_export community identifies a set of routes that may be advertised only within a single network or subnetwork and are not permitted to be advertised outside of that network or subnetwork. The no_advertise community is associated with routes that should not be advertised at all.
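The advertisement rules for these two communities can be sketched as follows. The numeric values are the well-known community values reserved in RFC 1997; the function name may_advertise and the peer_is_external flag are hypothetical and simplify the decision to a single “inside or outside this network” test.

```python
# Well-known community values reserved in RFC 1997.
NO_EXPORT = 0xFFFFFF01
NO_ADVERTISE = 0xFFFFFF02

def may_advertise(communities, peer_is_external):
    """Return True if a route carrying these communities may be sent to a peer.

    Hypothetical helper for illustration: peer_is_external is True when
    the peer lies outside the local network or subnetwork.
    """
    if NO_ADVERTISE in communities:
        return False  # never advertised to any peer
    if NO_EXPORT in communities and peer_is_external:
        return False  # may be advertised only inside the local network
    return True

print(may_advertise({NO_EXPORT}, peer_is_external=False))    # True
print(may_advertise({NO_EXPORT}, peer_is_external=True))     # False
print(may_advertise({NO_ADVERTISE}, peer_is_external=False)) # False
```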
BGP extended community attributes provide added flexibility over existing BGP community attributes. In particular, BGP extended communities typically include a “type” field that may be used to differentiate additional types of BGP communities beyond those already supported by the conventional BGP community attribute. The “IPv4-address-specific” extended community attribute is one example of a BGP extended community attribute. Specifically, the IPv4-address-specific extended community attribute comprises a type field, a subtype field, a global administrator field and a local administrator field, as described in more detail in the Internet Engineering Task Force (IETF) publication “draft-ietf-idr-bgp-ext-communities-07.txt,” entitled BGP Extended Communities Attribute, by Sangli et al., published September 2004, which is hereby incorporated by reference as though fully set forth herein.
Route Aggregation in Multi-Homed Networks
As used herein, a “multi-homed” network is any network or subnetwork that is directly connected to more than one adjacent network or subnetwork. For instance, a customer site (network) may be multi-homed to primary and secondary Internet service providers (ISP). Both the primary and secondary ISPs provide access to an Internet “backbone,” i.e., a high-bandwidth, wide-area network that is configured to transport data between remote networks and subnetworks. In this arrangement, the primary ISP functions as the preferred service provider for the customer site, and the secondary ISP functions as a backup service provider. That is, incoming and outgoing network traffic between the customer site and the Internet backbone is preferably routed through the primary ISP. The secondary ISP provides the customer site with access to the Internet backbone in the event that the primary ISP fails, e.g., due to the primary ISP losing connectivity with the Internet backbone and/or the customer site. In response to such a failure, the secondary ISP then becomes the customer site's preferred path for incoming and outgoing network traffic.
FIG. 1 illustrates an exemplary multi-homed computer network 100 in which route aggregation may be employed. The network 100 includes a backbone network 110 that is coupled to both a primary ISP 120 and a secondary ISP 130, which in turn are both coupled to a multi-homed customer site 140. The primary ISP also may be coupled to other customer sites, such as customer sites 150 and 160. The primary ISP may allocate a block of IP addresses for each of its neighboring customer sites. As shown, the primary ISP allocates IP addresses in the range 10.1.1.0/24 to the customer site 140, 10.1.2.0/24 to the customer site 150 and 10.1.3.0/24 to the customer site 160.
Although the primary ISP allocates different IP address ranges for each of its neighboring customer sites, the primary ISP may aggregate these allocated IP address ranges as a single aggregated prefix. For instance, in this example, the primary ISP aggregates the “more specific” (i.e., having longer subnet mask lengths) IP address ranges 10.1.1.0/24, 10.1.2.0/24 and 10.1.3.0/24 as a single aggregated prefix 10.1.0.0/16. By aggregating the prefixes in this manner, the primary ISP may advertise a single aggregated route to the backbone network 110, rather than advertising a separate route for each customer site 140-160. In this way, the primary ISP notifies network nodes in the backbone network that any IP address in the aggregated range 10.1.0.0/16 can be reached through the primary ISP. Accordingly, the primary ISP advertises fewer routes to the backbone network 110, thereby reducing the number of routes that network nodes in the backbone network have to store in their BGP tables. As a result, the network nodes in the backbone network can search fewer BGP routes in their tables and thus perform faster packet-forwarding operations.
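As a sanity check on the example above, the following sketch confirms that each customer allocation falls within the aggregated prefix 10.1.0.0/16. It assumes Python 3.7 or later for IPv4Network.subnet_of; the dictionary keys simply reuse the reference numerals of FIG. 1.

```python
import ipaddress

aggregate = ipaddress.ip_network("10.1.0.0/16")

# Allocations made by the primary ISP in the example of FIG. 1.
allocations = {
    "customer site 140": ipaddress.ip_network("10.1.1.0/24"),
    "customer site 150": ipaddress.ip_network("10.1.2.0/24"),
    "customer site 160": ipaddress.ip_network("10.1.3.0/24"),
}

# Each more-specific /24 must lie inside the less-specific /16.
covered = {site: net.subnet_of(aggregate) for site, net in allocations.items()}
print(covered)

# The /16 spans many more /24 blocks than the three allocated here.
blocks_in_aggregate = len(list(aggregate.subnets(new_prefix=24)))
print(blocks_in_aggregate)  # 256
```

Note that the aggregated prefix in this example is a covering prefix: it spans 256 blocks of size /24, of which only three are allocated to the customer sites shown.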
After the multi-homed customer site 140 receives its allocated block of IP addresses from the primary ISP 120, the customer site advertises its allocated IP addresses to the secondary ISP 130. For instance, the customer site 140 may send the secondary ISP a BGP update message containing the customer's allocated prefix 10.1.1.0/24. In response to receiving the customer's allocated IP address range, the secondary ISP typically advertises the customer's route to the backbone network 110. In this way, the secondary ISP notifies network nodes in the backbone network that IP addresses in the customer's allocated range of IP addresses may be reached through the secondary ISP.
Problems often arise in this conventional multi-homed topology. Specifically, at least some BGP-enabled border routers in the backbone network 110 may receive both the aggregated route advertised by the primary ISP and the multi-homed customer site's specific route advertised by the secondary ISP. Because border routers conventionally employ longest prefix-matching algorithms to select the “best paths” for routing network traffic, the border routers will direct the customer site's inbound network traffic through the secondary ISP 130 rather than through the customer's preferred primary ISP 120. In other words, network traffic addressed to a destination IP address in the range of 10.1.1.0/24 will “match” the more-specific route advertised by the secondary ISP instead of the less-specific aggregated route advertised by the primary ISP. Consequently, the primary ISP's intended route aggregation is “broken.” That is, network nodes in the backbone network may have to store more than one BGP table entry for IP address ranges within the aggregated route 10.1.0.0/16, i.e., they may store a first BGP table entry for the aggregated route and a second table entry for the more-specific route 10.1.1.0/24 within the aggregated route. In addition, although the multi-homed customer site 140 can forward its outgoing traffic through the primary ISP 120, as intended, its incoming network traffic will be routed through the secondary ISP 130 due to the conventional longest prefix-matching algorithms in the backbone network 110. This results in an undesired asymmetric network traffic pattern at the customer site 140.
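The effect of longest prefix matching described above can be reproduced with a small sketch. The table layout and the function name best_route are hypothetical simplifications; an actual BGP best-path computation considers further attributes that are omitted here.

```python
import ipaddress

def best_route(table, destination):
    """Return the (prefix, next_hop) entry with the longest matching prefix.

    Hypothetical helper for illustration; returns None if nothing matches.
    """
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table if dest in net]
    if not matches:
        return None
    return max(matches, key=lambda entry: entry[0].prefixlen)

# A backbone router that has received both advertisements.
table = [
    (ipaddress.ip_network("10.1.0.0/16"), "primary ISP 120"),    # aggregated route
    (ipaddress.ip_network("10.1.1.0/24"), "secondary ISP 130"),  # more-specific route
]

# Traffic for customer site 140 matches the /24 and bypasses the primary ISP.
print(best_route(table, "10.1.1.7")[1])  # secondary ISP 130

# Traffic for customer site 150 matches only the aggregated /16.
print(best_route(table, "10.1.2.7")[1])  # primary ISP 120
```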
One solution to the above-noted problems has been implemented at the multi-homed customer site 140. According to this solution, border routers in the customer site do not advertise the customer site's allocated range of IP addresses to the secondary ISP 130 if they are aware that the primary ISP 120 is already advertising an aggregated route including the customer site's allocated IP addresses. In this way, the secondary ISP never receives the customer site's set of allocated IP addresses and therefore cannot break the primary ISP's route aggregation. The customer site may become aware of the primary ISP's aggregated route by receiving a BGP update message containing the aggregated route from the primary ISP. Later, if the customer site's border routers lose connectivity with the primary ISP, e.g., due to a failed data link between the customer site and the primary ISP, the customer site's border routers may advertise the customer site's set of allocated IP addresses to the secondary ISP 130. Thereafter, the secondary ISP can advertise the customer site's non-aggregated route (e.g., 10.1.1.0/24) to the backbone network 110 so as to redirect the customer site's incoming network traffic through the secondary ISP. While this solution is effective in the limited case where the customer site 140 loses communication with the primary ISP 120, the solution does not address the situation where the primary ISP 120 loses connectivity with the backbone network 110 yet continues to advertise its aggregated route.
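The customer-site decision described above might be sketched as follows. This is a simplified model under stated assumptions: the function name and its parameters are hypothetical, and loss of connectivity with the primary ISP is reduced to a single boolean flag.

```python
import ipaddress

def should_advertise_to_secondary(allocated, learned_from_primary, primary_link_up):
    """Decide whether the customer site advertises its prefix to the secondary ISP.

    Hypothetical helper for illustration. learned_from_primary is the list
    of prefixes received from the primary ISP in BGP update messages.
    """
    if not primary_link_up:
        return True  # fail over: advertise the specific route via the secondary ISP
    site = ipaddress.ip_network(allocated)
    covered = any(site.subnet_of(ipaddress.ip_network(p))
                  for p in learned_from_primary)
    # Suppress the advertisement while the primary ISP's aggregate covers the site.
    return not covered

print(should_advertise_to_secondary("10.1.1.0/24", ["10.1.0.0/16"], True))   # False
print(should_advertise_to_secondary("10.1.1.0/24", ["10.1.0.0/16"], False))  # True
```

As the sketch makes plain, the decision depends only on what the customer site can observe locally, which is why it cannot react when the primary ISP loses connectivity with the backbone network yet continues to advertise its aggregated route.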
Another possible solution for employing route aggregation in multi-homed networks is described in RFC 1998, entitled An Application of the BGP Community Attribute in Multi-home Routing, by Chen et al., dated August 1996, which is hereby incorporated by reference as though fully set forth herein. This solution associates BGP routes with “local preference” attributes, whereby a local preference value indicates a relative preference for selecting a particular address prefix in a BGP best-path computation. This solution also suffers various disadvantages. For instance, all networks and subnetworks need to be configured to understand the predetermined local preference values. Such large-scale configuration is impractical over the Internet, which consists of a large number of independently managed networks and subnetworks. Further, the solution is limited to “square” topologies as described in RFC 1998. Accordingly, the local-preference solution has limited use.
What is therefore needed is a new way of implementing route aggregation in multi-homed topologies without breaking the route aggregation, without requiring special customer-site configuration, and without having to configure a large number of networks and subnetworks. The technique also should minimize asymmetric traffic patterns at a multi-homed customer site.