1. Field of the Invention
The present invention relates to computer networks and more particularly to the control and management of prefixes in a computer network.
2. Background Information
A computer network is a geographically distributed collection of nodes interconnected by communication links and segments for transporting data between end nodes, such as personal computers and workstations (“hosts”). Many types of networks are available, with the types ranging from local area networks (LANs) to wide area networks (WANs). LANs typically connect the nodes over dedicated private communications links located in the same general physical location, such as a building or campus. WANs, on the other hand, typically connect geographically dispersed nodes over long-distance communications links, such as common carrier telephone lines, optical lightpaths, synchronous optical networks (SONET), or synchronous digital hierarchy (SDH) links. The Internet is an example of a WAN that connects disparate networks throughout the world, providing global communication between nodes on various networks. The nodes typically communicate over the network by exchanging discrete frames or packets of data according to predefined protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP). In this context, a protocol consists of a set of rules defining how the nodes interact with each other. Computer networks may be further interconnected by an intermediate network node, such as a router, to extend the effective “size” of each network.
Since management of interconnected computer networks can prove burdensome, smaller groups of computer networks may be maintained as routing domains or autonomous systems. The networks within an autonomous system (AS) are typically coupled together by conventional “intradomain” routers configured to execute intradomain routing protocols, and are generally subject to a common authority. To improve routing scalability, a service provider (e.g., an ISP) may divide an AS into multiple “areas.” It may be desirable, however, to increase the number of nodes capable of exchanging data; in this case, interdomain routers executing interdomain routing protocols are used to interconnect nodes of the various ASes. Moreover, it may be desirable to interconnect various ASes that operate under different administrative domains. As used herein, an AS or an area is generally referred to as a “domain,” and a node that interconnects different domains together is generally referred to as a “border node” or “border router.” In general, the autonomous system may be an enterprise network, a service provider or any other network or subnetwork. Furthermore, the autonomous system may be multi-homed, i.e., comprising a plurality of different peer (neighboring) connections to one or more other routing domains or autonomous systems.
The administrative entity of an AS typically configures network nodes within the AS to route packets using predetermined intradomain routing protocols, or interior gateway protocols (IGPs), such as conventional link-state protocols and distance-vector protocols. These IGPs define the manner with which routing information and network-topology information is exchanged and processed in the AS. Examples of link-state and distance-vectors protocols known in the art are described in Sections 12.1-12.3 of the reference book entitled Interconnections, Second Edition, by Radia Perlman, published January 2000, which is hereby incorporated by reference as though fully set forth herein.
Link-state protocols, such as the Open Shortest Path First (OSPF) protocol, use cost-based routing metrics to determine how data packets are routed in an AS. As understood in the art, a relative cost value may be associated with a network node to determine the relative ease/burden of communicating with that node. For instance, the cost value may be measured in terms of the average time for a data packet to reach the node, the amount of available bandwidth over a communication link coupled to the node, the monetary cost per amount of bandwidth, etc. Network nodes in the AS generate a set of cost values associated with their neighboring nodes. Each set of cost values is then “advertised” (flooded) to the other interconnected nodes. Using the advertised cost values, each node can generate a consistent “view” of the network topology, thereby enabling the nodes to determine lowest-cost routes within the AS.
Distance-vector protocols, such as the Interior Gateway Routing Protocol (IGRP) or Routing Information Protocol (RIP), use distance-based routing metrics to determine how data packets are routed in an AS. A network node may associate a distance metric with each of its interconnected nodes in the AS. For example, the distance metric may be based on, e.g., a number of hops between a pair of nodes or an actual distance separating the nodes. Operationally, the network nodes determine distances to reachable nodes in the AS and communicate these distance metrics to their neighboring nodes. Each neighboring node augments the received set of distance metrics with its own distance measurements and forwards the augmented set of metrics to its neighbors. This process is continued until each node receives a consistent view of the network topology.
An intermediate network node often stores its routing information in a routing table maintained and managed by a routing information base (RIB). The routing table is a searchable data structure in which network addresses are mapped to their associated routing information. However, those skilled in the art will understand that the routing table need not be organized as a table, and alternatively may be another type of searchable data structure. Although the intermediate network node's routing table may be configured with a predetermined set of routing information, the node also may dynamically acquire (“learn”) network routing information as it sends and receives data packets. When a packet is received at the intermediate network node, the packet's destination address may be used to identify a routing table entry containing routing information associated with the received packet. Among other things, the packet's routing information indicates the packet's next-hop address.
A plurality of interconnected ASes may be configured to exchange routing and reachability information among neighboring interdomain routers of the systems in accordance with a predetermined external gateway protocol, such as the Border Gateway Protocol (BGP). The BGP protocol is well known and generally described in Request for Comments (RFC) 1771, entitled A Border Gateway Protocol 4 (BGP-4), published March 1995, which is hereby incorporated by reference in its entirety. An adjacency is a relationship formed between selected neighboring (peer) routers for the purpose of exchanging routing information messages and abstracting the network topology. The routing information exchanged by BGP peer routers typically includes destination address prefixes, i.e., the portions of destination addresses used by the routing protocol to render routing (“next hop”) decisions. Examples of such destination addresses include IP version 4 (IPv4) and version 6 (IPv6) addresses. BGP generally operates over a reliable transport protocol, such as TCP, to establish a TCP connection/session. To implement the BGP protocol, each AS includes at least one border node through which it communicates with other, interconnected ASes. Because data packets enter and exit the AS through the border node, the border node is said to be located at the “edge” of the AS.
The BGP protocol generally facilitates policy-based routing in which an administrative entity places restrictions on inter-AS routing operations. For example, the administrator of a company's AS may employ a BGP routing policy where network traffic leaving the AS is not permitted to enter a competitor's network, even if the competitor provides an otherwise acceptable routing path. BGP policies typically do not depend on the cost-based or distance-based routing metrics used with interior gateway protocols. Instead, the BGP policies rely on AS path-vector information. More specifically, the BGP protocol enables a plurality of interconnected ASes to exchange network topology information. Using this topology information, each AS can derive “paths” to the other reachable ASes, each path defining a logical sequence of ASes. For example, a path between an AS1 and an AS3 may be represented by the sequence {AS1, AS2, AS3} when only AS2 intervenes. Based on the content of these AS sequences, the BGP protocol may filter those paths that do not coincide with the administrator's policies. As such, inter-AS routing operations are performed using only the “best paths” that satisfy the BGP policies.
Because BGP policies are applied to sequences of ASes, the policies are not able to optimize inter-AS routing in other respects, such as optimizing bandwidth utilization or minimizing cost or distance metrics. Furthermore, interior gateway protocols cannot remedy these deficiencies in the BGP protocol because they do not scale well when applied to a large number of network nodes spanning multiple ASes. For instance, the process of exchanging cost-based or distance-based routing metrics among a large number of network nodes would not only consume an unreasonable amount of network bandwidth, but also would consume an unacceptable amount of processing resources for processing those metrics to generate a convergent view of the network topology.
To address the limitations of conventional routing protocols, network administrators sometimes implement additional optimizations to improve network performance. For example, a load-balancing or cost-minimizing procedure may be used in conjunction with traditional routing protocols to redistribute data flows entering or exiting a multi-homed routing domain or AS. In some networks, border nodes located at edges of ASes, e.g., between an enterprise network and one or more Internet Service Providers (ISPs), may be configured as Optimized Edge Routers (OERs). Here each OER may be configured to periodically select an Optimal Exit Link (OEL) to each ISP for a given destination prefix (a monitored and/or controlled prefix) based on performance, load, cost, and service level agreements (SLAs) associated with connections to the ISP. Notably, a prefix, as defined generally herein, refers to a subset of nodes within the computer network. Ultimately, the end result for the enterprise network is improved Internet performance, better load distribution, and/or lower costs for Internet connections. These additional procedures may require the border nodes (OERs) to collect various network statistics associated with the data flows. An exemplary software application that may be used to collect the network statistics at the border nodes is NetFlow™ by Cisco Systems, Incorporated, which is described in more detail in the technical paper entitled Netflow Services Solutions Guide, published September 2002, and is hereby incorporated by reference as though fully set forth herein.
Techniques that may be used to select the OEL for the monitored prefix include passive monitoring and/or active probing. Passive monitoring relies on gathering information from OERs learned from monitoring conventional user traffic, such as throughput, timing, latency, packet loss, reachability, etc. For example, selected interfaces at one or more network nodes monitor incoming and outgoing data flows and collect various statistics for the monitored flows. Notably, interfaces may include physical interfaces, such as a port on a network interface card, and/or logical interfaces, such as virtual private networks (VPN) implemented over multiple physical interfaces. Each node stores address prefixes and statistics for the monitored data flows, which may be periodically exported to a central management node (e.g., a “collector” or “Master”). The central management node is configured to receive prefixes and statistics (e.g., for those prefixes) from a plurality of different network nodes. A record format that may be used to export the raw prefixes and statistics is described in the technical paper entitled Netflow v9 Export Format, which is hereby incorporated by reference in its entirety. Further, a more sophisticated interaction (e.g., a filtered and/or pre-processed information exchange) between border nodes and a Master node is described in commonly owned copending U.S. patent application Ser. No. 10/980,550, now issued as U.S. Pat. No. 8,073,968, entitled METHOD AND APPARATUS FOR AUTOMATICALLY OPTIMIZING ROUTING OPERATIONS AT THE EDGE OF A NETWORK, filed by Shah et al. on Nov. 3, 2004, the contents of which are hereby incorporated in its entirety.
Active probing, on the other hand, relies on probe packets to measure various performance parameters associated with accessing the monitored prefix from an originating node (source). Here, the originating node may generate multiple probe packets that are then forwarded via different exit interfaces (e.g., data links) on different paths to target nodes (targets) in the monitored (destination) prefix. Upon receiving the probe packets, the targets respond to the originating node, e.g., with return packets or other known probe responses. The originating node may eventually acquire the responses and use them to measure various parameters, such as delay, loss, jitter, and reachability, etc., associated with accessing the destination prefix via the different links.
Once the relevant statistics are obtained (e.g., at the central management node), the collected parametric (performance) information (i.e., learned from passive monitoring or active probing) is analyzed, such as either manually by a network administrator or dynamically by a software script. The analyzed information may then be used to select an OEL from among the different exits that may be used to reach the destination prefix, and/or to determine whether the data flows may be more optimally distributed. For instance, suppose an administrator desires to make more efficient use of available network bandwidth and determines that a first network interface is under-utilized and a second interface is oversubscribed. In this case, at least some data flows at the second interface may be redirected to the first interface. To effectuate such a routing change, the administrator (or software process) may, for example, make static changes to the routing tables at the first and second interfaces or may re-assign local-preference values (or other priority values) associated with the data flows.
The selection of an OEL or best path (e.g., for a particular prefix) is generally based on one or more policies. As defined herein, a policy is any defined rule that determines the use of resources within the network. A policy may be based on a user, a device, a subnetwork, a network, or an application. For example, a router may be configured with a policy defined to route traffic destined for a particular prefix over a best path having the shortest hop count to the prefix. Alternatively, the policy may be defined to route traffic from a type of application over a best path based on the shortest delay or round trip time (RTT). Those skilled in the art will understand that other policies may be defined, such as, e.g., reachability, lowest packet loss, best mean opinion score (MOS), which provides a numerical measure of the quality of human speech at the destination end of the circuit (e.g., for Voice over IP, or VoIP), bandwidth, utilization, etc. Also, policies may be defined to select a best exit based on cost. For example, a cost minimization policy technique is described in commonly-owned copending U.S. patent application Ser. No. 10/631,682, now issued as U.S. Pat. No. 7,257,560, entitled COST MINIMIZATION OF SERVICES PROVIDED BY MULTIPLE SERVICE PROVIDERS, filed by Jacobs et al. on Jul. 31, 2003, the contents of which are hereby incorporated in its entirety.
In addition to defining rules used to select a best path, however, policies may also be defined to govern performance characteristics for a particular prefix. Once a best path has been selected, it is important to verify that the path maintains acceptable performance characteristics, and that the current path is still, in fact, the best path. For instance, while a certain performance characteristic for a particular prefix conforms to the defined policy (i.e., over the current path), the prefix is considered to be “in-policy,” and traffic remains on the current (best) path. These policies often take the form of an upper (or lower) threshold on a particular performance characteristic that should not be surpassed. For example, in the case of voice traffic (e.g., voice over IP, or VoIP), a policy may be defined indicating that the RTT should be less than 50 milliseconds (ms). If the measured RTT is, e.g., 40 ms, the prefix is considered to be in-policy. In the event, however, the performance characteristic for a particular prefix does not conform to the defined policy (e.g., 60 ms), the prefix is considered to be “out-of-policy” (OOP), and the node may be required to select an alternate path.
One example policy that may be applied to links (or paths) is a link utilization threshold policy, as described in commonly-owned copending U.S. patent application Ser. No. 11/337,217, entitled LINK POLICY ROUTING BASED ON LINK UTILIZATION, filed by Patel et al. on even date herewith, the contents of which are hereby incorporated in its entirety. A link utilization threshold policy (“link policy”) may be used to define a threshold on the amount of traffic (traffic load) one or more links may carry. Also, a link policy may define a range among a plurality of links, where each of the links must maintain a traffic load that is within a certain percentage of the traffic load for the other links (e.g., for load balancing).
Generally, links are fixed, in that they have a certain capability (e.g., bandwidth, cost, delay, etc.) and connectivity (e.g., physical connection from a first node to a second node). Because of this, when a link goes OOP, one available solution is to redirect traffic traversing that OOP link onto one or more other links that are currently in-policy. Once certain traffic, e.g., to one or more prefixes, has been redirected from the OOP link, the link may become in-policy with less traffic. One problem associated with redirecting prefixes between links is that occasionally the available prefixes are larger (e.g., utilize more bandwidth) than necessary to effectively (optimally) redirect traffic from a link to bring it in-policy. For instance, 10 Kilobytes per second (KBps) of traffic may need to be redirected from a link for policy reasons, but the smallest existing prefix in the routing table may utilize 20 KBps. Redirecting the entire 20 KBps prefix may be inefficient and/or sub-optimal when only 10 KBps needs to be redirected.
In addition to policy-based reasons, system administrators may desire more granular prefix control for other purposes, as will be understood by those skilled in the art. Typically, the system administrator may manually configure smaller prefixes, but this process may be cumbersome and prone to errors. These smaller prefixes may not be configured optimally, such as based on real time traffic characteristics, but instead may be configured based on arbitrary boundaries decided by the system administrator, such as a number of nodes within the prefix. For instance, certain nodes within a prefix may generate more traffic utilization than other nodes, but those nodes may all fall within one of the smaller manually configured prefixes, thus not providing the “granular” control desired.