1. Field of the Invention
This invention relates to the field of networking. In particular, the invention relates to measurements of network performance and optimization of network routing.
2. Description of the Related Art
The Internet today comprises groups of networks, each run independently. Networks that build their own routing view of the Internet by connecting to multiple Internet Service Providers (ISPs) are called Autonomous Systems (ASes), and each is assigned a unique 16-bit AS number. These networks exchange routing information about their Internet connectivity using the Border Gateway Protocol version 4 (BGP4).
As the Internet has grown, service providers have stratified into several categories. The trend has been toward three main categories: National/International Tier One ISP, Content Provider/Content Hoster, and User Access Provider. This specialization has exposed some weaknesses in the current system of distributed Internet decision-making.
National/International Tier One Internet Service Providers (ISPs)
As the Internet has evolved, fewer large Tier One providers have been able to compete by rebuilding their networks with newer high-speed routers, transmission gear, and access to dark fiber. As a result, the differences among the top providers have decreased, and the views of the Internet that they pass to multi-homed customers are very similar. Many routing decisions result from breaking ties or from enforcement of routing policy by the customers, who attempt to balance load across available capacity. End customers therefore make decisions based on information that is becoming less and less differentiated and that does not factor in the performance of the available paths.
The current state of Tier One ISPs evinces a need for an automated process for generating routing decisions for such ISPs, based on up-to-date information on the performance of alternative paths.
Content Providers/Content Hosters
Content Providers need to multi-home in order to provide reliable, high-quality access to the Internet. Because it is very difficult and expensive to connect directly with the User Access Providers, connectivity to deliver content needs to be built by connecting to the large Tier One providers as well as by making localized policy decisions. Consequently, there is almost no coordination between Content Providers and their customers, who get their connectivity through User Access Providers.
Content Providers also have the problem of delivering the majority of traffic from their site to their user base. They need to decide which of their directly connected ISPs can best deliver traffic for any particular user group. These decisions are usually made by allowing BGP to choose the “best route” (fewest network hops) and subsequently applying a local policy to hand-tune the selections for particular destinations of interest. This, however, is not an automated process. Proactively, destinations with large or important user groups are forced over paths that seem to provide better service; reactively, customers that complain are examined to see whether a switch in paths could provide better service. Issues of local outbound capacity with particular ISP connections often require traffic to be shifted off even BGP “best route” paths, regardless of performance, in order to ease local congestion.
As such, there is a need for an automated process for selecting best routes for Content Providers to connect to particular user groups.
User Access Providers
User Access Providers need to manage inbound traffic to their user base from Content Providers. Today there are very poor mechanisms to control inbound traffic. Several mechanisms are in use today: Traffic Engineering (upgrading and purchasing new links from the “correct” ISPs); BGP padding to make a particular inbound path look bad to the entire Internet; and specific advertisements of small portions of a User Access Provider's address space out different ISP links.
Each of these methods has operational problems. Traffic engineering requires an analysis at a particular time to select which ISPs should be used for future best cost and performance access to the Internet. ISP service availability and local circuit installation often require very long lead times. Service availability can often take 6 months or more, and contract terms are typically 1–3 years. By the time an ordered ISP service is available, the initial analysis may no longer be valid. Traffic flows may have changed and the performance of the ISP may have changed significantly.
BGP padding is a method of attempting to influence traffic flows in the Internet by artificially making a path look longer (more AS network hops). The User Access Provider will advertise its own connectivity to an ISP by appending its AS number multiple times in the BGP path. Networks making BGP “best route” decisions will see this path as lengthy and will be more likely to choose another available, shorter path. This has the effect of reducing inbound traffic to the User Access Provider over the “padded” path. There are, however, several difficulties with this method. For instance, because the “padded” path is communicated to the entire Internet, there is no way to communicate a desired inbound traffic policy to a particular traffic source. As a result, significant amounts of traffic are shifted away from the padded path; that is, BGP padding allows very little granularity in how much traffic can be influenced to change paths. The results become even less granular as the number of Tier One ISPs decreases and the remaining ISPs become less differentiated. There is also no way of communicating why the change is being requested and no way to take performance into account when a traffic source using BGP “best route” decisions receives a longer AS path. As a normal operational procedure, a User Access Provider will “pad” a particular path and observe an initial shift in traffic. This initial traffic change may not be permanent. The traffic may take several days to settle as other networks and traffic sources manually adjust their policies in reaction to the change in traffic flows.
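The effect of padding on a remote network's choice can be illustrated with a minimal Python sketch. It assumes, as a simplification, that the remote network selects purely on AS path length and breaks ties by order of receipt; an actual BGP implementation applies further criteria. The AS numbers, the `Route` type, and the helper names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

# Hypothetical AS numbers taken from the range reserved for documentation.
ORIGIN_AS = 64500  # the User Access Provider
ISP1_AS = 64496
ISP2_AS = 64497


@dataclass(frozen=True)
class Route:
    via: str                  # ISP through which the path is reached
    as_path: Tuple[int, ...]  # AS numbers, nearest AS first


def advertised_path(isp_as: int, origin_as: int, pad: int = 0) -> Tuple[int, ...]:
    """AS path seen by a remote network; `pad` extra copies of origin_as are added."""
    return (isp_as,) + (origin_as,) * (1 + pad)


def best_route(routes: Sequence[Route]) -> Route:
    """Simplified selection: fewest AS hops wins, first route wins a tie."""
    return min(routes, key=lambda r: len(r.as_path))


unpadded = [
    Route("ISP1", advertised_path(ISP1_AS, ORIGIN_AS)),
    Route("ISP2", advertised_path(ISP2_AS, ORIGIN_AS)),
]
padded = [
    Route("ISP1", advertised_path(ISP1_AS, ORIGIN_AS, pad=2)),
    Route("ISP2", advertised_path(ISP2_AS, ORIGIN_AS)),
]

print(best_route(unpadded).via)  # ISP1 (equal lengths, tie broken by order)
print(best_route(padded).via)    # ISP2 (padded ISP1 path is now longer)
```

Under this simplified model every remote network that compares the two paths reaches the same conclusion, which is why padding tends to shift traffic in large, coarse steps rather than in controlled amounts.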
Another mechanism used by User Access Providers to influence inbound traffic is to use more specific IP route advertisements of their total address space. The Internet has incorporated CIDR (Classless Inter-Domain Routing) into both its routing and forwarding decision-making. CIDR is a mechanism that allows multiple routes that are viable for a particular destination to be present within the Internet. The path that is selected to forward traffic is the route with the most specific match of the destination IP address (longest match).
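Longest-match selection can be sketched in a few lines of Python using the standard `ipaddress` module. The prefixes, the ISP labels, and the `longest_match` helper are hypothetical; private address space is used only so that the example does not refer to any real network.

```python
import ipaddress
from typing import List, Optional, Tuple

Entry = Tuple[ipaddress.IPv4Network, str]

# Hypothetical route table at a remote network: a covering route learned
# via ISP1 and a more specific route learned via ISP2.
ROUTES: List[Entry] = [
    (ipaddress.ip_network("10.16.0.0/16"), "ISP1"),
    (ipaddress.ip_network("10.16.32.0/20"), "ISP2"),
]


def longest_match(routes: List[Entry], destination: str) -> Optional[Entry]:
    """Return the matching route with the longest prefix, or None if no route matches."""
    dest = ipaddress.ip_address(destination)
    candidates = [(net, hop) for net, hop in routes if dest in net]
    if not candidates:
        return None
    return max(candidates, key=lambda entry: entry[0].prefixlen)


print(longest_match(ROUTES, "10.16.40.1"))   # matched by both routes; the /20 via ISP2 wins
print(longest_match(ROUTES, "10.16.200.1"))  # matched only by the /16 via ISP1
print(longest_match(ROUTES, "192.0.2.1"))    # None
```

A linear scan is used here for clarity; forwarding hardware and software typically use a trie or similar structure to perform the same lookup.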
An example of this type of inbound policy is a User Access Provider that has two links, Link1 and Link2, each of which communicates with a different ISP, ISP1 and ISP2, respectively. Ordinarily, the advertisements to ISP1 and ISP2 are identical. However, if there is more inbound traffic than can be handled on Link1 associated with ISP1, and there is available capacity on Link2 associated with ISP2, an inbound policy needs to be implemented to shift some traffic. Often a BGP padding policy that artificially increases the network distance associated with ISP1 will cause a significant amount of traffic to shift to ISP2 and Link2. This may be more traffic than Link2 can carry and may require the policy to be removed. To get finer granularity in the amount of traffic that is shifted, a more specific route advertisement is added to the ISP2 advertisement. This will cause traffic for a subset of the User Access Provider's customers to prefer ISP2 and Link2 inbound from the Internet.
Although more control can be achieved over inbound traffic using this approach, it causes several problems. Management of the infrastructure is complicated because different groups of customers will have different performance and paths as a result of the fragmented policy. The approach also increases the size of the global Internet route table by requiring external networks to carry extra routes to implement inbound policies. Many network infrastructures will ignore specific route advertisements that are “too small”. Currently “too small” is an advertisement of a network route capable of addressing 4,096 hosts (/20 CIDR route). As such, this method will not provide fine-grained control for providers with small amounts of address space. Additionally, as is the case with all the inbound solutions, end-to-end performance cannot be taken into account when shifting some flows from Link1 to Link2.
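The arithmetic behind the /20 figure, and the way a prefix-length filter limits granularity, can be sketched as follows. The sketch assumes that a /20 itself is still accepted and that only longer (more specific) prefixes are ignored; the text leaves this boundary case open. The `MAX_PREFIX_LEN` constant, the helper names, and the prefixes are hypothetical.

```python
import ipaddress

# Assumption: advertisements more specific than this prefix length are ignored.
MAX_PREFIX_LEN = 20


def address_count(prefix: str) -> int:
    """Number of addresses covered by an IPv4 prefix: 2 ** (32 - prefix length)."""
    return ipaddress.ip_network(prefix).num_addresses


def accepted(prefix: str, max_len: int = MAX_PREFIX_LEN) -> bool:
    """True if a filtering network would carry this advertisement."""
    return ipaddress.ip_network(prefix).prefixlen <= max_len


print(address_count("10.16.32.0/20"))  # 4096
print(accepted("10.16.32.0/20"))       # True
print(accepted("10.16.32.0/24"))       # False

# A provider holding a single /19 can split it into only two acceptable pieces.
pieces = list(ipaddress.ip_network("10.16.32.0/19").subnets(new_prefix=20))
print([str(p) for p in pieces])        # ['10.16.32.0/20', '10.16.48.0/20']
```

Under this assumption, a provider holding only a /19 can move inbound traffic between links in units of no less than half of its address space.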
Looking Glass
In typical networks, in which routing paths are communicated between Autonomous Systems via BGP, the information about which outbound path has been chosen from among the available paths is not communicated. Although this information is very useful for destination networks to know and act on, there is no mechanism or concept for the exchange of the resulting decisions. A troubleshooting tool that has been deployed by some networks to permit visibility into local routing decisions is called a Looking Glass (LG). An LG is most often implemented as a WWW-based user interface with a programmatic back end that can run a small number of queries on one of the network's BGP routers. The deployment of LGs by networks is an example of the usefulness of the information. However, although an LG gives information about what path has been chosen to a particular destination by the network that deployed the LG, it gives no information as to the performance of, or the reason behind choosing, a non-BGP “best route” path.