The Border Gateway Protocol (BGP) is the protocol that makes connectivity possible between the autonomous networks that collectively form the Internet. Each autonomous network, called an Autonomous System (AS), is itself a collection of physical networks under the same administrative entity. Each AS is identified by a unique Autonomous System Number (ASN). To achieve connectivity among them, ASs exchange routing information using BGP.
A BGP route contains reachability information about a section of the IP address space, called a prefix. An AS that originates a prefix communicates the route to this prefix to its neighboring ASs. As the route travels from one AS to the next, each AS adds its ASN to the path. The resulting sequence of ASNs is called an AS path.
Neighboring ASs exchange vectors of AS paths. Each AS then selects, from all the vectors received, the shortest AS path to any given AS. Loop-free connectivity is achieved by having BGP devices discard routes that include the ASN of their own AS in the AS path.
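The selection and loop-avoidance behavior described above can be illustrated with a minimal Python sketch. It is a simplification and not an implementation of BGP: the function names and the private-use ASNs are assumptions made for illustration, and an AS path is modeled as a plain list of integers.

```python
from typing import List, Optional


def is_loop_free(local_asn: int, as_path: List[int]) -> bool:
    """A route is discarded if our own ASN already appears in its AS path."""
    return local_asn not in as_path


def select_shortest_path(local_asn: int,
                         candidates: List[List[int]]) -> Optional[List[int]]:
    """Return the shortest loop-free AS path, or None if none is usable.

    Ties are resolved by position in `candidates` (an assumption of this
    sketch; real tie-breaking is more elaborate).
    """
    usable = [path for path in candidates if is_loop_free(local_asn, path)]
    if not usable:
        return None
    return min(usable, key=len)


# Hypothetical AS paths received from three neighbors for one destination.
received = [
    [65002, 65003, 65010],
    [65004, 65010],
    [65005, 65001, 65010],  # contains our own ASN, so it is discarded
]
best = select_shortest_path(65001, received)
```

In this example the second path is chosen because it is the shortest one that does not contain the local ASN 65001.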
BGP's decision-making process also includes a set of rules used to articulate policies that can override the selection of the path with the shortest length. The set of rules also includes a number of tie-breaking rules to be used in case no single winning route has been selected using either policy or shortest AS path rules.
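The ordering of policy, AS path length, and tie-breaking can be sketched as a single comparison key. This is a sketch under stated assumptions: the `policy_pref` field stands in for whatever policy an operator configures (higher wins), and `neighbor_id` is a hypothetical final tie-breaker (lower wins). Neither name comes from the text above, and the real decision process has more steps.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: Tuple[int, ...]
    policy_pref: int = 100  # assumed policy knob: higher value is preferred
    neighbor_id: int = 0    # assumed tie-breaker: lower value is preferred


def best_route(routes: List[Route]) -> Route:
    """Pick one route: policy first, then shortest AS path, then tie-breaker.

    Raises ValueError if `routes` is empty.
    """
    return min(
        routes,
        key=lambda r: (-r.policy_pref, len(r.as_path), r.neighbor_id),
    )


# Hypothetical routes to the same documentation prefix.
long_but_preferred = Route("203.0.113.0/24", (65002, 65003, 65010), 200, 2)
short_default = Route("203.0.113.0/24", (65004, 65010), 100, 1)
short_other = Route("203.0.113.0/24", (65005, 65010), 100, 3)

winner = best_route([long_but_preferred, short_default, short_other])
```

Here the route with the longer AS path wins because its policy preference is higher. Without that route, the two remaining candidates tie on policy and path length, and the tie-breaker decides.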
Transit providers are ASs that provide transit services to other ASs. They typically have default-free connectivity; that is, they are capable of reaching any prefix in the Internet. Internet Service Providers (ISPs) are usually transit providers whereas enterprises own edge networks and obtain their connectivity service from ISPs. Examples of ISPs include AT&T, Sprint, Earthlink, Level3, UUNET, and Qwest, to name a few.
An ISP is responsible for delivering traffic between its customers and any source or destination on the Internet. It is possible that an ISP does not have a direct connection to such a destination, in which case it hands the traffic to another ISP. Given that carrying traffic incurs costs, ISPs use BGP's policy rules to implement complex peering agreements among themselves to share these costs.
Enterprises also use BGP configuration to influence how traffic arrives at and leaves their edge networks. One common technique used by multi-homed enterprises is AS path prepending, where an edge network inserts additional copies of its own ASN in the AS path of a route, thereby generating routes with different AS path lengths and affecting the distribution of inbound traffic across its access links. Conversely, such enterprises can also apply filters to the routes to particular destinations that they receive from their providers, thereby affecting the distribution of outbound traffic across these same access links.
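The effect of AS path prepending on inbound traffic can be sketched as follows. The helper names, link labels, and ASNs are hypothetical, and the remote AS is assumed to choose purely on AS path length, which real networks may override with their own policies.

```python
from typing import Dict, List


def announce(local_asn: int, extra_prepends: int = 0) -> List[int]:
    """AS path an edge network announces: its ASN plus optional extra copies."""
    if extra_prepends < 0:
        raise ValueError("extra_prepends must be non-negative")
    return [local_asn] * (1 + extra_prepends)


def via_provider(provider_asn: int, as_path: List[int]) -> List[int]:
    """The provider adds its own ASN before passing the route on."""
    return [provider_asn] + list(as_path)


def preferred_link(paths_by_link: Dict[str, List[int]]) -> str:
    """Link a remote AS would use if it only compared AS path lengths."""
    return min(sorted(paths_by_link), key=lambda link: len(paths_by_link[link]))


ENTERPRISE_ASN = 65010  # hypothetical multi-homed enterprise
paths = {
    "link_a": via_provider(65001, announce(ENTERPRISE_ASN)),
    "link_b": via_provider(65002, announce(ENTERPRISE_ASN, extra_prepends=2)),
}
inbound_link = preferred_link(paths)
```

Because the route announced over `link_b` carries two extra copies of the enterprise ASN, its AS path is longer, so remote networks that compare only path length would send inbound traffic over `link_a`.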
BGP was designed to provide connectivity but does not take actual performance into account, so it is essentially unaware of the many phenomena that result in performance degradation. ISPs lose control over the performance of a given flow as soon as they hand the flow to another ISP. Therefore, given that no ISP is directly connected to all possible destinations, no ISP can offer Internet-wide end-to-end guarantees to its customers. For inter-ISP routing, the use of BGP is based on policy rules, and those rules are often motivated by economic considerations. As a result, an ISP's contractual agreement with its customers is typically limited to maintaining good performance within its own network. In the end, enterprise customers cannot obtain guarantees on what they care most about, namely end-to-end performance for their mission-critical applications.
Performance degradation can have a significant detrimental impact on business productivity, particularly as organizations rely more on the Internet for business communications. Examples of important business communications include Voice over IP (VoIP), video over IP, and session establishment via the Session Initiation Protocol (SIP). VoIP users are not willing to tolerate performance degradation in bearer traffic, such as a few seconds of “dead air” in a live conversation. In contact centers, a warm restart of an IPSI board occurs when there is no successful heartbeat communication between the controlling application and the board for longer than 3 seconds, that is, when performance degradation affects the signaling traffic. Contact center agents are forced to reconnect after a warm restart, losing their existing sessions with customers.
Some BGP configuration techniques can be used, and are being used, to improve performance, at least in theory. Nevertheless, in their communications with destinations many ASs away, enterprises are still left with no end-to-end performance guarantees and can experience transient performance degradation, particularly in MultiProtocol Label Switching (MPLS) Virtual Private Networks (VPNs), which are becoming the de facto standard for enterprise communications. Such degradation is primarily due to delays incurred by BGP after the routing information towards a destination changes and a new routing path needs to be chosen. BGP and other routing protocols are not directly connected to the network fabric and are unable to detect connectivity failures rapidly. Performance degradation can be significant enough to render an application on the network unusable even though connectivity is maintained. This phenomenon, where connectivity is available but the application cannot function, is typically referred to as a “brownout”. Brownouts are a significant problem for applications, and one that BGP cannot address, given that BGP reacts only to total loss of connectivity.
Network operators can use BGP filtering mechanisms to statically route traffic through the best-performing path, assuming that the performance of that path is known. Edge networks, however, have little control over the path that their traffic takes across the Internet: they can select only how traffic exits the edge network (i.e., the first hop), and this selection is typically determined by complicated policy rules in BGP. In practice, these techniques are seldom used because they must be implemented manually, require significant operational expertise, and require constant maintenance, since performance changes are common.
Finally, BGP's reaction times can be slow, even in the event of connectivity loss. When a previously reachable destination becomes unreachable, a phenomenon known as a blackout, BGP detects the problem and selects an alternative route to that destination. However, studies show that BGP can, in many cases, take a significant amount of time to converge to a new route, with unpredictable performance consequences during the convergence period.
Path optimization technology exploits path diversity in a communications network by detecting performance and/or connectivity problems and rerouting around them in real time. However, current path optimization technology is not usually configured with the measurement rates and timeout-related parameters necessary to detect these events confidently on the time scale necessary to prevent active users from perceiving the degradation. In addition, current path optimization technology cannot control the return path of the measurements, which follows the path selected by default routing protocols such as BGP. This may lead to an inability to react in time to certain network problems, such as a bidirectional line cut.
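As an illustration of the general idea of measurement-driven path selection, the following Python sketch classifies each candidate path from probe results and moves traffic off a degraded path. It does not describe any particular product: the thresholds, the `ProbeStats` structure, and the path names are all assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProbeStats:
    sent: int                                            # probes sent on the path
    rtts_ms: List[float] = field(default_factory=list)   # RTTs of answered probes


def average_rtt(stats: ProbeStats) -> float:
    """Mean RTT of answered probes, or infinity if none were answered."""
    return sum(stats.rtts_ms) / len(stats.rtts_ms) if stats.rtts_ms else float("inf")


def classify(stats: ProbeStats, max_loss: float = 0.02,
             max_rtt_ms: float = 150.0) -> str:
    """Label a path as healthy, brownout, blackout, or unknown (assumed limits)."""
    if stats.sent == 0:
        return "unknown"
    if not stats.rtts_ms:
        return "blackout"  # nothing came back: total loss of connectivity
    loss = 1.0 - len(stats.rtts_ms) / stats.sent
    if loss > max_loss or average_rtt(stats) > max_rtt_ms:
        return "brownout"  # reachable, but too degraded for the application
    return "healthy"


def choose_path(current: str, stats_by_path: Dict[str, ProbeStats]) -> str:
    """Stay on a healthy current path; otherwise move to the best healthy one."""
    if classify(stats_by_path[current]) == "healthy":
        return current
    healthy = [p for p in sorted(stats_by_path)
               if classify(stats_by_path[p]) == "healthy"]
    if not healthy:
        return current  # nothing better is known, so do not flap
    return min(healthy, key=lambda p: average_rtt(stats_by_path[p]))


# Hypothetical measurements over two access links.
measurements = {
    "isp_a": ProbeStats(sent=10, rtts_ms=[40.0] * 6),   # 40% loss
    "isp_b": ProbeStats(sent=10, rtts_ms=[55.0] * 10),  # no loss
}
selected = choose_path("isp_a", measurements)
```

How quickly such a scheme reacts depends on how often probes are sent and how long it waits before declaring a probe lost, which is the configuration concern raised above. The sketch also measures only round trips, so it cannot tell which direction of the path is at fault.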