The present invention relates to communications networks, and in particular, to systems and methods for maximizing the throughput or capacity of broadband network communications.
There is an emerging trend for private and public enterprises to fundamentally change the structure of their Wide Area Networks (WANs). Historically, corporate WANs were constructed with dedicated circuits (aka private lines, leased lines) provided by the telecommunications carriers for the sole use of the corporate enterprise. That is to say, only the corporation's locations were connected by these private circuits and only the corporation's data traffic was transported across the private WAN. Privacy and security were ensured because the circuits were in no way shared with other users outside the corporation. With the proliferation of the Internet worldwide, corporations have begun to realize cost savings and utilize increased bandwidth by migrating from their existing homogeneous private WANs to using the public, heterogeneous network that is the Internet. Using the Internet creates the need to optimize each network connection to obtain maximum throughput and reliability. Private networks have traditionally been built by a small number of carriers with interoperable (but often proprietary) standards and similar underlying technology that operates with simple, consistent communications parameters. A private network, once provisioned and operable, is static and requires little further maintenance or tuning. By definition the public Internet is a collection of many different carriers, all using different transport, routing and switching technologies, and a network topology that dynamically evolves over time. The transition to utilizing the public broadband Internet as the infrastructure for a corporate WAN has created the need to monitor, analyze, measure and control the parameters associated with each communications path in order to maintain and maximize network performance.
Referring to FIG. 1, current private circuit corporate networks 10 are mostly built in a traditional hub and spoke topology. Remote computer sites 12 are connected to a main corporate data center 14 through private Frame Relay connections 16, including remote and hub routers 18, 20. A typical corporate data center 14 may include one or more mainframe computers 22 and servers 24 connected to local computer sites 26, and the remote sites 12, through a local area network hub or switch 28. Access by the remote sites 12 to websites 30 on the Internet 32 is often provided by the same Frame Relay connection 16 to the data center hub 18, and then through a protective firewall device 34 and a router 36. All users at the remote sites 12 wishing to access the Internet 32 must first traverse the Frame Relay network 16 to reach the single Internet connection at the data center 14. As Internet communications have grown and Internet based applications and services expanded, the resulting traffic on the private Frame Relay network 16 has dramatically increased. Since Frame Relay costs are based on bandwidth needed, this increase in Internet traffic has resulted in companies having to significantly increase the bandwidth of their Frame Relay connections 16 and incur the accompanying costs. Furthermore, the strain on network resources at the corporate data center 14 requires additional financial, human and network resources.
In a private Frame Relay network 16, the communications fabric and equipment is fairly consistent if not identical, and usually under the management of a single telecommunications carrier such as AT&T, Qwest, Sprint or Worldcom. In this topology, each packet of information leaving any remote WAN site or the corporate data center follows the same path using the same protocol and sees a fixed amount of bandwidth available on each leg of its journey from the source to the destination within the WAN. Since only the corporation's data traverses the network, simple traffic management allows each data transmission to use all the available bandwidth on each leg of the network. In this environment, optimizing and tuning of the communications network is simple and unchanging. Once operable, the customer is confident that the configuration at one site can be replicated across all sites to create a robust and reliable network. Since all transmission paths are explicitly defined, the WAN's performance is easily monitored and managed.
The relative simplicity of the homogenous legacy private WAN described above comes at great financial cost and is quite wasteful. Each private circuit costs a fixed amount regardless of the level of usage. Compromises must be struck between average and peak needs on the basis of cost and therefore bottlenecks and collisions invariably arise at times of peak corporate network activity while most of the bandwidth goes unused for the rest of the time.
As a result, corporations are turning to the public broadband, the Internet, as a cheaper, faster way to communicate both among the company's sites and between different companies. Referring to FIG. 2, one example of a public broadband corporate WAN 40 includes remote computer sites 12 connected to a corporate data center 14 directly through the Internet 32. Each remote site 12, depending on the exact type of computer equipment at the site and the type of connection (satellite, cable, phone, etc.), may include a variety of network devices 42, such as switches, routers, firewalls, hubs, etc., to enable the connection through the Internet 32 to the corporate data center 14. Although the transition to public broadband corporate WANs has just begun, already many new broadband customers receive less than optimal or even acceptable levels of performance from these new, low cost, high bandwidth solutions. Much of the sub-optimal network performance is due to the lack of expertise and experience with networks as diverse and complex as the Internet. Furthermore, previous methods of optimization no longer work because of the unknown and intrinsic variation in the path a data packet takes over the Internet from its source to its destination. Network tuning techniques used on private networks simply fail on the Internet.
In order to use an Internet based WAN, a company creates an internal company extranet or intranet that lets authorized users access custom Web pages, reports and forms through the Internet. This method is perhaps the easiest and most cost-effective way to create access; however, while it is possible to configure an extranet to permit direct access of files, extranets are generally used to serve information as a Web page.
While all of these methods have worked well, and in many cases still do, they suffer from a number of drawbacks including less than optimal speed, less than optimal security, high recurring costs and lengthy amounts of time to deploy. Further, the dependence of companies on e-mail is growing at a rapid rate. The number and size of e-mail messages are also increasing, thus placing importance on the speed and reliability of the connection for the remote user.
In an effort to address some of these issues, a communication method called a Virtual Private Network (VPN) has been utilized. A VPN allows private connections between two machines using any shared or public Internet connection. Referring to FIG. 2, for example, a remote site 12 may include a VPN server 44 that connects through the Internet 32 to a corresponding VPN server 46 at the corporate data center 14. VPNs permit a company to extend connectivity to remote users with the same reliability and security as those attached locally. The need for leased point-to-point links is eliminated because the VPN can function from any Internet connection. The underlying technology behind a VPN has been around for several years, but the wide-scale availability of low-cost, dedicated broadband Internet access such as cable and DSL has companies, large and small, rethinking their remote access strategy.
VPNs are based on a concept called tunneling, a method of encapsulating data into encrypted packets that can travel over IP networks securely and be delivered to a specific address. VPNs are created using one of four possible protocols: Layer 2 Tunneling Protocol (L2TP), Layer 2 forwarding (L2F), Point-to-Point Tunneling Protocol (PPTP) and IP Security Protocol (IPSec). These protocols define methods to create a VPN over many connection types. The VPN was created prior to the availability of cable or DSL Internet access as a means to establish an on-demand private network between a network server and a dial-in remote user.
When dialing-in to any Internet point-of-presence (POP) using the basic 56 kb/s (or slower) modem, the connection is probably made using the Point-to-Point Protocol (PPP). L2TP, L2F and PPTP are VPN protocols that were created primarily to work inside of PPP. These protocols support several authentication methods used in PPP including the Password Authentication Protocol (PAP) and Challenge Handshake Authentication Protocol (CHAP). The L2F protocol adds a two-step authentication process, one from the user and one from the ISP, as well as the ability to create more than a single connection. L2TP enhances and improves upon the security shortcomings of PPTP and L2F through the use of stronger encryption and its support of a multitude of transport methods in addition to PPP. IPSec is currently the leading protocol used in corporate VPNs. The IPSec protocol was created exclusively for use over IP networks, to be used with the emerging IP standard called IPv6. IPSec also uses a host of features that ensure a high degree of security and data integrity.
In the Internet world, packets exchanged between two sites may travel across the Internet over very different paths, traverse numerous different communications protocols and can be processed by a variety of routing and/or switching technologies. While this level of “variety” keeps the cost of broadband Internet access down where the choice of technologies implemented anywhere on the Internet is optimal for the bandwidth and number of connections at a given location, the lack of uniformity vastly increases the complexity of the network topology. The interconnectedness of all the different backbone providers coupled with a multitude of competing/overlapping Internet Service Providers (ISPs) gives the Internet its tremendous dynamic capacity and flexibility, but also ensures that no one can predict the path his data traffic will take between two sites at any given moment. While the Internet Protocol (IP) provides a common standard by which every host communicates, each Internet provider selects different transport protocols and a variety of routing and switching technologies and manufacturers with which they deliver IP-based broadband Internet service. In contrast, in the private Frame Relay network of old, data always traversed the same path, across the same switches at the same locations every time; the network was both simple and predictable.
On the Internet, any time a user opens any Internet application (web browsing via HTTP, email, file transfer via FTP, remote access via Telnet, etc.) each data transmission between the source and the destination may be routed differently, because the local network environment at each junction (aka hop) is different at any point in time. Routing decisions are made based on a variety of open standard protocols which route each packet based on the relationships defined amongst the local neighborhood of routers (ex. Open Shortest Path First, OSPF; Border Gateway Protocol, BGP; Routing Information Protocol, RIP; Interior Gateway Protocol, IGP; Exterior Gateway Protocol, EGP). If the data packet encounters a switch, then completely different algorithms and methods (ex. Data Link Switching, DLSw, or Asynchronous Transfer Mode, ATM, switching) are applied to the processing of the packet.
How then does one define optimum performance for data transmission over the Internet? What is the capacity of the Internet, defined as the largest amount of data transferred in the shortest possible time between a given source and destination? Capacity may also be defined as the product of maximum bandwidth multiplied by the transit time. But, since each hop most likely has a different bandwidth based on the physical medium and transport protocol, which value would one choose? The ideal minimum transit time of a packet traveling from source to destination would be the physical distance traveled divided by the intrinsic propagation speed of the transport medium (the signal speed in a copper wire, the speed of light in an optical fiber). If one assumed that switching and routing at a node happened instantaneously, then to a first approximation this transit time would be a reasonable estimate for a private switched local area network (LAN). Since the path is ill-defined for a routing-based packet-forwarding IP network, such as the Internet, the intrinsic capacity of a public network is very difficult to determine and may not be known.
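The two quantities discussed above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names, the hop bandwidths, the 4,000 km distance and the propagation speed are hypothetical values, and using the slowest hop as the path bandwidth is just one possible answer to the question of which hop's value to choose.

```python
# Illustrative sketch only; all names and numbers below are assumptions.

def min_transit_time_s(distance_m: float, propagation_speed_m_s: float) -> float:
    """Ideal one-way transit time: distance divided by propagation speed.

    Ignores all switching, routing and queuing delay at the nodes.
    """
    return distance_m / propagation_speed_m_s


def bottleneck_bandwidth_bps(hop_bandwidths_bps: list[float]) -> float:
    """One possible choice of path bandwidth: the slowest hop on the path."""
    return min(hop_bandwidths_bps)


def capacity_bits(bandwidth_bps: float, transit_time_s: float) -> float:
    """Capacity as bandwidth multiplied by transit time (bits in flight)."""
    return bandwidth_bps * transit_time_s


# Hypothetical path: a LAN hop, a slow access hop and a fast backbone hop.
hops_bps = [100e6, 1.5e6, 622e6]
assumed_speed_m_s = 2.0e8  # assumed signal speed, roughly two thirds of c

transit = min_transit_time_s(4_000_000, assumed_speed_m_s)
capacity = capacity_bits(bottleneck_bandwidth_bps(hops_bps), transit)
print(f"transit time: {transit:.3f} s, capacity: {capacity:.0f} bits")
```

On a real Internet path neither the hop bandwidths nor the distance actually traveled is known in advance, which is why such a calculation is only an idealized bound.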
On the Internet, what are the real causes of bandwidth degradation and delays that prevent a network connection from achieving the ideal capacity that a private circuit WAN could have? Packet loss is one cause of bandwidth degradation, since all time and effort spent to transmit a packet is lost if the packet must be retransmitted. At each network node, the routers and/or switches all have finite on-board computing resources with which to process incoming packets. Too many incoming packets means packets are buffered awaiting processing or, worse, are lost and require retransmission. Further delays are added to the transit time due to router overhead, packet fragmentation, and protocol translation. The finite bandwidth connecting a given node requires that when the amount of incoming traffic exceeds the outbound capacity, then transmission must be throttled to prevent packet loss. Unfortunately, in the public broadband world of the Internet, a priori knowledge of the bandwidth, network node configuration/capacity, etc. that a data packet is going to encounter through its entire route is difficult to determine or cannot be obtained before a packet is sent out for transmission. In contrast, the homogenous, static, switched network environment of the private circuit, Frame Relay WAN is a known, quantifiable, stable network environment that a user's data would encounter every time.
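The cost of retransmission can be made concrete with a deliberately simple model. The sketch below assumes that each transmission attempt is lost independently with a fixed probability and that a lost packet is resent until it arrives; it ignores congestion control and timeouts, which typically make real losses more expensive than this. The function names are hypothetical.

```python
# Simplified model, assuming independent losses and unlimited retransmissions.

def expected_transmissions(loss_rate: float) -> float:
    """Average number of attempts needed to deliver one packet."""
    if not 0.0 <= loss_rate < 1.0:
        raise ValueError("loss_rate must be in the range [0, 1)")
    return 1.0 / (1.0 - loss_rate)


def effective_throughput_bps(bandwidth_bps: float, loss_rate: float) -> float:
    """Useful throughput once retransmitted packets are discounted."""
    return bandwidth_bps / expected_transmissions(loss_rate)


# Hypothetical 10 Mb/s link at several loss rates.
for loss in (0.0, 0.01, 0.05, 0.5):
    mbps = effective_throughput_bps(10e6, loss) / 1e6
    print(f"loss {loss:.2f}: about {mbps:.2f} Mb/s of useful throughput")
```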
Given the “black box” nature of today's public broadband Internet, it is unlikely that there is a mathematical formula or empirically derived solution to the problem of network optimization. In fact, that is the case today, since network optimization is a manual process performed by a skilled communications engineer, only at the carrier or IP backbone level, where efficiencies on the highest capacity sections of the Internet offer the greatest rewards in increased capacity without additional capital investment. Network optimization in this form is often referred to as Traffic Engineering and is mostly performed by the Network Engineers at the backbone providers and ISPs. But without some type of optimization of the user's broadband connection, the user at the edge of the Internet can never fully utilize the capacity of the public broadband network that constitutes his connection to the WAN/Internet. Maximization of the transmission capacity from a location on the edge of a network requires a heuristic solution for the optimum configuration of communications parameters based on no knowledge of the inner workings of the Internet “black box” connecting the source and destination.
A public broadband connection typically provides very high speeds for WAN services at a lower cost compared to a private circuit connection. The ability to use a large amount of bandwidth when available at a low cost is compelling. However, there are shortcomings to public broadband connectivity that private circuit WANs avoid. First, the user must share the connection in some fashion with his fellow subscribers. In the case of xDSL, a group of local users must share the bandwidth coming out of the ISP's first point of presence (POP), where that group of DSL circuits is first consolidated. In the case of cable broadband, a group of users actually share a physical connection (ex. a coaxial cable running down the neighborhood street for cable TV and data). Fortunately, most Internet traffic is sporadic, random and asynchronous so many users can share a finite amount of bandwidth and have access to most of the maximum bandwidth for the duration of their session. Second, the user's data packets encounter an unknown and varying configuration of routing equipment that is used throughout the public broadband network. Not only are there multiple technologies (ex. xDSL, Satellite, Cable) available to connect to the Internet, but there are a large number of ISPs providing broadband services. Furthermore, each ISP is free to choose from a large group of router and switch technology equipment manufacturers for the purposes of building/standardizing their own network infrastructure which the ISP then configures, maintains, updates and upgrades according to its own strategy and needs of its customers.
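As a generic illustration of what tuning a parameter against a black box can look like, the following Python sketch searches for the largest packet size that a path accepts, using nothing but pass/fail probes. It is not presented as the method of the invention. The probe here is a simulation built from hypothetical per-hop limits; a real probe would have to send test packets across the network, and the search assumes that any size smaller than a passing size also passes.

```python
# Illustrative sketch only: a black-box search using a simulated probe.
from typing import Callable


def largest_passing_size(probe: Callable[[int], bool], low: int, high: int) -> int:
    """Binary search for the largest size in [low, high] that the probe accepts.

    Assumes the probe is monotonic: if a size passes, every smaller size passes.
    """
    if not probe(low):
        raise ValueError("even the smallest candidate size fails the probe")
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            low = mid       # mid works, so the answer is mid or larger
        else:
            high = mid - 1  # mid fails, so the answer is smaller
    return low


def make_simulated_path(hop_limits: list[int]) -> Callable[[int], bool]:
    """Hypothetical stand-in for the network: hides the per-hop size limits."""
    limit = min(hop_limits)
    return lambda size: size <= limit


probe = make_simulated_path([1500, 1480, 1492])  # hop limits unknown to the caller
print(largest_passing_size(probe, 576, 1500))
```

The caller never sees the per-hop limits; it learns the usable size only from the outcomes of its probes, which is the sense in which the search treats the path as a black box.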
The user's low cost of broadband connectivity comes at the expense of thin profit margins for carriers or ISPs, which leaves few resources available to implement new routing technologies, much less upgrade existing technology. The outcome of this network environment is a competitive and incremental diversification of overlapping, but interconnected networks resulting in a broadband Internet that can only be described as a dynamic collection of transmission media and network node technologies. Contrastingly, in an expensive, private WAN environment, customers can feel comfortable that the equipment is uniformly maintained and upgraded by their chosen single carrier.
As discussed above, the inner workings of the public broadband, or Internet, may be viewed as a black box. A data packet may take any one of a plurality of routes through the Internet to get from a source computer to a destination server.
As an example, referring to FIG. 3, consider the physical path 50 of a data transmission 52, such as a 1500 byte frame, as it traverses the Internet 32 from its source computer 54 to a destination server 56. The user opens an application on the source computer 54 to initiate a network session. The source computer 54 then processes the data frame down its TCP/IP stack, adds the header data and sends the frame out the Ethernet adapter card, across a 10/100Base-T cable over the LAN to the local router 58. This router receives this IP packet 52 from its Ethernet interface (eth1), which is physically connected to the source computer 54 via an Ethernet cable and the LAN switch. After the packet 52 enters eth1, the router 58 checks the frame for data integrity. The frame 52 is stored in the receive buffer on the router 58. The frame header is removed and only the data payload remains at the link layer. The router's forwarding engine sends the data to the router's other network interface eth2; the router 58 re-encapsulates the packet with a new link header with the destination address of the next router to receive the frame. The data part of the packet gets a new IP header with a new TTL, fragmentation offset, header checksum, source and destination address. The 1500 byte frame 52 leaves from the second interface eth2 towards the router at Local Telco 160.
The router at Local Telco 160 receives the frame on its interface eth0. Unfortunately, this router has a Maximum Transmission Unit (MTU) set at 1480 bytes, which means the incoming 1500 byte frame is too big for this router to process intact. This router receives the frame, strips off the header and breaks the frame up into two parts (fragments), so that both frames (header+data) are no larger than 1480 bytes in size. Both frames then follow the same general routing process as described above. The forwarding engine sends the two packets to the correct outbound interface to the next destination router at Local ISP 162. If the next router requires even smaller frame sizes then it fragments the larger packet into smaller acceptable packets. It is noteworthy in this process that routers typically do not de-fragment data frames. The data is typically only reassembled after all the data frames have been received and ordered at the destination computer. In other words, in a typical example, fragmentation is a one-way street to network performance degradation.
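The fragmentation step can be worked through numerically. The sketch below treats the 1500 byte unit as an IPv4 packet with a 20-byte header and no options, which is an assumption not stated in the text, and applies the IPv4 rule that the payload of every fragment except the last must be a multiple of 8 bytes. The function name is hypothetical.

```python
# Illustrative sketch, assuming IPv4 with a 20-byte header and no options.
IP_HEADER_BYTES = 20


def fragment_sizes(packet_bytes: int, mtu_bytes: int,
                   header_bytes: int = IP_HEADER_BYTES) -> list[int]:
    """Return the total size (header plus payload) of each resulting fragment."""
    if packet_bytes <= mtu_bytes:
        return [packet_bytes]  # fits as-is, no fragmentation needed
    payload = packet_bytes - header_bytes
    # Payload of every fragment except the last must be a multiple of 8 bytes.
    max_payload = (mtu_bytes - header_bytes) // 8 * 8
    if max_payload <= 0:
        raise ValueError("MTU is too small to carry any payload")
    sizes = []
    while payload > 0:
        chunk = min(payload, max_payload)
        sizes.append(chunk + header_bytes)  # each fragment carries its own header
        payload -= chunk
    return sizes


sizes = fragment_sizes(1500, 1480)
print(sizes, "total bytes on the wire:", sum(sizes))
```

Under these assumptions the 1500 byte packet becomes one fragment of 1476 bytes and one of 44 bytes. The second fragment is almost entirely header, and 1520 bytes now cross every downstream hop in place of the original 1500.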
Once the packets reach the Internet backbone 64, which is typically based on ATM switching over optical fiber (OC-12 between Carrier A 66 and Carrier B 68), each frame is segmented into 53-byte cells that are multiplexed and transmitted in parallel over multiple channels. After traversing any number of ATM switches, the cells are ultimately reassembled into frames of a default size determined by the parameters of the convergence sub-layer of the last downstream ATM switch at Carrier C 70. As the frames then traverse a network path, they are again subjected to the same IP routing as described above until they reach their destination 56 while running the same risk of incurring fragmentation, delay and packet loss at each router along the way.
Most of the optimization work that is done today takes place at the time a new network connection is established or when additional network devices are added, if at all. Today, most equipment is taken out of the box, plugged in, tested for a connection and left. There are simply no tools to help optimize the WAN connection being used. Furthermore, referring to FIG. 4, different vendors supply different elements of the customer premise networking solution (often consisting of a router 72, firewall 74 and VPN server 76). Each vendor installs its portion of the transmission chain, perhaps optimizes that component's performance based on internal measurements, declares success and leaves. Furthermore, contiguous network optimization often cannot take place since the configurations of the different network devices compete with each other to set many of the critical network parameters. Often a compromise solution is reached just to get all three elements to work with each other at the end user's site. Often, the first or last device in the chain then dictates the network parameters for the data session, which compromises the performance of the other devices.
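The overhead of carrying a frame across an ATM backbone can be estimated with a few lines of Python. The sketch uses the standard ATM cell format (53 bytes, of which 48 are payload) and assumes AAL5 encapsulation with its 8-byte trailer; AAL5 is an assumption, since the text does not name the adaptation layer in use.

```python
# Illustrative sketch, assuming AAL5 encapsulation over standard ATM cells.
ATM_CELL_BYTES = 53      # 5-byte cell header plus 48-byte payload
ATM_PAYLOAD_BYTES = 48
AAL5_TRAILER_BYTES = 8   # assumed adaptation-layer trailer


def atm_cells_for_frame(frame_bytes: int) -> int:
    """Number of cells needed for one frame, padding the last cell if needed."""
    total = frame_bytes + AAL5_TRAILER_BYTES
    return -(-total // ATM_PAYLOAD_BYTES)  # ceiling division on integers


def atm_wire_bytes(frame_bytes: int) -> int:
    """Bytes actually transmitted once the frame is cut into cells."""
    return atm_cells_for_frame(frame_bytes) * ATM_CELL_BYTES


frame = 1500
cells = atm_cells_for_frame(frame)
wire = atm_wire_bytes(frame)
print(f"{cells} cells, {wire} bytes on the wire, efficiency {frame / wire:.1%}")
```

Under these assumptions a 1500 byte frame occupies 32 cells, or 1696 bytes on the wire. A frame that lands just past a 48-byte payload boundary costs a whole extra cell, which is one reason frame size matters on such a path.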
There are numerous disadvantages to this operational model. First, communications parameters for the whole transmission chain are never fully optimized at the start. Second, the parameters are never adjusted on a periodic or on-going basis to accommodate changes in the local Internet environment that affect network performance. Without analysis and optimization of key communication parameters, the available bandwidth is reduced by packet losses, fragmentation and partially empty data frames along the transmission path.
Because the migration to broadband WAN networks is a fairly recent phenomenon, the existing technology providers of the network infrastructure, such as the router, firewall and VPN engine manufacturers, do not presently provide the tools and flexibility in their products to operate in this new environment. The migration from a private circuit world to that of the public broadband Internet has monumental implications for not only the device manufacturers, but for the telecommunications providers of bandwidth and circuitry (aka the network carriers) as well. The carriers must evolve to better support the shared broadband network paradigm. In the past, telecom carriers managed their network from the inside looking outward. In other words, the carriers focused on bandwidth utilization, traffic engineering, and quality of service at the core of their network, with diminishing resources being devoted to areas far removed from the high bandwidth backbone. This was an appropriate allocation of financial and technical resources, since the private circuits on the edge of the network were not heavily utilized (single user, static configuration) and required little attention once installed and operational. Furthermore, in the past, the data traffic patterns of private circuit networks changed slowly over time, since each corporate network had its own circuit infrastructure and the backbone of the network would not experience dramatic changes in the amount or timing of peak network activity. Also, increased network traffic could be anticipated and planned for when a new corporate WAN was going to be added to a carrier's network or when significant changes to existing private WAN circuit configurations were scheduled to take place.
In the new paradigm of a shared, public broadband Internet, users compete for the available bandwidth when they initiate a data session, and can only utilize what is available for the duration of the session. In contrast, in the old private circuit world, there was a dedicated circuit with a known amount of capacity available for use at all times. In the public broadband configuration, both the user and the provider are now always operating in a dynamic network environment, as compared to the relatively static configuration of a private circuit WAN.
Unfortunately for the carriers, the new public broadband Internet has vastly increased the number of users, while drastically reducing the revenue associated with each user. With each user accepting whatever bandwidth is available at a given moment, carriers cannot charge premium prices for dedicated circuits and/or service level guarantees. Thus, right now, there is a need to maximize transmission capacity for an end user at each end of a broadband communications link, and there is a need for this optimization to occur as near real time as possible.